WEBVTT

00:00:00.000 --> 00:00:01.919
OK, so let's get into this. We're looking at

00:00:01.919 --> 00:00:06.820
three really different stories about where AI

00:00:06.820 --> 00:00:10.400
is going right now. We have this new household

00:00:10.400 --> 00:00:13.480
robot that's designed to be, you know, cute on

00:00:13.480 --> 00:00:15.839
purpose. Right. Then you have these AI models

00:00:15.839 --> 00:00:18.039
just battling it out in these secret benchmarks.

00:00:18.320 --> 00:00:21.559
And on the other end, a group just dropped the

00:00:21.559 --> 00:00:24.620
entire blueprint for a major language model.

00:00:25.160 --> 00:00:28.739
True open source. And if you need proof of just

00:00:28.739 --> 00:00:31.660
how intense all of this is getting, get this.

00:00:32.219 --> 00:00:34.500
Meta is now asking the government for permission

00:00:34.500 --> 00:00:37.320
to trade electricity. Yeah, just to feed their

00:00:37.320 --> 00:00:40.039
AI. The stakes are getting incredibly high. Welcome

00:00:40.039 --> 00:00:43.000
to the deep dive. Our mission today is to unpack

00:00:43.000 --> 00:00:45.380
these sources for you and really figure out what

00:00:45.380 --> 00:00:47.539
matters most. We've got three main areas we're

00:00:47.539 --> 00:00:49.600
going to hit. First up, that dish-doing robot

00:00:49.600 --> 00:00:52.659
and the surprisingly simple cheat code they found

00:00:52.659 --> 00:00:54.560
for training it. Then we'll get into the latest

00:00:54.560 --> 00:00:57.119
industry headlines. Who's winning the AI race

00:00:57.119 --> 00:00:59.259
and what does all that energy consumption really

00:00:59.259 --> 00:01:01.859
mean? And finally, we'll look at a really radical

00:01:01.859 --> 00:01:04.640
new definition of open source from the AI2 team

00:01:04.640 --> 00:01:06.920
that could change everything. Let's start in

00:01:06.920 --> 00:01:09.079
the home, in the kitchen, actually. Yeah, the

00:01:09.079 --> 00:01:11.629
cute robot. Sunday Robotics just came out of

00:01:11.629 --> 00:01:14.310
stealth with their Memo robot. And we've all

00:01:14.310 --> 00:01:16.549
been hearing that promise of a Rosie the robot

00:01:16.549 --> 00:01:19.569
for, what, decades now? Right. A real helper

00:01:19.569 --> 00:01:22.109
around the house. It's finally starting to feel

00:01:22.109 --> 00:01:24.090
a little more tangible, and I think it's because

00:01:24.090 --> 00:01:26.750
they solved the data problem in a really different

00:01:26.750 --> 00:01:29.930
way. The huge roadblock for home robots has always

00:01:29.930 --> 00:01:32.739
been the data. Right. Getting good data on how

00:01:32.739 --> 00:01:35.319
humans actually do things. Exactly. How do you

00:01:35.319 --> 00:01:39.060
open that specific drawer or pick up a mug without,

00:01:39.140 --> 00:01:42.459
you know, crushing it? It's all about dexterity.

00:01:42.500 --> 00:01:45.079
And traditionally that meant using a super expensive

00:01:45.079 --> 00:01:48.700
teleop rig. Yeah. Where a skilled operator wears

00:01:48.700 --> 00:01:51.599
all this motion capture gear to control the robot

00:01:51.599 --> 00:01:54.659
remotely. It's precise, but it's also incredibly

00:01:54.659 --> 00:01:57.379
slow and costs a fortune per hour. So they just

00:01:57.379 --> 00:01:59.760
threw that whole idea out. Completely. They swapped

00:01:59.760 --> 00:02:03.819
it for the $200 skill-capture gloves. $200. Yeah.

00:02:04.599 --> 00:02:06.840
Real people just put them on in their own homes

00:02:06.840 --> 00:02:10.110
and do chores. Live their lives. Pretty much.

00:02:10.210 --> 00:02:13.370
They're clearing tables, folding laundry, loading

00:02:13.370 --> 00:02:16.949
dishwashers, even pulling espresso shots. It

00:02:16.949 --> 00:02:20.009
captures pure human motor skills, but cheaply

00:02:20.009 --> 00:02:22.509
and at a massive scale. That's brilliant. You're

00:02:22.509 --> 00:02:24.710
just crowdsourcing the training data from the

00:02:24.710 --> 00:02:26.830
8 billion people who already know how to do the

00:02:26.830 --> 00:02:29.229
job. And the sources say this method generated

00:02:29.229 --> 00:02:32.729
10 million episodes of data. 10 million. That's

00:02:32.729 --> 00:02:35.990
just, that is a massive shortcut. It is. It just

00:02:35.990 --> 00:02:39.169
rapidly accelerates how you teach a robot complex

00:02:39.169 --> 00:02:42.009
home dexterity. So how good is it? Well, the

00:02:42.009 --> 00:02:44.050
results are pretty impressive. Memo can do a

00:02:44.050 --> 00:02:46.889
full table-to-dishwasher run. It's a sequence

00:02:46.889 --> 00:02:50.050
of 68 different dexterous moves. And the big

00:02:50.050 --> 00:02:52.490
question. The big question. Across more than

00:02:52.490 --> 00:02:55.909
20 live demos, they recorded zero broken wine

00:02:55.909 --> 00:02:58.729
glasses. Wow. Okay, that's the real test. That's

00:02:58.729 --> 00:03:00.469
the real test. That kind of delicate handling

00:03:00.469 --> 00:03:02.979
tells you the data quality is there. But wait,

00:03:03.080 --> 00:03:05.060
if the gloves are that cheap, isn't filtering

00:03:05.060 --> 00:03:07.990
all that messy crowdsourced data? Just an absolute

00:03:07.990 --> 00:03:09.889
nightmare. I mean, couldn't cheap data just lead

00:03:09.889 --> 00:03:12.449
to cheap dexterity? That's the core challenge

00:03:12.449 --> 00:03:14.490
for sure. But look at the team. It's a bunch

00:03:14.490 --> 00:03:17.430
of Stanford PhDs and ex-Tesla FSD engineers.

00:03:17.770 --> 00:03:20.090
They lived through that whole millions of miles

00:03:20.090 --> 00:03:22.629
data collection problem. They know how to filter

00:03:22.629 --> 00:03:25.509
noise. They explicitly said they chose to leverage

00:03:25.509 --> 00:03:28.949
8 billion humans instead of millions of miles.

00:03:29.250 --> 00:03:31.370
OK, that background brings up some skepticism

00:03:31.370 --> 00:03:34.370
for me, though. Yeah. The FSD team was, let's

00:03:34.370 --> 00:03:37.219
be honest, famous for overpromising on timelines?

00:03:37.599 --> 00:03:41.199
Mm-hmm. Does that affect how we should see their

00:03:41.199 --> 00:03:44.979
late 2026 ship date? It definitely raises an

00:03:44.979 --> 00:03:47.939
eyebrow, but I think the domain is more constrained

00:03:47.939 --> 00:03:50.500
here than full self-driving. And their other

00:03:50.500 --> 00:03:53.300
big choice is maybe just as important. The design.

00:03:53.500 --> 00:03:56.819
The design. The robot is soft. It's round. It

00:03:56.819 --> 00:03:59.379
wears a little hat. So cuteness is now a technical

00:03:59.379 --> 00:04:02.280
feature. Exactly. Because it builds trust. If

00:04:02.280 --> 00:04:04.939
this little non-threatening robot makes a mistake

00:04:04.939 --> 00:04:07.300
while it's learning, you're more likely to forgive

00:04:07.300 --> 00:04:09.479
it and let it keep trying. Instead of unplugging

00:04:09.479 --> 00:04:11.240
it and throwing it in the closet. Right. User

00:04:11.240 --> 00:04:13.219
empathy actually becomes part of the training

00:04:13.219 --> 00:04:16.879
loop. So if we boil this down, leveraging crowdsourced

00:04:16.879 --> 00:04:19.779
human experience rapidly accelerates complex

00:04:19.779 --> 00:04:22.000
home dexterity. That's it. It just fundamentally

00:04:22.000 --> 00:04:24.439
changes the robotics timeline. Okay, let's shift

00:04:24.439 --> 00:04:28.279
from the kitchen counter to the wild world of

00:04:28.279 --> 00:04:30.300
AI rankings. Yeah, it's a high-stakes race.

00:04:30.420 --> 00:04:31.980
And these benchmarks, they used to be pretty

00:04:31.980 --> 00:04:34.800
stable. Not anymore. There's a new one, the Humane

00:04:34.800 --> 00:04:37.019
Benchmark, that just sent out some shockwaves.

00:04:37.199 --> 00:04:39.600
It's designed to test really modern stuff like

00:04:39.600 --> 00:04:42.100
advanced reasoning, handling images and text,

00:04:42.199 --> 00:04:46.060
and safety. And the big surprise? GPT came in

00:04:46.060 --> 00:04:48.759
at number eight. Number eight. I mean, that is

00:04:48.759 --> 00:04:51.720
shockingly low for a model that basically felt

00:04:51.720 --> 00:04:54.560
like the default for so long. If GPT is at eight,

00:04:54.639 --> 00:04:57.730
who's at the top? Top spot went to Gemini 2.5

00:04:57.730 --> 00:05:00.329
Pro. After that, it's a tight race. You've got Deep

00:05:00.329 --> 00:05:03.709
Seek, Mistral, and then, get this, Grok 4 and

00:05:03.709 --> 00:05:06.970
Grok 3 took spots 4 and 5. So the performance

00:05:06.970 --> 00:05:09.990
gap is just closing incredibly fast. Faster than

00:05:09.990 --> 00:05:12.110
ever. And these technical leaps are leading to

00:05:12.110 --> 00:05:14.250
some pretty wild spectacle too. You mean the

00:05:14.250 --> 00:05:16.810
Gemini 3 Pro ad on the Las Vegas Sphere? Yeah,

00:05:16.829 --> 00:05:19.069
that thing. People were genuinely arguing online

00:05:19.069 --> 00:05:21.370
about whether it was real footage or AI generated.

00:05:21.709 --> 00:05:23.509
Because the quality was just too good to tell.

00:05:24.379 --> 00:05:26.779
That line is getting very, very blurry. And it's

00:05:26.779 --> 00:05:29.769
not just visuals, it's speed. We just saw the

00:05:29.769 --> 00:05:32.509
first fully driverless race cars in Abu Dhabi.

00:05:32.529 --> 00:05:35.689
I saw that. Hitting 155 miles per hour, racing

00:05:35.689 --> 00:05:38.610
wheel to wheel. That's all algorithmic decision

00:05:38.610 --> 00:05:41.350
making at insane speeds. And then on the creative

00:05:41.350 --> 00:05:43.529
side, you have things like Google's Nano Banana

00:05:43.529 --> 00:05:46.649
Pro. Which is just doing crazy things with images.

00:05:47.410 --> 00:05:50.870
Fixing specific text inside a photo or blending,

00:05:50.970 --> 00:05:53.250
what, 14 different images together seamlessly.

00:05:53.610 --> 00:05:57.000
It's just... It's hard to even keep up. It really

00:05:57.000 --> 00:05:59.560
is. I have to say, even with all these powerful

00:05:59.560 --> 00:06:02.420
new tools, I still wrestle with prompt drift

00:06:02.420 --> 00:06:05.050
myself. Oh, really? Yeah. I'll find a model,

00:06:05.089 --> 00:06:07.350
gives me perfect output for like the first week

00:06:07.350 --> 00:06:09.629
it's out. Then a month later, the consistency

00:06:09.629 --> 00:06:12.410
just degrades and I'm back to reengineering my

00:06:12.410 --> 00:06:14.990
prompts all over again. Getting reliable output

00:06:14.990 --> 00:06:18.110
is still a huge challenge. That inconsistency

00:06:18.110 --> 00:06:20.550
makes the whole infrastructure problem even worse,

00:06:20.649 --> 00:06:22.649
doesn't it? We're talking about massive amounts

00:06:22.649 --> 00:06:24.930
of energy to run these things. Which brings us

00:06:24.930 --> 00:06:27.490
back to meta. Bloomberg is reporting that because

00:06:27.490 --> 00:06:29.449
the grid is struggling to keep up with their

00:06:29.449 --> 00:06:32.410
AI hunger, they've asked the feds for permission

00:06:32.410 --> 00:06:34.829
to trade electricity. They're trying to become

00:06:34.829 --> 00:06:37.170
an energy company. Essentially, yeah. A utility

00:06:37.170 --> 00:06:39.410
company just to service their own data centers.

00:06:39.730 --> 00:06:42.170
They're so power hungry, they're trying to shape

00:06:42.170 --> 00:06:44.769
national energy strategy just to keep the lights

00:06:44.769 --> 00:06:47.209
on for their models. Wow. So if we connect the

00:06:47.209 --> 00:06:50.569
dots here, the hidden cost of scaling these huge

00:06:50.569 --> 00:06:54.129
models is that AI is now powerful enough to dictate

00:06:54.129 --> 00:06:57.529
national energy infrastructure strategy. It changes

00:06:57.529 --> 00:07:00.449
the entire planning horizon for the energy sector.

00:07:00.670 --> 00:07:03.480
We hear the term open source. A lot in AI. And

00:07:03.480 --> 00:07:05.800
for a while now, that's usually just meant one

00:07:05.800 --> 00:07:08.660
thing, sharing the model weights. Right. And

00:07:08.660 --> 00:07:10.500
we should probably define that. The weights are

00:07:10.500 --> 00:07:12.860
just the final set of numbers that make the model

00:07:12.860 --> 00:07:15.360
work. It's the finished product, but not the

00:07:15.360 --> 00:07:18.259
recipe. Exactly. So if you're a researcher trying

00:07:18.259 --> 00:07:21.620
to figure out why a model is biased, just having

00:07:21.620 --> 00:07:23.959
the weights is not enough. It's a black box.

00:07:24.199 --> 00:07:26.319
You can use it, but you can't truly understand

00:07:26.319 --> 00:07:29.459
it. Which is why this new release from the AI2

00:07:29.459 --> 00:07:33.910
team, Olmo 3, is such a big deal. They've... They've

00:07:33.910 --> 00:07:35.930
radically redefined the term. They're providing

00:07:35.930 --> 00:07:38.230
everything, the full training data, every line

00:07:38.230 --> 00:07:40.449
of code, every training checkpoint. The checkpoints

00:07:40.449 --> 00:07:43.430
are like saved games, right? The model at different

00:07:43.430 --> 00:07:45.949
stages of learning. Exactly. And every decision

00:07:45.949 --> 00:07:48.269
they made along the way. It's a full-on

00:07:48.269 --> 00:07:51.170
anti-black-box effort. It's all about traceability.

00:07:51.589 --> 00:07:53.670
That is a level of transparency we just haven't

00:07:53.670 --> 00:07:55.910
seen. It's like, imagine if Tesla didn't just

00:07:55.910 --> 00:07:58.189
give you the car blueprints, but handed over

00:07:58.189 --> 00:08:00.910
the entire assembly line. Okay. The notes from

00:08:00.910 --> 00:08:02.949
the engineers, the faulty parts they threw out.

00:08:03.019 --> 00:08:05.959
Every software tweak, it completely opens up

00:08:05.959 --> 00:08:08.519
the science behind it. Let's talk about the actual

00:08:08.519 --> 00:08:11.980
models. They released Olmo 3 Think, a 32 billion

00:08:11.980 --> 00:08:14.740
parameter model. And that one is focused specifically

00:08:14.740 --> 00:08:16.459
on chain of thought reasoning, so it's perfect

00:08:16.459 --> 00:08:18.319
for researchers. And you have the base models,

00:08:18.500 --> 00:08:22.680
a 7B and a 32B. And these have a huge 65,000-token

00:08:22.680 --> 00:08:25.769
context window. That's 16 times bigger than the

00:08:25.769 --> 00:08:28.529
last version, so they can handle enormous documents

00:08:28.529 --> 00:08:31.250
for coding, math, comprehension. And there's

00:08:31.250 --> 00:08:34.669
also a smaller 7B instruct model, right? Tuned

00:08:34.669 --> 00:08:37.190
for chat and using tools, which is great for

00:08:37.190 --> 00:08:39.690
running locally. Yeah, but what's really revolutionary

00:08:39.690 --> 00:08:42.330
is the cost efficiency here. Right. The sources

00:08:42.330 --> 00:08:45.629
say the 32B model rivals the performance of something

00:08:45.629 --> 00:08:49.610
like Qwen. But it used six times fewer training

00:08:49.610 --> 00:08:52.409
tokens to get there. Six times. That's an incredible

00:08:52.409 --> 00:08:54.730
reduction in resources. Which means the training

00:08:54.730 --> 00:08:57.730
cost was only about two point two million dollars.

00:08:57.789 --> 00:09:00.929
That's... that's astonishingly cheap for a model

00:09:00.929 --> 00:09:03.110
this capable. It's like building a high-performance

00:09:03.110 --> 00:09:05.549
rocket on a motorcycle budget. It just shows

00:09:05.549 --> 00:09:08.669
what smart data curation can do. Whoa. I mean,

00:09:08.669 --> 00:09:11.509
imagine scaling that kind of full traceability

00:09:11.509 --> 00:09:14.809
to a billion queries, knowing exactly why you

00:09:14.809 --> 00:09:16.710
got a certain output, not just guessing. And

00:09:16.710 --> 00:09:19.210
it's all out there now. The AI2 Playground, Hugging

00:09:19.210 --> 00:09:21.429
Face, you can run it locally. And it's all under

00:09:21.429 --> 00:09:24.169
the Apache 2.0 license. That permissive license

00:09:24.169 --> 00:09:27.899
is key. It is. But that ability to trace a response

00:09:27.899 --> 00:09:30.320
all the way back to the original training recipe?

00:09:30.970 --> 00:09:33.950
That's the unique part. Researchers can now dig

00:09:33.950 --> 00:09:36.710
in and see exactly where a bias comes from or

00:09:36.710 --> 00:09:39.409
why the model learned a certain thing. So if

00:09:39.409 --> 00:09:42.909
you look past just the low cost, the single biggest

00:09:42.909 --> 00:09:46.470
benefit here is that complete transparency allows

00:09:46.470 --> 00:09:49.149
for critical investigation and much faster model

00:09:49.149 --> 00:09:51.549
improvement. It just eliminates all the guesswork.

00:09:51.769 --> 00:09:54.710
So we've really covered three huge shifts today.

00:09:54.870 --> 00:09:57.850
We have, first, robotics found a way to achieve

00:09:57.850 --> 00:10:00.500
complex dexterity, not with expensive... gear,

00:10:00.700 --> 00:10:03.539
but with a clever, cheap, crowdsourced data trick,

00:10:03.679 --> 00:10:07.740
the $200 gloves. Right. Then second, the AI performance

00:10:07.740 --> 00:10:09.919
race is shifting the whole landscape and putting

00:10:09.919 --> 00:10:13.120
this just unbelievable strain on our energy grid.

00:10:13.299 --> 00:10:15.419
Meta trying to become an energy trader is the

00:10:15.419 --> 00:10:17.419
perfect example of that. And finally, the very

00:10:17.419 --> 00:10:20.000
definition of open source has been expanded in

00:10:20.000 --> 00:10:21.779
a radical way. Yeah, showing that efficiency

00:10:21.779 --> 00:10:24.360
and transparency can actually rival sheer scale.

00:10:24.440 --> 00:10:27.159
These three things, dexterity, strain, and transparency,

00:10:27.320 --> 00:10:29.440
they're all colliding right now. And that collision

00:10:29.440 --> 00:10:31.779
is going to define the next few years in tech.

00:10:31.919 --> 00:10:33.659
That's the big takeaway. So here's a thought

00:10:33.659 --> 00:10:36.159
to leave you with. If these highly efficient,

00:10:36.320 --> 00:10:40.179
fully traceable models like Olmo 3 become the standard

00:10:40.179 --> 00:10:44.460
models that cost just a couple million dollars

00:10:44.460 --> 00:10:48.220
to train and give you total insight, how quickly

00:10:48.220 --> 00:10:51.659
do the old proprietary black box AIs become obsolete?

00:10:51.919 --> 00:10:54.559
The ones that demand Meta-sized power grids

00:10:54.559 --> 00:10:56.860
start to look really expensive and you can't

00:10:56.860 --> 00:10:59.200
even see inside them. Something for you to consider.

00:10:59.360 --> 00:11:01.159
That's a really important question for where

00:11:01.159 --> 00:11:02.779
this whole industry is headed. Thank you for

00:11:02.779 --> 00:11:04.740
joining us for this deep dive. We look forward

00:11:04.740 --> 00:11:06.779
to diving into the next stack of sources with

00:11:06.779 --> 00:11:06.899
you.
