1
00:00:00,000 --> 00:00:02,040
Okay, so get ready because this deep dive,

2
00:00:02,040 --> 00:00:05,240
well, it's going to take us way past just chatbots.

3
00:00:05,240 --> 00:00:07,240
We're stepping into the future of AI,

4
00:00:07,240 --> 00:00:09,120
like way into the future.

5
00:00:09,120 --> 00:00:11,200
Spatial intelligence.

6
00:00:11,200 --> 00:00:14,040
It's not just about AI processing language anymore.

7
00:00:14,040 --> 00:00:17,440
It's about AI understanding the world,

8
00:00:17,440 --> 00:00:19,320
like in 3D, just like we do.

9
00:00:19,320 --> 00:00:21,440
Like teaching a computer to actually see,

10
00:00:21,440 --> 00:00:22,680
not just process an image,

11
00:00:22,680 --> 00:00:24,680
but understand what it's looking at, the depth.

12
00:00:24,680 --> 00:00:25,760
Exactly, yeah.

13
00:00:25,760 --> 00:00:27,280
And to help us wrap our heads around this,

14
00:00:27,280 --> 00:00:30,760
we've got some serious AI rock stars in our corner today.

15
00:00:30,760 --> 00:00:32,720
Fei-Fei Li and Justin Johnson.

16
00:00:32,720 --> 00:00:34,640
Oh, wow, the World Labs founders.

17
00:00:34,640 --> 00:00:37,040
Yep, the masterminds behind World Labs,

18
00:00:37,040 --> 00:00:38,640
and they've even got Martin Casado

19
00:00:38,640 --> 00:00:40,800
from a16z jamming in.

20
00:00:40,800 --> 00:00:41,840
This is big.

21
00:00:41,840 --> 00:00:43,360
What's got you so hyped?

22
00:00:43,360 --> 00:00:45,640
It's the way they're framing this whole thing.

23
00:00:45,640 --> 00:00:48,880
They're saying, forget what you thought you knew about AI.

24
00:00:48,880 --> 00:00:51,960
This spatial intelligence, this 3D understanding,

25
00:00:51,960 --> 00:00:54,400
this is the real deal, the real game changer.

26
00:00:54,400 --> 00:00:56,200
So they're saying move over language models,

27
00:00:56,200 --> 00:00:58,080
there's something bigger on the horizon.

28
00:00:58,080 --> 00:01:00,120
Exactly, they're basically saying,

29
00:01:00,120 --> 00:01:02,240
forget the first few chapters of AI,

30
00:01:02,240 --> 00:01:04,480
we're diving straight into the climax.

31
00:01:04,480 --> 00:01:05,640
It's interesting though,

32
00:01:05,640 --> 00:01:07,520
how they bring it back to those early days,

33
00:01:07,520 --> 00:01:08,640
the AI winters.

34
00:01:08,640 --> 00:01:09,480
Oh, right.

35
00:01:09,480 --> 00:01:12,120
Fei-Fei and Justin, they've been around since then,

36
00:01:12,120 --> 00:01:14,560
seeing the rise of deep learning firsthand.

37
00:01:14,560 --> 00:01:17,440
Oh, so they've seen it all, the ups and the downs.

38
00:01:17,440 --> 00:01:19,240
Yeah, and they were a huge part of it.

39
00:01:19,240 --> 00:01:22,160
Remember ImageNet? It's not just some tech term,

40
00:01:22,160 --> 00:01:23,240
that was their baby.

41
00:01:23,240 --> 00:01:26,720
ImageNet, the massive data set of labeled pictures.

42
00:01:26,720 --> 00:01:28,880
That's the one, that data set,

43
00:01:28,880 --> 00:01:30,760
it's like they taught computers how to see.

44
00:01:30,760 --> 00:01:32,920
Yeah, it was a game changer for sure.

45
00:01:32,920 --> 00:01:35,240
But this whole data thing, it's crucial, right?

46
00:01:35,240 --> 00:01:36,640
I mean, these AI models,

47
00:01:36,640 --> 00:01:39,200
they need massive amounts of data to function.

48
00:01:39,200 --> 00:01:40,280
Huge amounts, it's crazy.

49
00:01:40,280 --> 00:01:43,480
They call data the new oil for a reason.

50
00:01:43,480 --> 00:01:46,080
And it's even more true with spatial intelligence.

51
00:01:46,080 --> 00:01:48,960
Think about it, to just identify an object,

52
00:01:48,960 --> 00:01:50,640
you need tons of data, right?

53
00:01:50,640 --> 00:01:53,560
But to then understand its shape, its texture,

54
00:01:53,560 --> 00:01:54,920
how it moves in the world.

55
00:01:54,920 --> 00:01:57,000
You're talking about an entirely different level

56
00:01:57,000 --> 00:01:58,160
of complexity.

57
00:01:58,160 --> 00:02:01,200
And data demands.

58
00:02:01,200 --> 00:02:02,920
And that's where things get really interesting,

59
00:02:02,920 --> 00:02:06,000
because here's where we hit this fork in the road.

60
00:02:06,000 --> 00:02:08,720
You've got language models, powerful as heck,

61
00:02:08,720 --> 00:02:11,880
but they're stuck in this one-dimensional world,

62
00:02:11,880 --> 00:02:13,640
the world of words in a line.

63
00:02:13,640 --> 00:02:14,520
I see what you're saying,

64
00:02:14,520 --> 00:02:16,000
but isn't that how we communicate?

65
00:02:16,000 --> 00:02:17,720
Yeah, but the world we live in,

66
00:02:17,720 --> 00:02:19,640
it's anything but a straight line.

67
00:02:19,640 --> 00:02:21,480
It's full of dimensions.

68
00:02:21,480 --> 00:02:24,200
So you're saying language is just one slice of the pie.

69
00:02:24,200 --> 00:02:26,000
It's like, think about it this way,

70
00:02:26,000 --> 00:02:29,320
reading a recipe versus actually baking a cake.

71
00:02:29,320 --> 00:02:32,120
One's information, the other's experience.

72
00:02:32,120 --> 00:02:33,560
Okay, I get you.

73
00:02:33,560 --> 00:02:37,320
So spatial intelligence, it's about bridging that gap.

74
00:02:37,320 --> 00:02:39,200
That's where spatial intelligence comes in.

75
00:02:39,200 --> 00:02:40,480
It's about bridging that gap

76
00:02:40,480 --> 00:02:43,120
between information and experience.

77
00:02:43,120 --> 00:02:44,440
Okay, I like where this is going.

78
00:02:44,440 --> 00:02:45,320
And it makes you think, right?

79
00:02:45,320 --> 00:02:47,720
Like, remember the first time you tried VR, those headsets?

80
00:02:47,720 --> 00:02:49,200
Oh yeah, how could I forget?

81
00:02:49,200 --> 00:02:51,120
It's like you were transported somewhere else.

82
00:02:51,120 --> 00:02:52,240
Spatial intelligence,

83
00:02:52,240 --> 00:02:54,600
it could unlock a whole new level of that,

84
00:02:54,600 --> 00:02:57,000
a whole new world of immersive experiences.

85
00:02:57,000 --> 00:02:58,920
Like, we wouldn't even be able to tell the difference

86
00:02:58,920 --> 00:03:00,960
between digital and physical anymore.

87
00:03:00,960 --> 00:03:03,400
I mean, blurring the lines between the real world

88
00:03:03,400 --> 00:03:04,280
and the virtual world

89
00:03:04,280 --> 00:03:06,320
sounds like something straight out of a movie,

90
00:03:06,320 --> 00:03:08,040
but what would that even look like?

91
00:03:08,040 --> 00:03:10,060
Okay, so we're not just talking about, like,

92
00:03:10,060 --> 00:03:11,240
better video games here.

93
00:03:11,240 --> 00:03:12,640
Though those are definitely coming.

94
00:03:12,640 --> 00:03:14,040
Oh yeah, for sure.

95
00:03:14,040 --> 00:03:17,920
But imagine a world where robots can navigate our homes

96
00:03:17,920 --> 00:03:20,760
as easily as we can. AR?

97
00:03:20,760 --> 00:03:22,840
Forget those clunky overlays.

98
00:03:22,840 --> 00:03:26,200
We're talking seamlessly integrated digital elements

99
00:03:26,200 --> 00:03:27,560
into our world.

100
00:03:27,560 --> 00:03:32,000
And experiences, experiences in 3D, tailored just for you.

101
00:03:32,000 --> 00:03:33,480
It's like stepping into the future.

102
00:03:33,480 --> 00:03:34,640
But what about right now?

103
00:03:34,640 --> 00:03:36,960
Where does World Labs fit into all of this?

104
00:03:36,960 --> 00:03:39,920
Right, so they're on it, and they're not messing around.

105
00:03:39,920 --> 00:03:42,760
Fei-Fei and Justin, they put together this dream team,

106
00:03:42,760 --> 00:03:44,440
bringing in the best of the best,

107
00:03:44,440 --> 00:03:47,440
computer vision, robotics, 3D modeling, you name it,

108
00:03:47,440 --> 00:03:48,280
they've got it covered.

109
00:03:48,280 --> 00:03:49,800
Sounds like they're going all in on this.

110
00:03:49,800 --> 00:03:52,320
They're on a mission to unlock spatial intelligence,

111
00:03:52,320 --> 00:03:54,160
like the full potential of it.

112
00:03:54,160 --> 00:03:56,720
And they say now is the time to do it.

113
00:03:56,720 --> 00:03:58,520
This isn't some far-off thing.

114
00:03:58,520 --> 00:04:01,560
They've got these concrete milestones they're aiming for.

115
00:04:01,560 --> 00:04:04,400
So they have a plan, a roadmap to make this happen.

116
00:04:04,400 --> 00:04:06,520
What kind of milestones are we talking about here?

117
00:04:06,520 --> 00:04:09,280
Well, for one, they're talking about creating a system

118
00:04:09,280 --> 00:04:12,880
that can generate complete interactive 3D worlds.

119
00:04:12,880 --> 00:04:14,440
Not just static models,

120
00:04:14,440 --> 00:04:17,260
but worlds you can actually step into and interact with.

121
00:04:17,260 --> 00:04:18,520
Like in real time.

122
00:04:18,520 --> 00:04:20,280
Wow, you're blowing my mind here.

123
00:04:20,280 --> 00:04:24,480
So instead of just creating a 3D model of say, a building,

124
00:04:24,480 --> 00:04:26,720
you'd be able to walk through it, open doors,

125
00:04:26,720 --> 00:04:27,800
turn on lights.

126
00:04:27,800 --> 00:04:29,360
Yeah, like it's the real deal.

127
00:04:29,360 --> 00:04:32,280
Think about the implications for things like architecture,

128
00:04:32,280 --> 00:04:33,920
gaming, it could be huge.

129
00:04:33,920 --> 00:04:35,600
This is bigger than I even imagined.

130
00:04:35,600 --> 00:04:39,280
It's like we're on the verge of something massive.

131
00:04:39,280 --> 00:04:41,320
But hold on a second, this all sounds incredible,

132
00:04:41,320 --> 00:04:43,000
but isn't there a catch?

133
00:04:43,000 --> 00:04:44,560
Building these 3D worlds,

134
00:04:44,560 --> 00:04:46,780
understanding them at that level of detail,

135
00:04:46,780 --> 00:04:49,840
it must require a crazy amount of data, right?

136
00:04:49,840 --> 00:04:51,760
You're hitting on one of their biggest challenges.

137
00:04:51,760 --> 00:04:53,920
It's not just about teaching AI

138
00:04:53,920 --> 00:04:56,200
to recognize an object in a picture.

139
00:04:56,200 --> 00:04:58,600
We're talking about understanding its entire form,

140
00:04:58,600 --> 00:05:01,460
the texture, how it moves in 3D space.

141
00:05:01,460 --> 00:05:03,420
So how do they even begin to tackle that?

142
00:05:03,420 --> 00:05:05,880
It's not like they can create a 3D scan

143
00:05:05,880 --> 00:05:07,500
of the entire planet.

144
00:05:07,500 --> 00:05:09,380
That's where it gets even wilder.

145
00:05:09,380 --> 00:05:11,580
They're looking at creating these massive,

146
00:05:11,580 --> 00:05:14,160
synthetic 3D environments.

147
00:05:14,160 --> 00:05:16,240
Synthetic, like not real.

148
00:05:16,240 --> 00:05:17,480
Exactly.

149
00:05:17,480 --> 00:05:19,200
Think of it like a digital playground.

150
00:05:19,200 --> 00:05:21,260
They can control everything.

151
00:05:21,260 --> 00:05:23,320
How light interacts with objects,

152
00:05:23,320 --> 00:05:24,480
the physics of movement.

153
00:05:24,480 --> 00:05:26,400
It's like they're building a virtual stage

154
00:05:26,400 --> 00:05:28,960
to rehearse for the main event in the real world.

155
00:05:28,960 --> 00:05:29,920
Precisely.

156
00:05:29,920 --> 00:05:32,440
Instead of trying to capture every single detail

157
00:05:32,440 --> 00:05:33,280
of the real world.

158
00:05:33,280 --> 00:05:34,400
Which would be impossible, by the way.

159
00:05:34,400 --> 00:05:35,220
Totally.

160
00:05:35,220 --> 00:05:36,560
They're creating these controlled environments

161
00:05:36,560 --> 00:05:37,720
to train their models.

162
00:05:37,720 --> 00:05:41,240
It's like a blend of computer vision and AI imagination.

163
00:05:41,240 --> 00:05:43,120
It's a fascinating mix.

164
00:05:43,120 --> 00:05:44,960
And it lets them experiment at a scale

165
00:05:44,960 --> 00:05:47,040
that would be impossible in the real world.

166
00:05:47,040 --> 00:05:47,880
I bet.

167
00:05:47,880 --> 00:05:49,120
Build a virtual city one day,

168
00:05:49,120 --> 00:05:50,840
try out different weather conditions the next.

169
00:05:50,840 --> 00:05:51,680
Exactly.

170
00:05:51,680 --> 00:05:55,040
Even populate it with AI-controlled people and cars.

171
00:05:55,040 --> 00:05:56,160
It's like SimCity,

172
00:05:56,160 --> 00:05:58,360
but for serious scientific advancement.

173
00:05:58,360 --> 00:06:00,260
This is next level stuff.

174
00:06:00,260 --> 00:06:03,080
But how do we know this isn't all just pie in the sky?

175
00:06:03,080 --> 00:06:05,480
Are there any real world examples

176
00:06:05,480 --> 00:06:07,960
of this kind of technology in action today?

177
00:06:07,960 --> 00:06:09,120
Actually, yeah.

178
00:06:09,120 --> 00:06:12,840
Think about those latest, most realistic video games.

179
00:06:12,840 --> 00:06:14,240
Those incredible graphics,

180
00:06:14,240 --> 00:06:16,800
the details, the immersive worlds.

181
00:06:16,800 --> 00:06:20,120
Gamers spend hours in those virtual environments

182
00:06:20,120 --> 00:06:23,160
and they feel real the way you can interact with things.

183
00:06:23,160 --> 00:06:24,120
And those games,

184
00:06:24,120 --> 00:06:27,200
they're built on really sophisticated 3D engines.

185
00:06:27,200 --> 00:06:29,740
The same tech that World Labs thinks could be used

186
00:06:29,740 --> 00:06:30,920
for spatial intelligence.

187
00:06:30,920 --> 00:06:32,520
So the building blocks are already there.

188
00:06:32,520 --> 00:06:33,400
Exactly.

189
00:06:33,400 --> 00:06:34,960
And it's not limited to just entertainment.

190
00:06:34,960 --> 00:06:38,000
Think architecture, engineering, manufacturing.

191
00:06:38,000 --> 00:06:40,960
They're already using 3D modeling and simulation.

192
00:06:40,960 --> 00:06:42,040
But spatial intelligence,

193
00:06:42,040 --> 00:06:44,320
that takes it up like 10 notches.

194
00:06:44,320 --> 00:06:46,160
It's like we're moving from a world of screens

195
00:06:46,160 --> 00:06:49,160
and keyboards to a world where we can reach into the digital

196
00:06:49,160 --> 00:06:50,880
and manipulate it like it's real.

197
00:06:50,880 --> 00:06:52,440
And that's where that collaborative approach

198
00:06:52,440 --> 00:06:54,120
they're so big on comes in, right?

199
00:06:54,120 --> 00:06:56,060
They're bringing in not just computer scientists,

200
00:06:56,060 --> 00:06:59,420
but robotics experts, even cognitive psychologists.

201
00:06:59,420 --> 00:07:03,440
They know that to build AI that really gets our world,

202
00:07:03,440 --> 00:07:06,800
they need to understand how we experience it as humans.

203
00:07:06,800 --> 00:07:08,060
Makes sense.

204
00:07:08,060 --> 00:07:09,880
But this is a massive undertaking.

205
00:07:09,880 --> 00:07:11,580
How do they even measure success?

206
00:07:11,580 --> 00:07:13,800
It's not like there's a finish line they're gonna cross.

207
00:07:13,800 --> 00:07:14,640
You're right.

208
00:07:14,640 --> 00:07:16,800
They talk about it more in terms of milestones,

209
00:07:16,800 --> 00:07:18,420
not endpoints.

210
00:07:18,420 --> 00:07:20,960
Like one of their first goals is to build a system

211
00:07:20,960 --> 00:07:24,000
that can generate a complete 3D world.

212
00:07:24,000 --> 00:07:25,520
And not just visually,

213
00:07:25,520 --> 00:07:27,560
but one that behaves realistically too.

214
00:07:27,560 --> 00:07:29,960
Okay, so not just a 3D model of a building,

215
00:07:29,960 --> 00:07:32,480
but an environment where you can open doors,

216
00:07:32,480 --> 00:07:34,600
turn on lights, really experience it.

217
00:07:34,600 --> 00:07:35,440
Yes.

218
00:07:35,440 --> 00:07:36,880
And it's not pre-programmed.

219
00:07:36,880 --> 00:07:39,620
Those interactions are driven by the AI's understanding

220
00:07:39,620 --> 00:07:42,080
of how things work in the real world.

221
00:07:42,080 --> 00:07:44,080
That level of fidelity, that interaction,

222
00:07:44,080 --> 00:07:45,080
it's mind blowing.

223
00:07:45,080 --> 00:07:47,240
It's like taking VR to a whole new level.

224
00:07:47,240 --> 00:07:49,560
What if instead of just escaping into these worlds,

225
00:07:49,560 --> 00:07:52,580
we could use this technology to actually shape our own?

226
00:07:52,580 --> 00:07:53,420
Now you're thinking,

227
00:07:53,420 --> 00:07:55,720
it's like we could give architects and engineers

228
00:07:55,720 --> 00:07:57,240
these incredible tools, right?

229
00:07:57,240 --> 00:08:00,480
Let them design and test their ideas in a virtual world

230
00:08:00,480 --> 00:08:03,240
first before building anything in the real one.

231
00:08:03,240 --> 00:08:04,520
Exactly.

232
00:08:04,520 --> 00:08:06,120
No more expensive prototypes

233
00:08:06,120 --> 00:08:08,760
or oops, we forgot something, moments.

234
00:08:08,760 --> 00:08:10,040
And think about education.

235
00:08:10,040 --> 00:08:13,120
Imagine if instead of just reading about ancient Rome,

236
00:08:13,120 --> 00:08:15,180
students could actually walk through the forum,

237
00:08:15,180 --> 00:08:16,520
talk to digital citizens.

238
00:08:16,520 --> 00:08:17,360
That's wild.

239
00:08:17,360 --> 00:08:18,760
History would come alive.

240
00:08:18,760 --> 00:08:21,080
It'd be like stepping right into the past.

241
00:08:21,080 --> 00:08:23,440
It's taking learning to a whole other level,

242
00:08:23,440 --> 00:08:26,360
making it immersive, engaging.

243
00:08:26,360 --> 00:08:28,720
But with all this talk about the amazing things

244
00:08:28,720 --> 00:08:30,760
this technology could do,

245
00:08:30,760 --> 00:08:32,640
do they ever talk about the downsides?

246
00:08:32,640 --> 00:08:34,320
I mean, this is powerful stuff.

247
00:08:34,320 --> 00:08:35,720
Couldn't it be misused?

248
00:08:35,720 --> 00:08:36,840
They definitely address that.

249
00:08:36,840 --> 00:08:40,080
They say any powerful technology has risks,

250
00:08:40,080 --> 00:08:41,960
but they seem really focused on making sure

251
00:08:41,960 --> 00:08:43,920
this is developed responsibly.

252
00:08:43,920 --> 00:08:46,080
Safeguards against bias, things like that.

253
00:08:46,080 --> 00:08:47,960
So they're thinking about the ethical side of it too.

254
00:08:47,960 --> 00:08:51,320
Yeah, they're big on transparency, accountability,

255
00:08:51,320 --> 00:08:53,160
bringing in experts from different fields

256
00:08:53,160 --> 00:08:54,760
to have those tough conversations.

257
00:08:54,760 --> 00:08:56,600
They wanna make sure it benefits everyone.

258
00:08:56,600 --> 00:08:58,120
That's really good to hear

259
00:08:58,120 --> 00:09:00,920
because it's not just about pushing technological boundaries,

260
00:09:00,920 --> 00:09:02,920
it's about making sure those advancements

261
00:09:02,920 --> 00:09:05,120
actually make the world a better place.

262
00:09:05,120 --> 00:09:05,960
That's exactly it.

263
00:09:05,960 --> 00:09:07,880
It's been an amazing conversation.

264
00:09:07,880 --> 00:09:10,400
I mean, this whole deep dive, it really makes you think

265
00:09:10,400 --> 00:09:12,240
it's not just about the technology itself,

266
00:09:12,240 --> 00:09:14,520
but how it could change our lives,

267
00:09:14,520 --> 00:09:17,160
how we work, how we learn, everything.

268
00:09:17,160 --> 00:09:18,000
It really is.

269
00:09:18,000 --> 00:09:19,880
It's a story about what humans can do

270
00:09:19,880 --> 00:09:22,080
when they put their minds together, you know?

271
00:09:22,080 --> 00:09:25,400
The creativity, the drive to solve complex problems.

272
00:09:25,400 --> 00:09:26,480
It's inspiring.

273
00:09:26,480 --> 00:09:29,240
And it makes you wonder if we can teach machines

274
00:09:29,240 --> 00:09:32,080
to see the world like we do, what else is possible?

275
00:09:32,080 --> 00:09:33,560
It's a question worth asking.

276
00:09:33,560 --> 00:09:36,360
So to our listeners, we'll leave you with that thought.

277
00:09:36,360 --> 00:09:38,040
What do you think this new era

278
00:09:38,040 --> 00:09:40,240
of spatial intelligence will bring?

279
00:09:40,240 --> 00:09:42,840
Will it be the ultimate challenge for AI?

280
00:09:42,840 --> 00:09:44,640
And what will it mean for our future?

281
00:09:44,640 --> 00:09:46,920
Keep those questions in mind as we keep diving deeper

282
00:09:46,920 --> 00:10:03,920
into these cutting-edge technologies.

