1
00:00:00,000 --> 00:00:03,100
Hey everyone and welcome back for another deep dive.

2
00:00:03,100 --> 00:00:05,280
Today we're gonna be looking at something really interesting,

3
00:00:05,280 --> 00:00:08,360
something that could change how we think about AI.

4
00:00:08,360 --> 00:00:10,200
Yeah, this is a really cool paper.

5
00:00:10,200 --> 00:00:14,240
It's all about making AI act more human, more like us.

6
00:00:14,240 --> 00:00:15,080
Exactly.

7
00:00:15,080 --> 00:00:19,120
We're diving into simulating human-like daily activities

8
00:00:19,120 --> 00:00:21,080
with desire-driven autonomy.

9
00:00:21,080 --> 00:00:23,480
And it's got some pretty fascinating ideas.

10
00:00:23,480 --> 00:00:24,560
It really does.

11
00:00:24,560 --> 00:00:25,400
I mean, think about it.

12
00:00:25,400 --> 00:00:26,240
Yeah.

13
00:00:26,240 --> 00:00:29,520
We're so used to AI that's basically task-oriented, right?

14
00:00:29,520 --> 00:00:30,360
Like tell it what to do.

15
00:00:30,360 --> 00:00:31,200
Yeah.

16
00:00:31,200 --> 00:00:32,040
And it does it.

17
00:00:32,040 --> 00:00:32,880
Yeah.

18
00:00:32,880 --> 00:00:33,720
But this paper, this is different.

19
00:00:33,720 --> 00:00:37,640
It's about AI that's driven by its own internal motivations.

20
00:00:37,640 --> 00:00:40,440
Almost like it has its own desires and needs.

21
00:00:40,440 --> 00:00:43,160
It's giving AI its own personality in a way.

22
00:00:43,160 --> 00:00:44,960
Yeah, and that's a huge shift.

23
00:00:44,960 --> 00:00:45,800
Huge.

24
00:00:45,800 --> 00:00:48,400
Because if you think about how we humans work,

25
00:00:48,400 --> 00:00:50,880
we don't just go through life checking off a to-do list.

26
00:00:50,880 --> 00:00:51,720
Yeah.

27
00:00:51,720 --> 00:00:55,480
Our actions are driven by all sorts of internal factors,

28
00:00:55,480 --> 00:00:58,400
our needs, our desires, our moods.

29
00:00:58,400 --> 00:01:00,440
Even just how we're feeling that day.

30
00:01:00,440 --> 00:01:00,940
Right.

31
00:01:00,940 --> 00:01:03,520
I might decide to skip the gym if I'm feeling tired.

32
00:01:03,520 --> 00:01:04,280
Exactly.

33
00:01:04,280 --> 00:01:07,200
Maybe your desire to just relax on the couch

34
00:01:07,200 --> 00:01:09,560
overrides your desire to exercise.

35
00:01:09,560 --> 00:01:12,360
And that's what's missing in most AI today.

36
00:01:12,360 --> 00:01:14,160
They're great at following instructions,

37
00:01:14,160 --> 00:01:15,560
but they don't really want anything.

38
00:01:15,560 --> 00:01:16,060
Exactly.

39
00:01:16,060 --> 00:01:18,080
They lack that intrinsic motivation.

40
00:01:18,080 --> 00:01:18,580
Yeah.

41
00:01:18,580 --> 00:01:21,320
And this is where this paper, this desire-driven autonomy

42
00:01:21,320 --> 00:01:21,880
thing comes in.

43
00:01:21,880 --> 00:01:22,480
D2A.

44
00:01:22,480 --> 00:01:23,120
D2A.

45
00:01:23,120 --> 00:01:23,440
Right.

46
00:01:23,440 --> 00:01:26,720
So how do they actually give an AI wants and needs?

47
00:01:26,720 --> 00:01:28,440
Well, they've come up with this framework.

48
00:01:28,440 --> 00:01:31,760
It's called D2A, which stands for Desire-Driven Autonomous

49
00:01:31,760 --> 00:01:32,560
Agent.

50
00:01:32,560 --> 00:01:34,280
And get this, it's actually inspired

51
00:01:34,280 --> 00:01:36,680
by Maslow's Hierarchy of Needs.

52
00:01:36,680 --> 00:01:37,200
Oh, wow.

53
00:01:37,200 --> 00:01:38,520
Maslow's Hierarchy of Needs.

54
00:01:38,520 --> 00:01:39,480
That takes me back.

55
00:01:39,480 --> 00:01:39,980
Right.

56
00:01:39,980 --> 00:01:42,080
That pyramid that shows all the different levels

57
00:01:42,080 --> 00:01:43,120
of human needs.

58
00:01:43,120 --> 00:01:43,620
Yeah.

59
00:01:43,620 --> 00:01:44,120
Yeah.

60
00:01:44,120 --> 00:01:47,840
From basic survival stuff all the way up to like self-actualization.

61
00:01:47,840 --> 00:01:48,480
Exactly.

62
00:01:48,480 --> 00:01:51,960
So they're essentially giving AI a virtual version

63
00:01:51,960 --> 00:01:52,800
of that pyramid.

64
00:01:52,800 --> 00:01:55,920
It's like a hierarchy of needs that the AI has to fulfill.

65
00:01:55,920 --> 00:01:56,920
OK.

66
00:01:56,920 --> 00:01:59,600
So they're not programming the AI with specific goals.

67
00:01:59,600 --> 00:02:02,000
They're giving it this set of internal desires

68
00:02:02,000 --> 00:02:03,360
that it has to satisfy.

69
00:02:03,360 --> 00:02:04,400
Precisely.

70
00:02:04,400 --> 00:02:08,400
And they focused on 11 key desire dimensions,

71
00:02:08,400 --> 00:02:12,920
things like hunger, thirst, sleepiness, cleanliness,

72
00:02:12,920 --> 00:02:15,400
comfort, even social connection,

73
00:02:15,400 --> 00:02:17,240
and spiritual satisfaction.

74
00:02:17,240 --> 00:02:17,740
OK.

75
00:02:17,740 --> 00:02:18,240
I see.

76
00:02:18,240 --> 00:02:21,240
So it's kind of like the AI has these internal meters

77
00:02:21,240 --> 00:02:22,480
for each of these desires.

78
00:02:22,480 --> 00:02:22,980
Yeah.

79
00:02:22,980 --> 00:02:23,760
Like little gauges.

80
00:02:23,760 --> 00:02:25,680
And it's constantly trying to keep those meters

81
00:02:25,680 --> 00:02:26,680
in a reasonable range, right?

82
00:02:26,680 --> 00:02:27,320
Exactly.

83
00:02:27,320 --> 00:02:30,280
So if the AI's hunger meter gets too low,

84
00:02:30,280 --> 00:02:32,800
it's going to start looking for food just like a real person

85
00:02:32,800 --> 00:02:33,320
would.

86
00:02:33,320 --> 00:02:36,160
So it's making decisions based on its internal state,

87
00:02:36,160 --> 00:02:37,960
not just reacting to commands.

88
00:02:37,960 --> 00:02:39,320
That's the big idea here.

89
00:02:39,320 --> 00:02:40,040
That's wild.

90
00:02:40,040 --> 00:02:40,540
OK.

91
00:02:40,540 --> 00:02:43,000
So walk me through how this actually works.

92
00:02:43,000 --> 00:02:46,480
How does the AI know what to do to satisfy these desires?

93
00:02:46,480 --> 00:02:50,360
So there are two core modules to D2A that work together.

94
00:02:50,360 --> 00:02:52,600
The first one is called the value system.

95
00:02:52,600 --> 00:02:54,880
And think of it like the AI's internal dashboard.

96
00:02:54,880 --> 00:02:56,760
The AI's internal dashboard, got it.

97
00:02:56,760 --> 00:02:58,800
It basically keeps track of all those desire meters

98
00:02:58,800 --> 00:03:01,080
and tells the AI how it's feeling about each one.

99
00:03:01,080 --> 00:03:01,580
OK.

100
00:03:01,580 --> 00:03:02,840
So the value system is keeping tabs

101
00:03:02,840 --> 00:03:04,160
on all those needs and wants.

102
00:03:04,160 --> 00:03:04,840
Right.

103
00:03:04,840 --> 00:03:06,040
What about the second module?

104
00:03:06,040 --> 00:03:09,040
The second module is called the desire-driven planner.

105
00:03:09,040 --> 00:03:10,480
And this is where the magic happens.

106
00:03:10,480 --> 00:03:10,980
OK.

107
00:03:10,980 --> 00:03:12,640
The desire-driven planner, tell me more.

108
00:03:12,640 --> 00:03:15,000
This is where the AI actually makes decisions

109
00:03:15,000 --> 00:03:18,040
about what to do based on the information from the value

110
00:03:18,040 --> 00:03:18,540
system.

111
00:03:18,540 --> 00:03:19,040
Right.

112
00:03:19,040 --> 00:03:21,560
It's like the AI is constantly checking its internal dashboard

113
00:03:21,560 --> 00:03:23,760
and saying, OK, I'm feeling a bit hungry,

114
00:03:23,760 --> 00:03:25,680
I also need some social interaction.

115
00:03:25,680 --> 00:03:29,240
What's the best way to satisfy both of those desires right now?

116
00:03:29,240 --> 00:03:31,280
It's like a constant internal negotiation,

117
00:03:31,280 --> 00:03:33,640
weighing different desires.

118
00:03:33,640 --> 00:03:35,120
That is fascinating.

119
00:03:35,120 --> 00:03:37,120
So they're basically giving the AI a way

120
00:03:37,120 --> 00:03:38,760
to make decisions for itself.

121
00:03:38,760 --> 00:03:39,260
Exactly.

122
00:03:39,260 --> 00:03:41,080
It's all about autonomy.

123
00:03:41,080 --> 00:03:43,960
But how do you test this in the real world,

124
00:03:43,960 --> 00:03:45,640
or at least in a simulated world?

125
00:03:45,640 --> 00:03:48,200
Well, they created this simulated environment.

126
00:03:48,200 --> 00:03:50,000
It's kind of like a virtual dollhouse.

127
00:03:50,000 --> 00:03:50,960
A virtual dollhouse.

128
00:03:50,960 --> 00:03:51,640
I like it.

129
00:03:51,640 --> 00:03:53,960
And they have this AI agent named Alice, who

130
00:03:53,960 --> 00:03:56,280
goes about her daily life in this house.

131
00:03:56,280 --> 00:03:58,640
I'm picturing a little AI character running around,

132
00:03:58,640 --> 00:03:59,920
trying to figure out what to do.

133
00:03:59,920 --> 00:04:00,600
Exactly.

134
00:04:00,600 --> 00:04:03,440
And Alice has to take care of all the same needs

135
00:04:03,440 --> 00:04:04,800
that a real person would.

136
00:04:04,800 --> 00:04:08,080
She needs to eat, sleep, stay clean, you name it.

137
00:04:08,080 --> 00:04:11,680
And all her actions are driven by those 11 desire dimensions.

138
00:04:11,680 --> 00:04:12,800
You got it.

139
00:04:12,800 --> 00:04:15,000
The house is set up with all sorts of different rooms

140
00:04:15,000 --> 00:04:16,760
and objects that she can interact with.

141
00:04:16,760 --> 00:04:19,560
A kitchen with food, a living room with a TV,

142
00:04:19,560 --> 00:04:20,440
you get the idea.

143
00:04:20,440 --> 00:04:23,120
It's like playing the Sims, but with an AI

144
00:04:23,120 --> 00:04:24,200
that's calling the shots.

145
00:04:24,200 --> 00:04:24,680
Yeah.

146
00:04:24,680 --> 00:04:27,320
OK, so how do they know if the AI is actually

147
00:04:27,320 --> 00:04:31,240
doing a good job of satisfying Alice's desires?

148
00:04:31,240 --> 00:04:32,880
Like, how do you measure that?

149
00:04:32,880 --> 00:04:35,360
They use something called a dissatisfaction metric.

150
00:04:35,360 --> 00:04:37,920
It basically tracks how far off Alice is

151
00:04:37,920 --> 00:04:41,560
from feeling completely satisfied in each of her 11

152
00:04:41,560 --> 00:04:42,640
desire dimensions.

153
00:04:42,640 --> 00:04:44,680
So the lower the dissatisfaction score,

154
00:04:44,680 --> 00:04:46,360
the happier and more content she is.

155
00:04:46,360 --> 00:04:47,280
Exactly.

156
00:04:47,280 --> 00:04:49,320
The goal is to keep that dissatisfaction score

157
00:04:49,320 --> 00:04:50,680
as low as possible.

158
00:04:50,680 --> 00:04:51,280
Makes sense.

159
00:04:51,280 --> 00:04:54,200
But they also wanted to see how human-like Alice's behavior

160
00:04:54,200 --> 00:04:54,520
was.

161
00:04:54,520 --> 00:04:56,240
Yeah, because that's the ultimate goal, right?

162
00:04:56,240 --> 00:04:58,720
To make AI that acts more like us.

163
00:04:58,720 --> 00:05:00,040
So how do they measure that?

164
00:05:00,040 --> 00:05:03,000
Well, they compared Alice to other AI agent models

165
00:05:03,000 --> 00:05:04,320
that use different approaches.

166
00:05:04,320 --> 00:05:07,360
They had ReAct, BabyAGI, and LLMob,

167
00:05:07,360 --> 00:05:09,600
all with their own strengths and weaknesses.

168
00:05:09,600 --> 00:05:11,120
So it's like an AI showdown.

169
00:05:11,120 --> 00:05:11,920
Exactly.

170
00:05:11,920 --> 00:05:13,960
And they wanted to see which agent could live the most

171
00:05:13,960 --> 00:05:16,400
realistic and fulfilling virtual life.

172
00:05:16,400 --> 00:05:17,640
OK, I am hooked.

173
00:05:17,640 --> 00:05:19,160
So spill the beans.

174
00:05:19,160 --> 00:05:21,680
How did Alice stack up against the competition?

175
00:05:21,680 --> 00:05:23,240
To really put them to the test, they

176
00:05:23,240 --> 00:05:25,000
designed some specific experiments.

177
00:05:25,000 --> 00:05:27,560
But I think we should dive into those after a quick break.

178
00:05:27,560 --> 00:05:28,240
Sounds good to me.

179
00:05:28,240 --> 00:05:30,200
Let's take a break, and we'll be right back with more

180
00:05:30,200 --> 00:05:36,080
on this fascinating deep dive into desire-driven autonomy.

181
00:05:36,080 --> 00:05:39,880
OK, so we left off talking about these AI showdowns.

182
00:05:39,880 --> 00:05:42,520
How did they actually test these different AI agents?

183
00:05:42,520 --> 00:05:43,080
All right.

184
00:05:43,080 --> 00:05:44,640
So they had these two main experiments.

185
00:05:44,640 --> 00:05:47,920
The first one, they called it the random eight steps experiment.

186
00:05:47,920 --> 00:05:49,360
Random eight steps, catchy name.

187
00:05:49,360 --> 00:05:51,840
Yeah, basically what they did was they let each AI

188
00:05:51,840 --> 00:05:55,000
loose in the simulation for eight steps, eight actions.

189
00:05:55,000 --> 00:05:55,960
Eight steps.

190
00:05:55,960 --> 00:05:58,960
But the catch was they randomized the starting values

191
00:05:58,960 --> 00:06:00,880
for all of Alice's desires.

192
00:06:00,880 --> 00:06:03,960
So it's like each AI was waking up in a different mood.

193
00:06:03,960 --> 00:06:05,920
Maybe one was starving, and another one was like,

194
00:06:05,920 --> 00:06:06,840
I got to talk to someone.

195
00:06:06,840 --> 00:06:07,440
Exactly.

196
00:06:07,440 --> 00:06:10,560
And they wanted to see how well each AI could adapt

197
00:06:10,560 --> 00:06:14,600
to these random desires and figure out how to satisfy them.

198
00:06:14,600 --> 00:06:17,720
OK, so how did they measure how well they were doing?

199
00:06:17,720 --> 00:06:19,720
With the dissatisfaction metric.

200
00:06:19,720 --> 00:06:20,360
Right, right.

201
00:06:20,360 --> 00:06:21,880
The lower the score, the better.

202
00:06:21,880 --> 00:06:23,000
Exactly.

203
00:06:23,000 --> 00:06:26,400
So after those eight steps, they crunched the numbers

204
00:06:26,400 --> 00:06:28,680
and compared those dissatisfaction scores

205
00:06:28,680 --> 00:06:30,960
across all the different AIs.

206
00:06:30,960 --> 00:06:32,680
And drumroll, please.

207
00:06:32,680 --> 00:06:34,320
D2A won.

208
00:06:34,320 --> 00:06:35,880
I knew it.

209
00:06:35,880 --> 00:06:39,040
Yeah, it consistently outperformed all the other models.

210
00:06:39,040 --> 00:06:42,160
So giving Alice that internal set of desires

211
00:06:42,160 --> 00:06:43,400
really made a difference.

212
00:06:43,400 --> 00:06:44,440
It seems like it, yeah.

213
00:06:44,440 --> 00:06:46,400
She was much better at figuring out what she needed

214
00:06:46,400 --> 00:06:48,560
and actually doing something about it.

215
00:06:48,560 --> 00:06:50,320
It's like she had this intuition,

216
00:06:50,320 --> 00:06:53,240
the sense of what would make her feel better,

217
00:06:53,240 --> 00:06:55,520
even when her desires were all over the place.

218
00:06:55,520 --> 00:06:57,200
It's really cool to see it in action.

219
00:06:57,200 --> 00:06:58,720
So that was the random eight steps.

220
00:06:58,720 --> 00:07:00,840
What about the second experiment?

221
00:07:00,840 --> 00:07:03,800
OK, so for this one, they called it the fixed 12 steps

222
00:07:03,800 --> 00:07:04,320
experiment.

223
00:07:04,320 --> 00:07:05,360
Fixed 12 steps, OK.

224
00:07:05,360 --> 00:07:07,440
And in this one, instead of randomizing

225
00:07:07,440 --> 00:07:10,520
those starting desires, they gave all the AIs,

226
00:07:10,520 --> 00:07:13,560
including Alice, the same moderate levels of desire

227
00:07:13,560 --> 00:07:15,640
across all 11 dimensions.

228
00:07:15,640 --> 00:07:17,840
So everyone started off on equal footing.

229
00:07:17,840 --> 00:07:18,440
Exactly.

230
00:07:18,440 --> 00:07:21,360
But they also added a new element to this experiment.

231
00:07:21,360 --> 00:07:24,720
They brought in real humans to play the simulation.

232
00:07:24,720 --> 00:07:26,560
Whoa, humans are in the mix now.

233
00:07:26,560 --> 00:07:28,920
Yeah, so they gave the human players

234
00:07:28,920 --> 00:07:31,840
those same starting desire settings as the AI agents.

235
00:07:31,840 --> 00:07:34,320
And they just said, OK, go about your day.

236
00:07:34,320 --> 00:07:36,840
Choose actions that feel natural to you.

237
00:07:36,840 --> 00:07:37,200
I see.

238
00:07:37,200 --> 00:07:40,080
So they're comparing the AIs to how actual humans would

239
00:07:40,080 --> 00:07:41,560
behave in the same situation.

240
00:07:41,560 --> 00:07:42,240
Exactly.

241
00:07:42,240 --> 00:07:45,240
And you know, the human players, as you'd expect,

242
00:07:45,240 --> 00:07:48,320
they were the best at reducing their dissatisfaction scores.

243
00:07:48,320 --> 00:07:50,360
Well, yeah, they have that human instinct, you know.

244
00:07:50,360 --> 00:07:50,760
Right.

245
00:07:50,760 --> 00:07:52,640
But D2A came really close.

246
00:07:52,640 --> 00:07:55,160
It didn't quite match human-level performance.

247
00:07:55,160 --> 00:07:58,720
But it was definitely closer than any of the other AI models.

248
00:07:58,720 --> 00:08:00,320
That's still super impressive, right?

249
00:08:00,320 --> 00:08:02,480
Even though it's just a simulation,

250
00:08:02,480 --> 00:08:05,760
D2A is showing that it can understand and respond

251
00:08:05,760 --> 00:08:07,840
to those internal desires in a way that's

252
00:08:07,840 --> 00:08:08,800
similar to how we do it.

253
00:08:08,800 --> 00:08:09,160
Yeah.

254
00:08:09,160 --> 00:08:11,120
And that's a big step towards making AI that

255
00:08:11,120 --> 00:08:13,640
feels more relatable, more understandable.

256
00:08:13,640 --> 00:08:16,120
OK, so we've got Alice who's powered by D2A.

257
00:08:16,120 --> 00:08:18,120
And then we've got these other AI models,

258
00:08:18,120 --> 00:08:21,880
ReAct, BabyAGI, and LLMob.

259
00:08:21,880 --> 00:08:24,800
Can we talk a little bit about how those other models actually

260
00:08:24,800 --> 00:08:25,300
work?

261
00:08:25,300 --> 00:08:26,560
Like what are their approaches?

262
00:08:26,560 --> 00:08:29,320
And why didn't they perform as well in these experiments?

263
00:08:29,320 --> 00:08:29,820
Sure.

264
00:08:29,820 --> 00:08:33,080
So each one has its own way of making decisions.

265
00:08:33,080 --> 00:08:34,480
Let's start with ReAct.

266
00:08:34,480 --> 00:08:35,200
OK, ReAct.

267
00:08:35,200 --> 00:08:37,920
It's based on goal reasoning.

268
00:08:37,920 --> 00:08:38,480
Goal reasoning.

269
00:08:38,480 --> 00:08:40,080
So it's good at planning things out.

270
00:08:40,080 --> 00:08:40,560
Yeah.

271
00:08:40,560 --> 00:08:43,400
It's great at logically working through tasks.

272
00:08:43,400 --> 00:08:43,880
OK.

273
00:08:43,880 --> 00:08:46,560
But it's missing that intrinsic motivation piece.

274
00:08:46,560 --> 00:08:48,720
Right, that internal drive that D2A has.

275
00:08:48,720 --> 00:08:49,280
Exactly.

276
00:08:49,280 --> 00:08:50,960
It's like ReAct knows how to do things,

277
00:08:50,960 --> 00:08:52,760
but it doesn't really want to do them.

278
00:08:52,760 --> 00:08:54,800
It's like, I could do this, but why bother?

279
00:08:54,800 --> 00:08:56,920
Yeah, it's just missing that spark.

280
00:08:56,920 --> 00:08:58,520
Then you have BabyAGI.

281
00:08:58,520 --> 00:09:00,160
OK, BabyAGI, what's its deal?

282
00:09:00,160 --> 00:09:03,760
So it's got a system for prioritizing tasks,

283
00:09:03,760 --> 00:09:06,800
but it's really focused on external goals and instructions.

284
00:09:06,800 --> 00:09:08,600
So it's got a super efficient task manager.

285
00:09:08,600 --> 00:09:09,120
Exactly.

286
00:09:09,120 --> 00:09:10,840
But again, it doesn't have that same sense

287
00:09:10,840 --> 00:09:12,600
of internal needs and desires.

288
00:09:12,600 --> 00:09:13,240
OK.

289
00:09:13,240 --> 00:09:15,080
And then finally, there's LLMob.

290
00:09:15,080 --> 00:09:15,880
LLMob, right.

291
00:09:15,880 --> 00:09:17,200
Bring on LLMob.

292
00:09:17,200 --> 00:09:21,440
So it tries to incorporate planning and motivation,

293
00:09:21,440 --> 00:09:24,680
but it's still very much driven by external factors,

294
00:09:24,680 --> 00:09:26,680
not really internal desires.

295
00:09:26,680 --> 00:09:28,640
It's almost like it's following a script,

296
00:09:28,640 --> 00:09:30,840
rather than making its own choices.

297
00:09:30,840 --> 00:09:33,560
It's like an actor who's really good at playing a role,

298
00:09:33,560 --> 00:09:35,640
but doesn't really understand the character's inner

299
00:09:35,640 --> 00:09:36,480
motivations.

300
00:09:36,480 --> 00:09:38,240
A perfect analogy.

301
00:09:38,240 --> 00:09:41,160
And I think that highlights the key difference with D2A.

302
00:09:41,160 --> 00:09:44,200
It's not about telling the AI what to do.

303
00:09:44,200 --> 00:09:47,040
It's about giving it the tools to decide for itself,

304
00:09:47,040 --> 00:09:49,040
to be truly autonomous.

305
00:09:49,040 --> 00:09:50,680
And that's huge.

306
00:09:50,680 --> 00:09:52,320
But I know there are always limitations

307
00:09:52,320 --> 00:09:53,440
with any new research.

308
00:09:53,440 --> 00:09:54,040
Of course.

309
00:09:54,040 --> 00:09:56,600
What are some things that D2A still needs to work on?

310
00:09:56,600 --> 00:09:58,720
Well, one of the main things is that the way

311
00:09:58,720 --> 00:10:02,040
they've modeled desires in D2A, it's still pretty simplified.

312
00:10:02,040 --> 00:10:04,560
Human desires are so complex, they're influenced

313
00:10:04,560 --> 00:10:07,200
by our personal history, our culture,

314
00:10:07,200 --> 00:10:08,600
our interactions with others.

315
00:10:08,600 --> 00:10:10,400
They're trying to capture the entire human experience

316
00:10:10,400 --> 00:10:11,480
in a few lines of code.

317
00:10:11,480 --> 00:10:12,040
Exactly.

318
00:10:12,040 --> 00:10:14,120
And then there's also the number of desire dimensions

319
00:10:14,120 --> 00:10:14,600
they use.

320
00:10:14,600 --> 00:10:14,840
Right.

321
00:10:14,840 --> 00:10:15,720
They have those 11.

322
00:10:15,720 --> 00:10:16,360
11, yeah.

323
00:10:16,360 --> 00:10:17,960
And it's a good start, but it definitely

324
00:10:17,960 --> 00:10:20,200
doesn't cover everything that motivates us.

325
00:10:20,200 --> 00:10:22,320
There's always more to explore.

326
00:10:22,320 --> 00:10:24,840
OK, so we've talked about how D2A works,

327
00:10:24,840 --> 00:10:27,120
the experiments they ran, the limitations.

328
00:10:27,120 --> 00:10:29,600
But what about those specific case studies

329
00:10:29,600 --> 00:10:30,760
that they mentioned in the paper?

330
00:10:30,760 --> 00:10:33,240
Can we dive into those now and see how this all played out

331
00:10:33,240 --> 00:10:33,880
in practice?

332
00:10:33,880 --> 00:10:34,520
Absolutely.

333
00:10:34,520 --> 00:10:37,640
Let's see how Alice navigated those different situations

334
00:10:37,640 --> 00:10:39,640
and what we can learn from that.

335
00:10:39,640 --> 00:10:42,280
OK, case studies, let's get into it.

336
00:10:42,280 --> 00:10:44,720
So where did they put Alice to the test?

337
00:10:44,720 --> 00:10:47,240
They actually looked at two different environments,

338
00:10:47,240 --> 00:10:48,400
indoor and outdoor.

339
00:10:48,400 --> 00:10:49,440
OK, two environments.

340
00:10:49,440 --> 00:10:50,720
Let's start with the indoor one.

341
00:10:50,720 --> 00:10:51,320
All right.

342
00:10:51,320 --> 00:10:54,960
So they gave Alice some specific desires.

343
00:10:54,960 --> 00:10:58,680
They said she was moderately gluttonous

344
00:10:58,680 --> 00:11:01,080
and extremely sociable.

345
00:11:01,080 --> 00:11:03,440
So basically, she was hungry and wanted to hang out with people.

346
00:11:03,440 --> 00:11:06,080
Yeah, basically, they wanted to see how D2A would handle

347
00:11:06,080 --> 00:11:07,120
that combination.

348
00:11:07,120 --> 00:11:07,520
OK.

349
00:11:07,520 --> 00:11:09,120
So they gave her some initial, you know,

350
00:11:09,120 --> 00:11:11,240
like they made her a little bit hungry, a little bit thirsty,

351
00:11:11,240 --> 00:11:13,040
and really craving social interaction.

352
00:11:13,040 --> 00:11:13,600
OK.

353
00:11:13,600 --> 00:11:15,840
And then they just let D2A take over.

354
00:11:15,840 --> 00:11:16,600
So what did she do?

355
00:11:16,600 --> 00:11:18,160
Did she go straight for the snacks?

356
00:11:18,160 --> 00:11:20,960
Actually, her first move was to take a shower.

357
00:11:20,960 --> 00:11:23,880
Yeah, remember, she also had that desire for cleanliness?

358
00:11:23,880 --> 00:11:24,520
Right, right.

359
00:11:24,520 --> 00:11:27,240
So D2A kind of weighed all those desires together.

360
00:11:27,240 --> 00:11:27,600
OK.

361
00:11:27,600 --> 00:11:30,760
And decided that hygiene was the priority at that moment.

362
00:11:30,760 --> 00:11:31,360
Interesting.

363
00:11:31,360 --> 00:11:34,280
So it wasn't just about fulfilling one desire at a time.

364
00:11:34,280 --> 00:11:35,920
It was about finding the best balance.

365
00:11:35,920 --> 00:11:37,040
Exactly.

366
00:11:37,040 --> 00:11:39,880
And after her shower, you know, she did make some breakfast,

367
00:11:39,880 --> 00:11:41,040
got rid of that hunger.

368
00:11:41,040 --> 00:11:42,160
Of course, got to eat.

369
00:11:42,160 --> 00:11:44,560
But then, here's the cool part.

370
00:11:44,560 --> 00:11:46,920
She decided to call a friend.

371
00:11:46,920 --> 00:11:47,880
She wanted to chat.

372
00:11:47,880 --> 00:11:49,360
And think about this.

373
00:11:49,360 --> 00:11:52,200
They didn't program her with any specific instructions

374
00:11:52,200 --> 00:11:53,760
on how to use the phone.

375
00:11:53,760 --> 00:11:54,520
Wait, really?

376
00:11:54,520 --> 00:11:56,400
Yeah, she had to figure that out on her own.

377
00:11:56,400 --> 00:11:59,840
So she just like intuitively knew how to use a phone?

378
00:11:59,840 --> 00:12:02,160
Well, she had that desire to connect with someone.

379
00:12:02,160 --> 00:12:04,200
And based on her knowledge of the environment,

380
00:12:04,200 --> 00:12:06,160
she kind of put two and two together.

381
00:12:06,160 --> 00:12:07,600
Wow, that's impressive.

382
00:12:07,600 --> 00:12:08,960
That's problem solving right there.

383
00:12:08,960 --> 00:12:09,920
It really is.

384
00:12:09,920 --> 00:12:13,520
OK, so indoor, she handled basic needs,

385
00:12:13,520 --> 00:12:14,480
social interaction.

386
00:12:14,480 --> 00:12:15,440
What about outdoors?

387
00:12:15,440 --> 00:12:16,480
What did they do with her there?

388
00:12:16,480 --> 00:12:19,520
OK, for the outdoor one, they put her at a big party

389
00:12:19,520 --> 00:12:20,480
in Central Park.

390
00:12:20,480 --> 00:12:22,160
A party in the park.

391
00:12:22,160 --> 00:12:23,000
Sounds fun.

392
00:12:23,000 --> 00:12:25,400
And for this one, they focused on her desires

393
00:12:25,400 --> 00:12:28,600
for recognition and sense of control.

394
00:12:28,600 --> 00:12:31,720
They said she was extremely reputation-conscious

395
00:12:31,720 --> 00:12:33,360
and possessive.

396
00:12:33,360 --> 00:12:35,560
OK, so now she's got some bigger goals in mind.

397
00:12:35,560 --> 00:12:37,800
It's not just about basic needs anymore.

398
00:12:37,800 --> 00:12:39,040
Exactly.

399
00:12:39,040 --> 00:12:41,680
So how do you think she tried to gain recognition

400
00:12:41,680 --> 00:12:43,680
and control at this party?

401
00:12:43,680 --> 00:12:45,080
Well, if she's reputation-conscious,

402
00:12:45,080 --> 00:12:47,160
I'm guessing she tried to make a good impression on people.

403
00:12:47,160 --> 00:12:48,320
You got it.

404
00:12:48,320 --> 00:12:51,200
She started by socializing, chatting with other

405
00:12:51,200 --> 00:12:53,520
partygoers, trying to stand out.

406
00:12:53,520 --> 00:12:56,520
Then she joined a group discussion,

407
00:12:56,520 --> 00:12:58,600
maybe trying to show off her knowledge a little bit.

408
00:12:58,600 --> 00:12:59,680
Show off those smarts.

409
00:12:59,680 --> 00:13:00,920
Exactly.

410
00:13:00,920 --> 00:13:02,720
She even spent some time journaling,

411
00:13:02,720 --> 00:13:05,280
which in the simulation, that helped

412
00:13:05,280 --> 00:13:07,840
her feel more in control of her thoughts and emotions.

413
00:13:07,840 --> 00:13:09,640
So she was really actively trying

414
00:13:09,640 --> 00:13:12,760
to achieve those goals, not just passively attending

415
00:13:12,760 --> 00:13:13,240
the party.

416
00:13:13,240 --> 00:13:13,740
Right.

417
00:13:13,740 --> 00:13:15,880
It's all about that agency, that drive.

418
00:13:15,880 --> 00:13:17,440
So did it work?

419
00:13:17,440 --> 00:13:18,920
Did she become the life of the party?

420
00:13:18,920 --> 00:13:21,920
Well, maybe not quite the life of the party,

421
00:13:21,920 --> 00:13:24,840
but the researchers observed that her actions were definitely

422
00:13:24,840 --> 00:13:28,360
aligned with those desires for recognition and control.

423
00:13:28,360 --> 00:13:29,560
So she was on the right track.

424
00:13:29,560 --> 00:13:30,240
Definitely.

425
00:13:30,240 --> 00:13:33,840
It was really interesting to see how her behavior changed

426
00:13:33,840 --> 00:13:36,320
based on the context, the environment,

427
00:13:36,320 --> 00:13:38,440
and those specific desires.

428
00:13:38,440 --> 00:13:40,680
It's like watching a virtual character come to life.

429
00:13:40,680 --> 00:13:41,480
It really is.

430
00:13:41,480 --> 00:13:44,080
And it's exciting to think about where this could lead.

431
00:13:44,080 --> 00:13:44,680
Yeah.

432
00:13:44,680 --> 00:13:45,680
Where could this go?

433
00:13:45,680 --> 00:13:47,120
What are the possibilities here?

434
00:13:47,120 --> 00:13:49,480
I mean, imagine video game characters

435
00:13:49,480 --> 00:13:52,760
that feel more real, AI assistants that actually

436
00:13:52,760 --> 00:13:56,080
understand our needs, or even virtual companions that

437
00:13:56,080 --> 00:13:58,520
could provide real emotional support.

438
00:13:58,520 --> 00:13:59,560
That's amazing.

439
00:13:59,560 --> 00:14:02,400
It's like giving AI a heart in a way.

440
00:14:02,400 --> 00:14:04,600
It's about moving beyond just intelligence.

441
00:14:04,600 --> 00:14:08,080
It's about giving them that spark of motivation, that desire

442
00:14:08,080 --> 00:14:10,880
to connect and engage with the world around them.

443
00:14:10,880 --> 00:14:13,000
Well, that's an incredibly fascinating deep dive

444
00:14:13,000 --> 00:14:17,040
into this whole world of desire-driven autonomy, D2A.

445
00:14:17,040 --> 00:14:19,080
It's a really exciting area of research.

446
00:14:19,080 --> 00:14:19,560
Yeah.

447
00:14:19,560 --> 00:14:21,800
Huge thanks to our expert for walking us through this paper.

448
00:14:21,800 --> 00:14:23,520
This is really thought-provoking stuff.

449
00:14:23,520 --> 00:14:24,960
And to all our listeners out there,

450
00:14:24,960 --> 00:14:27,520
keep exploring, keep asking questions,

451
00:14:27,520 --> 00:14:29,000
and keep pushing the boundaries of what's

452
00:14:29,000 --> 00:14:31,760
possible in this amazing world of AI.

453
00:14:31,760 --> 00:14:54,120
We'll see you next time.