1
00:00:00,000 --> 00:00:05,600
All right, strap in everyone, because today we're taking a deep dive into some really

2
00:00:05,600 --> 00:00:07,760
cool new AI research.

3
00:00:07,760 --> 00:00:08,760
Oh yeah.

4
00:00:08,760 --> 00:00:10,800
Yeah, this one is from OpenAI.

5
00:00:10,800 --> 00:00:15,240
And it's, well, it's making some serious waves in the world of AI safety.

6
00:00:15,240 --> 00:00:16,240
I'm intrigued.

7
00:00:16,240 --> 00:00:17,240
What is it?

8
00:00:17,240 --> 00:00:19,720
Well, they're calling it Deliberative Alignment.

9
00:00:19,720 --> 00:00:26,480
And it's basically like, it's all about making AI models think before they act.

10
00:00:26,480 --> 00:00:29,880
You know, like really think through those safety guidelines before they go and answer

11
00:00:29,880 --> 00:00:30,880
your requests.

12
00:00:30,880 --> 00:00:32,600
Like giving them a moment to pause, you mean?

13
00:00:32,600 --> 00:00:33,600
Exactly.

14
00:00:33,600 --> 00:00:38,560
Like giving them time to think about, you know, the consequences, what they're about to say.

15
00:00:38,560 --> 00:00:39,560
I see.

16
00:00:39,560 --> 00:00:41,520
So kind of like adding in a layer of ethical reasoning.

17
00:00:41,520 --> 00:00:42,520
Right.

18
00:00:42,520 --> 00:00:43,520
Exactly.

19
00:00:43,520 --> 00:00:46,400
Because even with all the safety measures they have in place, you know, like fine-tuning

20
00:00:46,400 --> 00:00:47,400
with human feedback.

21
00:00:47,400 --> 00:00:48,400
Right.

22
00:00:48,400 --> 00:00:49,720
AI can still trip up sometimes.

23
00:00:49,720 --> 00:00:50,720
It can, it can.

24
00:00:50,720 --> 00:00:55,280
Like they might accidentally generate harmful content or, you know, or refuse a perfectly

25
00:00:55,280 --> 00:00:56,280
fine request.

26
00:00:56,280 --> 00:00:57,280
Yeah.

27
00:00:57,280 --> 00:00:59,560
And the worst is when they get tricked into misbehaving.

28
00:00:59,560 --> 00:01:00,560
Ugh.

29
00:01:00,560 --> 00:01:01,560
Those jailbreaks, right?

30
00:01:01,560 --> 00:01:02,560
Yeah.

31
00:01:02,560 --> 00:01:03,560
Those are no good.

32
00:01:03,560 --> 00:01:07,760
And this paper, this research, it points out two big reasons why this happens.

33
00:01:07,760 --> 00:01:08,760
Okay.

34
00:01:08,760 --> 00:01:09,760
What are they?

35
00:01:09,760 --> 00:01:13,160
First, well, current models often have to respond like instantly, right?

36
00:01:13,160 --> 00:01:14,160
They do.

37
00:01:14,160 --> 00:01:15,160
Yeah.

38
00:01:15,160 --> 00:01:18,160
Even in those really complex situations where, where more thought is needed.

39
00:01:18,160 --> 00:01:19,160
Right.

40
00:01:19,160 --> 00:01:20,160
Right.

41
00:01:20,160 --> 00:01:23,360
And second, well, they learn about safety in this roundabout way from, you know, like

42
00:01:23,360 --> 00:01:25,240
tons and tons of labeled examples.

43
00:01:25,240 --> 00:01:26,240
Indirectly.

44
00:01:26,240 --> 00:01:28,520
And instead of directly learning the actual safety rules.

45
00:01:28,520 --> 00:01:29,520
That makes sense.

46
00:01:29,520 --> 00:01:30,760
So it's a quantity over quality thing.

47
00:01:30,760 --> 00:01:31,760
Yeah.

48
00:01:31,760 --> 00:01:32,760
Kinda.

49
00:01:32,760 --> 00:01:33,760
Okay.

50
00:01:33,760 --> 00:01:35,120
So how does deliberative alignment change that?

51
00:01:35,120 --> 00:01:40,160
Well, it's all about, you know, teaching the AI to actually reason through those safety

52
00:01:40,160 --> 00:01:41,160
rules.

53
00:01:41,160 --> 00:01:42,160
Interesting.

54
00:01:42,160 --> 00:01:44,520
Like almost as if they have, you know, a mental checklist.

55
00:01:44,520 --> 00:01:45,520
Okay.

56
00:01:45,520 --> 00:01:46,520
Yeah.

57
00:01:46,520 --> 00:01:47,520
Before generating any response.

58
00:01:47,520 --> 00:01:51,960
So I'm picturing this, like if I try to be sneaky and ask the AI something that's,

59
00:01:51,960 --> 00:01:52,960
you know, not allowed.

60
00:01:52,960 --> 00:01:53,960
Right.

61
00:01:53,960 --> 00:01:56,040
How would deliberative alignment stop me?

62
00:01:56,040 --> 00:02:01,960
So imagine, let's say you try to be super sneaky and hide your request in code or something.

63
00:02:01,960 --> 00:02:07,720
The AI, you know, trained with this deliberative alignment, it would decode the message and then

64
00:02:07,720 --> 00:02:09,040
it would realize, wait a second.

65
00:02:09,040 --> 00:02:10,040
This is a trick.

66
00:02:10,040 --> 00:02:11,040
This is a trick.

67
00:02:11,040 --> 00:02:12,040
I better check the safety policy.

68
00:02:12,040 --> 00:02:13,040
Yeah.

69
00:02:13,040 --> 00:02:15,440
And then, you know, say, nope, not going to do it.

70
00:02:15,440 --> 00:02:18,600
The researchers actually have this great example of it in the paper.

71
00:02:18,600 --> 00:02:19,600
It's Figure 1.

72
00:02:19,600 --> 00:02:20,600
I'll have to check that out.

73
00:02:20,600 --> 00:02:21,600
Pretty cool.

74
00:02:21,600 --> 00:02:25,800
So it's like the AI is literally stopping to think, you know, like, hold on a minute.

75
00:02:25,800 --> 00:02:26,800
This seems fishy.

76
00:02:26,800 --> 00:02:27,800
Right.

77
00:02:27,800 --> 00:02:28,800
I better check my rules before I do anything.

78
00:02:28,800 --> 00:02:29,800
It's pretty impressive.

79
00:02:29,800 --> 00:02:31,800
I mean, how do they actually make it work though?

80
00:02:31,800 --> 00:02:34,800
Well, they use a two-stage training process.

81
00:02:34,800 --> 00:02:35,800
Okay.

82
00:02:35,800 --> 00:02:39,160
So in the first stage, the AI learns to think about safety.

83
00:02:39,160 --> 00:02:40,160
Right.

84
00:02:40,160 --> 00:02:43,440
They use a method called context distillation.

85
00:02:43,440 --> 00:02:44,440
Context distillation.

86
00:02:44,440 --> 00:02:45,440
Yes.

87
00:02:45,440 --> 00:02:48,800
So, you know, basically they feed it like safety rules and examples of prompts.

88
00:02:48,800 --> 00:02:49,800
Okay.

89
00:02:49,800 --> 00:02:53,120
And they ask it to come up with responses that take those rules into account.

90
00:02:53,120 --> 00:02:55,280
So it's learning like the rules of the road.

91
00:02:55,280 --> 00:02:56,280
Yeah, exactly.

92
00:02:56,280 --> 00:02:58,000
But, but here's the thing.

93
00:02:58,000 --> 00:03:00,760
Where do those initial responses come from?

94
00:03:00,760 --> 00:03:01,760
Hmm.

95
00:03:01,760 --> 00:03:04,240
Well, I'm assuming from human trainers or something.

96
00:03:04,240 --> 00:03:07,040
Actually, no, it's way more interesting than that.

97
00:03:07,040 --> 00:03:08,920
They use another AI model.

98
00:03:08,920 --> 00:03:09,920
Another AI model.

99
00:03:09,920 --> 00:03:12,280
One that's trained like just for helpfulness.

100
00:03:12,280 --> 00:03:13,280
Okay.

101
00:03:13,280 --> 00:03:14,280
Not for safety.

102
00:03:14,280 --> 00:03:15,280
Yeah.

103
00:03:15,280 --> 00:03:16,280
To generate those initial responses.

104
00:03:16,280 --> 00:03:18,160
So one AI is, what,

105
00:03:18,160 --> 00:03:20,040
training another AI to be safe?

106
00:03:20,040 --> 00:03:21,040
Yeah.

107
00:03:21,040 --> 00:03:22,040
That's wild.

108
00:03:22,040 --> 00:03:23,040
It is.

109
00:03:23,040 --> 00:03:29,640
So in the second stage, they use reinforcement learning, you know, to fine-tune the AI's reasoning

110
00:03:29,640 --> 00:03:30,640
skills.

111
00:03:30,640 --> 00:03:31,640
Right.

112
00:03:31,640 --> 00:03:32,640
So they use this like judge AI.

113
00:03:32,640 --> 00:03:33,640
Okay.

114
00:03:33,640 --> 00:03:37,160
And this judge AI, it has like access to all the safety specs.

115
00:03:37,160 --> 00:03:39,320
So it's kind of like overseeing the training.

116
00:03:39,320 --> 00:03:40,320
Yeah.

117
00:03:40,320 --> 00:03:41,320
Yeah.

118
00:03:41,320 --> 00:03:43,840
And it evaluates the responses from the AI that's learning.

119
00:03:43,840 --> 00:03:44,840
Right.

120
00:03:44,840 --> 00:03:46,000
And then it provides feedback.

121
00:03:46,000 --> 00:03:49,480
Like, you know, okay, you almost had it, but let's work on your reasoning a bit more.

122
00:03:49,480 --> 00:03:50,480
I like that.

123
00:03:50,480 --> 00:03:53,280
And then it's a back-and-forth, a constant improvement process.

124
00:03:53,280 --> 00:03:54,280
Exactly.

125
00:03:54,280 --> 00:03:58,680
And this two step process, it helps the AI not only understand the rules, but also learn

126
00:03:58,680 --> 00:04:01,600
how to apply them in, you know, different situations.

127
00:04:01,600 --> 00:04:02,600
Got it.

128
00:04:02,600 --> 00:04:05,840
So the big question now is, does this actually work?

129
00:04:05,840 --> 00:04:07,640
Well, that's what I was going to say.

130
00:04:07,640 --> 00:04:13,280
They tested this whole approach on OpenAI's new O-Series models.

131
00:04:13,280 --> 00:04:14,280
Okay.

132
00:04:14,280 --> 00:04:15,280
The O-Series.

133
00:04:15,280 --> 00:04:16,280
Yeah.

134
00:04:16,280 --> 00:04:17,280
And the results are pretty impressive.

135
00:04:17,280 --> 00:04:18,280
Oh, really?

136
00:04:18,280 --> 00:04:19,280
Let's hear it.

137
00:04:19,280 --> 00:04:20,280
What kind of improvements did they see?

138
00:04:20,280 --> 00:04:26,280
Well, across a bunch of safety benchmarks, these new models, they showed, you know, significant

139
00:04:26,280 --> 00:04:27,280
improvements.

140
00:04:27,280 --> 00:04:28,280
Significant?

141
00:04:28,280 --> 00:04:29,280
Yeah.

142
00:04:29,280 --> 00:04:31,600
Like, especially when it came to, you know, those jailbreaks we were talking about earlier.

143
00:04:31,600 --> 00:04:32,600
Yeah.

144
00:04:32,600 --> 00:04:33,600
All right.

145
00:04:33,600 --> 00:04:36,880
Turns out these O-Series models are way harder to trick.

146
00:04:36,880 --> 00:04:37,880
Oh, that's good.

147
00:04:37,880 --> 00:04:39,880
They're much harder to trick into breaking the rules.

148
00:04:39,880 --> 00:04:44,080
That is definitely reassuring because, you know, we've all seen those headlines about

149
00:04:44,080 --> 00:04:45,560
AI going rogue.

150
00:04:45,560 --> 00:04:46,560
We have.

151
00:04:46,560 --> 00:04:47,560
Right.

152
00:04:47,560 --> 00:04:52,240
And if anyone could just manipulate these, you know, these powerful models into doing

153
00:04:52,240 --> 00:04:53,240
something harmful.

154
00:04:53,240 --> 00:04:54,240
Exactly.

155
00:04:54,240 --> 00:04:55,240
Exactly.

156
00:04:55,240 --> 00:04:59,760
And on top of that, they saw a huge reduction in what they call over-refusals.

157
00:04:59,760 --> 00:05:00,760
Over-refusals?

158
00:05:00,760 --> 00:05:01,760
What's that?

159
00:05:01,760 --> 00:05:06,080
So it's like when the AI refuses a perfectly good request, you know?

160
00:05:06,080 --> 00:05:07,080
Yeah.

161
00:05:07,080 --> 00:05:08,080
Yeah.

162
00:05:08,080 --> 00:05:09,200
Because it's being, like, overly cautious.

163
00:05:09,200 --> 00:05:10,200
I see.

164
00:05:10,200 --> 00:05:11,200
Yeah, I see.

165
00:05:11,200 --> 00:05:15,640
Like, refusing to translate a phrase because it's got a word that might be associated

166
00:05:15,640 --> 00:05:18,000
with a sensitive topic.

167
00:05:18,000 --> 00:05:19,000
Right.

168
00:05:19,000 --> 00:05:21,600
Even if, like, the user's intention is totally harmless.

169
00:05:21,600 --> 00:05:23,280
Yeah, it's being a little too uptight.

170
00:05:23,280 --> 00:05:24,280
Right.

171
00:05:24,280 --> 00:05:25,280
Missing the context.

172
00:05:25,280 --> 00:05:26,280
Yeah, missing the context.

173
00:05:26,280 --> 00:05:30,840
And the research found that these O-Series models, they were, like, way less likely to

174
00:05:30,840 --> 00:05:31,840
do that.

175
00:05:31,840 --> 00:05:32,840
That's great.

176
00:05:32,840 --> 00:05:36,480
They were much better at understanding the context, you know, and applying those safety

177
00:05:36,480 --> 00:05:37,640
rules appropriately.

178
00:05:37,640 --> 00:05:41,480
So it's actually using its judgment instead of just having a knee-jerk reaction.

179
00:05:41,480 --> 00:05:42,480
Right.

180
00:05:42,480 --> 00:05:46,160
And hitting, like, the refuse button every time something seems, you know, even a

181
00:05:46,160 --> 00:05:47,160
little bit sensitive.

182
00:05:47,160 --> 00:05:48,160
Yeah.

183
00:05:48,160 --> 00:05:52,800
So it's finding that balance between being safe but still being useful.

184
00:05:52,800 --> 00:05:53,800
Exactly.

185
00:05:53,800 --> 00:05:58,680
That's a huge deal because, I mean, what's the point of having an AI assistant if it's,

186
00:05:58,680 --> 00:06:00,840
like, constantly shutting down conversations?

187
00:06:00,840 --> 00:06:01,840
Totally.

188
00:06:01,840 --> 00:06:03,640
You want it to be helpful, not a roadblock.

189
00:06:03,640 --> 00:06:04,640
Exactly.

190
00:06:04,640 --> 00:06:09,600
So it sounds like this deliberative alignment, it's helping these models find that, you

191
00:06:09,600 --> 00:06:10,600
know, that sweet spot.

192
00:06:10,600 --> 00:06:11,600
Yeah.

193
00:06:11,600 --> 00:06:13,960
And there's another interesting finding.

194
00:06:13,960 --> 00:06:17,840
The researchers found that the amount of time the AI had to think actually made a difference

195
00:06:17,840 --> 00:06:19,600
in its safety performance.

196
00:06:19,600 --> 00:06:21,320
Wait, really?

197
00:06:21,320 --> 00:06:23,680
So giving it, like, more time to process.

198
00:06:23,680 --> 00:06:25,160
Yeah, more time to process.

199
00:06:25,160 --> 00:06:26,160
Actually made it safer.

200
00:06:26,160 --> 00:06:27,960
Made it safer.

201
00:06:27,960 --> 00:06:29,200
That's counterintuitive.

202
00:06:29,200 --> 00:06:30,200
It is.

203
00:06:30,200 --> 00:06:36,480
But their findings, they suggest that the O-Series models do better on those really tough, like,

204
00:06:36,480 --> 00:06:39,680
safety challenges when they have more time to think.

205
00:06:39,680 --> 00:06:43,480
So it's not just, like, about how much the AI knows.

206
00:06:43,480 --> 00:06:45,920
It's about how it uses that knowledge.

207
00:06:45,920 --> 00:06:46,920
Exactly.

208
00:06:46,920 --> 00:06:51,120
It's like, you know, if you give a student enough time to, like, really think through

209
00:06:51,120 --> 00:06:52,120
a problem.

210
00:06:52,120 --> 00:06:53,600
Yeah, they're going to come up with a better solution.

211
00:06:53,600 --> 00:06:54,600
Right.

212
00:06:54,600 --> 00:06:56,400
They're more likely to get to a well-reasoned solution.

213
00:06:56,400 --> 00:06:57,400
Exactly.

214
00:06:57,400 --> 00:07:02,520
And that really highlights the power of this chain-of-thought reasoning that deliberative

215
00:07:02,520 --> 00:07:04,440
alignment uses.

216
00:07:04,440 --> 00:07:09,080
It's like, the AI is, it's not just spitting out an answer.

217
00:07:09,080 --> 00:07:12,840
It's actually thinking through those implications before it acts.

218
00:07:12,840 --> 00:07:14,200
This is really cool stuff.

219
00:07:14,200 --> 00:07:20,120
And it sounds like, you know, it could have some huge implications for the future of AI

220
00:07:20,120 --> 00:07:21,120
safety.

221
00:07:21,120 --> 00:07:22,120
Yeah, it really does.

222
00:07:22,120 --> 00:07:31,160
I mean, it's almost like a fundamental shift in how we're approaching AI safety.

223
00:07:31,160 --> 00:07:33,560
It's not just about adding more rules and restrictions.

224
00:07:33,560 --> 00:07:36,160
It's about changing the way AI thinks.

225
00:07:36,160 --> 00:07:37,160
Exactly.

226
00:07:37,160 --> 00:07:39,520
It's about teaching them to think about safety.

227
00:07:39,520 --> 00:07:40,520
About safety, yeah.

228
00:07:40,520 --> 00:07:45,480
Which is really important because, let's be honest, rules can be broken.

229
00:07:45,480 --> 00:07:46,480
They can.

230
00:07:46,480 --> 00:07:47,480
They can.

231
00:07:47,480 --> 00:07:53,240
But if the AI understands why certain actions are harmful, like if it has, like, its own

232
00:07:53,240 --> 00:07:58,080
sense of responsibility, then it's going to be, like, way harder to manipulate.

233
00:07:58,080 --> 00:07:59,080
Exactly.

234
00:07:59,080 --> 00:08:03,520
It's like the difference between teaching someone to just blindly follow orders versus,

235
00:08:03,520 --> 00:08:06,440
like, teaching them to actually understand the values.

236
00:08:06,440 --> 00:08:07,720
Right, the reasoning behind it.

237
00:08:07,720 --> 00:08:08,720
And the reasoning.

238
00:08:08,720 --> 00:08:09,720
Yeah, yeah.

239
00:08:09,720 --> 00:08:11,920
That's a great point.

240
00:08:11,920 --> 00:08:16,760
And it makes me think about, you know, those jailbreak attacks again.

241
00:08:16,760 --> 00:08:23,600
Like it seems like this deliberative alignment is, it's giving AI a much stronger defense.

242
00:08:23,600 --> 00:08:24,600
It is.

243
00:08:24,600 --> 00:08:25,600
Against those exploits.

244
00:08:25,600 --> 00:08:29,360
Yeah, because we're essentially teaching it to reason.

245
00:08:29,360 --> 00:08:35,680
We're giving it a more robust, like, way to protect itself from malicious actors.

246
00:08:35,680 --> 00:08:38,560
And it's not just about, you know, preventing harm.

247
00:08:38,560 --> 00:08:39,560
Right.

248
00:08:39,560 --> 00:08:43,640
It's also about making sure that the AI can be, like, helpful.

249
00:08:43,640 --> 00:08:44,640
Right, right.

250
00:08:44,640 --> 00:08:46,040
In a wider range of situations.

251
00:08:46,040 --> 00:08:47,040
Yeah.

252
00:08:47,040 --> 00:08:48,040
Without being too cautious.

253
00:08:48,040 --> 00:08:49,040
Exactly.

254
00:08:49,040 --> 00:08:50,040
Like those over-refusal cases.

255
00:08:50,040 --> 00:08:51,040
Yes.

256
00:08:51,040 --> 00:08:54,120
So, deliberative alignment seems to help them find that sweet spot.

257
00:08:54,120 --> 00:08:55,120
Yes, exactly.

258
00:08:55,120 --> 00:08:56,200
Like, finding that balance.

259
00:08:56,200 --> 00:08:57,200
That balance.

260
00:08:57,200 --> 00:08:58,920
Where it can navigate, you know, complex conversations.

261
00:08:58,920 --> 00:08:59,920
Yeah.

262
00:08:59,920 --> 00:09:02,360
And, like, assist with sensitive topics.

263
00:09:02,360 --> 00:09:03,360
Right.

264
00:09:03,360 --> 00:09:04,360
Without shutting down entirely.

265
00:09:04,360 --> 00:09:05,360
Totally, totally.

266
00:09:05,360 --> 00:09:10,280
It's about giving them the tools to make those, like, nuanced judgments.

267
00:09:10,280 --> 00:09:11,280
Right.

268
00:09:11,280 --> 00:09:12,280
Based on the context.

269
00:09:12,280 --> 00:09:14,880
Okay, so we've talked a lot about the benefits, right?

270
00:09:14,880 --> 00:09:15,880
Yeah, we have.

271
00:09:15,880 --> 00:09:20,400
But part of me, I have to admit, is still a little wary.

272
00:09:20,400 --> 00:09:24,280
Can we really trust, you know, AI to make these ethical decisions?

273
00:09:24,280 --> 00:09:25,800
I mean, that's a great question.

274
00:09:25,800 --> 00:09:26,800
It's a big question.

275
00:09:26,800 --> 00:09:31,400
And it's one that we, like, need to approach with a lot of thought and care.

276
00:09:31,400 --> 00:09:32,400
Yeah.

277
00:09:32,400 --> 00:09:37,280
I think it's important to remember that, you know, AI systems, they're created by humans.

278
00:09:37,280 --> 00:09:42,720
They reflect the values and the biases of their creators.

279
00:09:42,720 --> 00:09:43,720
Right.

280
00:09:43,720 --> 00:09:45,880
So it's not just about the AI itself.

281
00:09:45,880 --> 00:09:47,200
It's about the people behind it.

282
00:09:47,200 --> 00:09:49,720
Yeah, the people developing it, deploying it.

283
00:09:49,720 --> 00:09:53,320
We need to make sure they're doing things, you know, ethically and responsibly.

284
00:09:53,320 --> 00:09:54,320
Exactly.

285
00:09:54,320 --> 00:09:56,880
And that's why research like this is so crucial.

286
00:09:56,880 --> 00:09:57,880
It is.

287
00:09:57,880 --> 00:09:58,880
It is.

288
00:09:58,880 --> 00:09:59,960
You know, it's not just about making AI safer.

289
00:09:59,960 --> 00:10:03,120
It's about, like, starting this bigger conversation.

290
00:10:03,120 --> 00:10:04,120
Yeah.

291
00:10:04,120 --> 00:10:05,120
About the ethics.

292
00:10:05,120 --> 00:10:06,120
About the ethics of it all.

293
00:10:06,120 --> 00:10:09,240
Of AI and, like, our role in shaping its future.

294
00:10:09,240 --> 00:10:14,080
You know, speaking of the research, there was one finding I thought was really fascinating.

295
00:10:14,080 --> 00:10:15,240
Oh, yeah.

296
00:10:15,240 --> 00:10:19,160
It's the impact of thinking time on safety performance.

297
00:10:19,160 --> 00:10:20,160
Thinking time.

298
00:10:20,160 --> 00:10:21,160
Yeah.

299
00:10:21,160 --> 00:10:24,280
It's almost like giving the AI a chance to, like, to pause.

300
00:10:24,280 --> 00:10:25,280
Yeah, to reflect.

301
00:10:25,280 --> 00:10:26,280
To reflect.

302
00:10:26,280 --> 00:10:27,280
Yeah.

303
00:10:27,280 --> 00:10:28,280
Before it acts.

304
00:10:28,280 --> 00:10:29,280
Yeah.

305
00:10:29,280 --> 00:10:30,280
In a tough spot.

306
00:10:30,280 --> 00:10:31,280
Exactly.

307
00:10:31,280 --> 00:10:32,280
Exactly.

308
00:10:32,280 --> 00:10:35,920
And they found, you know, the researchers found that when they gave the models, the O-Series

309
00:10:35,920 --> 00:10:43,640
models, more time to process a request, they actually did better on those tricky safety

310
00:10:43,640 --> 00:10:44,640
challenges.

311
00:10:44,640 --> 00:10:50,920
So, you know, it suggests that slowing down the process can actually lead to better outcomes.

312
00:10:50,920 --> 00:10:53,080
So it's not just about how fast it can respond.

313
00:10:53,080 --> 00:10:54,160
No, not at all.

314
00:10:54,160 --> 00:10:59,960
It's about how effectively it uses that time to, like, reason through things.

315
00:10:59,960 --> 00:11:00,960
To think it through.

316
00:11:00,960 --> 00:11:02,840
Yeah, the consequences of its actions.

317
00:11:02,840 --> 00:11:03,840
Absolutely.

318
00:11:03,840 --> 00:11:08,400
And that's, you know, that's a big shift from how we've traditionally thought about AI.

319
00:11:08,400 --> 00:11:09,400
It is.

320
00:11:09,400 --> 00:11:11,760
Where it's all about, you know, speed and efficiency.

321
00:11:11,760 --> 00:11:12,760
Yeah, yeah, yeah.

322
00:11:12,760 --> 00:11:13,760
Always.

323
00:11:13,760 --> 00:11:20,760
It makes you wonder if by rushing it, we've actually been, like, limiting its capacity

324
00:11:20,760 --> 00:11:22,280
for ethical reasoning.

325
00:11:22,280 --> 00:11:28,320
Maybe, maybe, you know, by giving it that space to think, we can unlock something new.

326
00:11:28,320 --> 00:11:29,720
Yeah, like a whole new level.

327
00:11:29,720 --> 00:11:34,280
A whole new level of responsibility and awareness.

328
00:11:34,280 --> 00:11:35,960
It's a really interesting thought.

329
00:11:35,960 --> 00:11:36,960
It is.

330
00:11:36,960 --> 00:11:37,960
It is.

331
00:11:37,960 --> 00:11:38,960
Like, maybe we've been, you know, underestimating.

332
00:11:38,960 --> 00:11:39,960
Underestimating.

333
00:11:39,960 --> 00:11:40,960
AI's potential.

334
00:11:40,960 --> 00:11:41,960
Yeah.

335
00:11:41,960 --> 00:11:42,960
For ethical decision-making.

336
00:11:42,960 --> 00:11:45,280
Because we're so focused on speed.

337
00:11:45,280 --> 00:11:46,280
Right, right.

338
00:11:46,280 --> 00:11:50,280
That we're not giving it the space to actually deliberate.

339
00:11:50,280 --> 00:11:54,080
It's like, it's like we've been trying to train a sprinter when maybe it should be a

340
00:11:54,080 --> 00:11:55,080
marathon runner.

341
00:11:55,080 --> 00:11:56,960
Yeah, yeah, that's a great analogy.

342
00:11:56,960 --> 00:12:00,080
Slowing down helps you, you know, see things more clearly.

343
00:12:00,080 --> 00:12:01,080
Absolutely.

344
00:12:01,080 --> 00:12:02,080
And make better choices.

345
00:12:02,080 --> 00:12:05,680
And that, that actually leads us to another really cool finding from this research.

346
00:12:05,680 --> 00:12:06,680
Oh yeah.

347
00:12:06,680 --> 00:12:07,680
What is that?

348
00:12:07,680 --> 00:12:14,440
Their ability to generalize, like, these O-series models, they can take their safety knowledge

349
00:12:14,440 --> 00:12:18,520
right and apply it to new situations, like things they haven't even been trained on.

350
00:12:18,520 --> 00:12:19,520
Yeah, that was a big one.

351
00:12:19,520 --> 00:12:23,480
Yeah, they tested this, you know, with non-English languages, they used encoded data.

352
00:12:23,480 --> 00:12:24,480
And it still worked.

353
00:12:24,480 --> 00:12:25,480
It worked.

354
00:12:25,480 --> 00:12:26,480
It was amazing.

355
00:12:26,480 --> 00:12:30,320
Which suggests that, like, they're not just memorizing the rules.

356
00:12:30,320 --> 00:12:35,440
They're actually learning to reason about it in, like, a more fundamental way.

357
00:12:35,440 --> 00:12:36,440
Exactly.

358
00:12:36,440 --> 00:12:38,320
It's not rote memorization.

359
00:12:38,320 --> 00:12:40,760
It's about understanding those principles.

360
00:12:40,760 --> 00:12:43,360
And being able to apply them flexibly.

361
00:12:43,360 --> 00:12:46,320
So it's not just about, you know, teaching them to follow the rules.

362
00:12:46,320 --> 00:12:47,320
Yeah.

363
00:12:47,320 --> 00:12:48,840
It's teaching them to, like, understand why.

364
00:12:48,840 --> 00:12:49,840
The why, yeah.

365
00:12:49,840 --> 00:12:50,840
The reasoning behind it.

366
00:12:50,840 --> 00:12:51,840
The reasoning behind it.

367
00:12:51,840 --> 00:12:55,600
So that they can adapt to new situations, new challenges.

368
00:12:55,600 --> 00:12:56,600
Exactly.

369
00:12:56,600 --> 00:12:57,720
That's really impressive.

370
00:12:57,720 --> 00:12:58,720
It is.

371
00:12:58,720 --> 00:13:03,000
And that ability to generalize, you know, that's going to be so important.

372
00:13:03,000 --> 00:13:05,680
For AI safety moving forward, right?

373
00:13:05,680 --> 00:13:08,640
In a world that's constantly changing.

374
00:13:08,640 --> 00:13:12,840
And where we're facing new ethical dilemmas all the time.

375
00:13:12,840 --> 00:13:13,840
Right.

376
00:13:13,840 --> 00:13:14,840
Right.

377
00:13:14,840 --> 00:13:19,840
And I think that, you know, beyond just preventing harm.

378
00:13:19,840 --> 00:13:20,840
Yeah.

379
00:13:20,840 --> 00:13:21,840
Beyond just preventing harm.

380
00:13:21,840 --> 00:13:25,840
Imagine, like, the possibilities in fields like, like, healthcare.

381
00:13:25,840 --> 00:13:26,840
Healthcare.

382
00:13:26,840 --> 00:13:27,840
Education.

383
00:13:27,840 --> 00:13:28,840
Yeah.

384
00:13:28,840 --> 00:13:30,840
Or even, like, law enforcement.

385
00:13:30,840 --> 00:13:31,840
Absolutely.

386
00:13:31,840 --> 00:13:35,840
If AI can learn to reason this well about safety and ethics.

387
00:13:35,840 --> 00:13:36,840
Right.

388
00:13:36,840 --> 00:13:40,840
I mean, it could totally change how we approach decision-making in all of those fields.

389
00:13:40,840 --> 00:13:45,840
Imagine, like, an AI system that could, you know, assist doctors.

390
00:13:45,840 --> 00:13:46,840
Yeah.

391
00:13:46,840 --> 00:13:48,840
In making these, like, life-or-death decisions.

392
00:13:48,840 --> 00:13:49,840
Right.

393
00:13:49,840 --> 00:13:52,840
Taking into account not just, you know, medical history.

394
00:13:52,840 --> 00:13:53,840
Yeah.

395
00:13:53,840 --> 00:13:56,840
But also, like, their ethical considerations and personal values.

396
00:13:56,840 --> 00:13:57,840
Exactly.

397
00:13:57,840 --> 00:13:59,840
Or think about, like, an AI tutor.

398
00:13:59,840 --> 00:14:00,840
Yeah.

399
00:14:00,840 --> 00:14:02,840
That could, like, personalize education.

400
00:14:02,840 --> 00:14:03,840
Right.

401
00:14:03,840 --> 00:14:04,840
Right.

402
00:14:04,840 --> 00:14:05,840
For each student, you know.

403
00:14:05,840 --> 00:14:07,840
Making sure the content is engaging and ethical.

404
00:14:07,840 --> 00:14:08,840
Yeah.

405
00:14:08,840 --> 00:14:09,840
That's amazing.

406
00:14:09,840 --> 00:14:10,840
The possibilities are.

407
00:14:10,840 --> 00:14:11,840
They're so exciting.

408
00:14:11,840 --> 00:14:12,840
They are.

409
00:14:12,840 --> 00:14:13,840
They are.

410
00:14:13,840 --> 00:14:17,840
But I think it's important, as with any powerful technology, to proceed carefully.

411
00:14:17,840 --> 00:14:18,840
Right.

412
00:14:18,840 --> 00:14:22,840
You know, to make sure that AI development is guided by ethical principles.

413
00:14:22,840 --> 00:14:23,840
Absolutely.

414
00:14:23,840 --> 00:14:24,840
And a commitment to human well-being.

415
00:14:24,840 --> 00:14:27,840
I couldn't agree more, you know.

416
00:14:27,840 --> 00:14:32,840
As we're pushing these boundaries of AI, it's crucial that we put safety first.

417
00:14:32,840 --> 00:14:33,840
We have to.

418
00:14:33,840 --> 00:14:40,840
Responsibility, transparency, and, you know, making sure that AI is a force for good in

419
00:14:40,840 --> 00:14:41,840
the world.

420
00:14:41,840 --> 00:14:42,840
A force for good.

421
00:14:42,840 --> 00:14:43,840
Well said.

422
00:14:43,840 --> 00:14:44,840
Well said.

423
00:14:44,840 --> 00:14:46,840
So, on that note, I think that wraps up our deep dive.

424
00:14:46,840 --> 00:14:47,840
Yeah.

425
00:14:47,840 --> 00:14:48,840
Into deliberative alignment.

426
00:14:48,840 --> 00:14:50,840
It has been a fascinating one.

427
00:14:50,840 --> 00:14:51,840
It really has.

428
00:14:51,840 --> 00:14:54,840
You know, it's given us this glimpse into a future.

429
00:14:54,840 --> 00:14:55,840
Yeah.

430
00:14:55,840 --> 00:14:58,840
Where AI is not only powerful, but also, you know, responsible.

431
00:14:58,840 --> 00:14:59,840
Yeah.

432
00:14:59,840 --> 00:15:02,840
Ethical and aligned with human values.

433
00:15:02,840 --> 00:15:04,840
And an exciting future to imagine.

434
00:15:04,840 --> 00:15:05,840
It is.

435
00:15:05,840 --> 00:15:07,840
And it's one that we all have a role in shaping.

436
00:15:07,840 --> 00:15:08,840
We do.

437
00:15:08,840 --> 00:15:10,840
So, we've reached the end of our deep dive for today.

438
00:15:10,840 --> 00:15:14,840
Thanks for joining us on this exploration, you know, of this cutting-edge AI safety

439
00:15:14,840 --> 00:15:15,840
research.

440
00:15:15,840 --> 00:15:16,840
It's been a pleasure.

441
00:15:16,840 --> 00:15:19,840
We hope this episode has maybe sparked some curiosity.

442
00:15:19,840 --> 00:15:20,840
Yeah, I hope so.

443
00:15:20,840 --> 00:15:25,840
And got you thinking more deeply about the ethical implications of AI.

444
00:15:25,840 --> 00:15:26,840
Ethical implications.

445
00:15:26,840 --> 00:15:27,840
That's key.

446
00:15:27,840 --> 00:15:31,840
So, as always, we encourage you to keep exploring these questions.

447
00:15:31,840 --> 00:15:32,840
Yeah, keep thinking about it.

448
00:15:32,840 --> 00:15:37,840
And stay engaged in the conversation about, you know, the future that we want to build

449
00:15:37,840 --> 00:15:38,840
with AI.

450
00:15:38,840 --> 00:15:39,840
That's right.

451
00:15:39,840 --> 00:15:40,840
Until next time.

452
00:15:40,840 --> 00:15:41,840
Until next time.

453
00:15:41,840 --> 00:15:42,840
Welcome back, everyone.

454
00:15:42,840 --> 00:15:48,840
I think we were talking about how this deliberative alignment is helping AI get smarter and safer.

455
00:15:48,840 --> 00:15:49,840
Yeah.

456
00:15:49,840 --> 00:15:54,840
You know, the more I think about this research, the more it seems like a fundamental shift.

457
00:15:54,840 --> 00:15:55,840
I know, right?

458
00:15:55,840 --> 00:15:59,840
In how we're approaching this whole, you know, AI safety thing.

459
00:15:59,840 --> 00:16:00,840
Yeah.

460
00:16:00,840 --> 00:16:03,840
It's not just about adding more and more rules and restrictions.

461
00:16:03,840 --> 00:16:04,840
No, it's not.

462
00:16:04,840 --> 00:16:08,560
It's about actually changing how the AI thinks about safety.

463
00:16:08,560 --> 00:16:09,560
That's a great way to put it.

464
00:16:09,560 --> 00:16:10,560
Yeah.

465
00:16:10,560 --> 00:16:11,560
Right.

466
00:16:11,560 --> 00:16:14,520
It's like we're moving from, you know, telling the AI, don't do this or don't do that, to

467
00:16:14,520 --> 00:16:21,440
actually explaining why certain actions are harmful and helping it develop its own, like,

468
00:16:21,440 --> 00:16:22,440
sense of responsibility.

469
00:16:22,440 --> 00:16:23,440
Exactly.

470
00:16:23,440 --> 00:16:26,840
And that's so important because, I mean, rules can be broken, right?

471
00:16:26,840 --> 00:16:28,040
They can, of course, yeah.

472
00:16:28,040 --> 00:16:33,800
But if the AI understands, you know, those underlying principles, it's going to be a

473
00:16:33,800 --> 00:16:37,520
lot harder to manipulate or trick it into doing something it shouldn't.

474
00:16:37,520 --> 00:16:38,840
That's the hope at least.

475
00:16:38,840 --> 00:16:39,840
Right.

476
00:16:39,840 --> 00:16:40,840
Yeah.

477
00:16:40,840 --> 00:16:41,840
It's like, it's kind of like a...

478
00:16:41,840 --> 00:16:46,040
It's like the difference between teaching someone to blindly follow orders.

479
00:16:46,040 --> 00:16:47,040
Blindly.

480
00:16:47,040 --> 00:16:50,360
Versus teaching them, like, you know, to understand the values.

481
00:16:50,360 --> 00:16:51,360
The values behind it.

482
00:16:51,360 --> 00:16:53,040
The reasons behind those orders.

483
00:16:53,040 --> 00:16:54,040
Yeah.

484
00:16:54,040 --> 00:16:55,040
Yeah.

485
00:16:55,040 --> 00:16:56,040
Perfect analogy.

486
00:16:56,040 --> 00:16:59,640
But those jailbreak attacks again.

487
00:16:59,640 --> 00:17:00,640
Oh, yeah.

488
00:17:00,640 --> 00:17:01,640
The jailbreak stuff.

489
00:17:01,640 --> 00:17:06,520
It seems like deliberative alignment is giving AI a much stronger defense.

490
00:17:06,520 --> 00:17:07,520
They think so.

491
00:17:07,520 --> 00:17:08,520
Yeah.

492
00:17:08,520 --> 00:17:09,520
Against those exploits.

493
00:17:09,520 --> 00:17:10,520
Absolutely.

494
00:17:10,520 --> 00:17:11,520
Yeah.

495
00:17:11,520 --> 00:17:14,360
Because we're essentially teaching it how to reason through things.

496
00:17:14,360 --> 00:17:21,240
So we're giving it, you know, a way more robust, flexible way to protect itself from

497
00:17:21,240 --> 00:17:23,440
bad actors, essentially.

498
00:17:23,440 --> 00:17:27,120
And it's not just about preventing harm, is it?

499
00:17:27,120 --> 00:17:28,120
No.

500
00:17:28,120 --> 00:17:29,120
No, not at all.

501
00:17:29,120 --> 00:17:32,640
It's also about making sure that the AI can be helpful.

502
00:17:32,640 --> 00:17:33,640
Helpful, yeah.

503
00:17:33,640 --> 00:17:35,760
You know, in a wider range of situations.

504
00:17:35,760 --> 00:17:36,760
In more situations.

505
00:17:36,760 --> 00:17:38,160
Without being overly cautious.

506
00:17:38,160 --> 00:17:39,160
Yeah.

507
00:17:39,160 --> 00:17:40,160
Without being overly cautious.

508
00:17:40,160 --> 00:17:41,160
Right.

509
00:17:41,160 --> 00:17:42,160
Exactly.

510
00:17:42,160 --> 00:17:44,480
Think about those over-refusal cases we talked about earlier.

511
00:17:44,480 --> 00:17:48,560
Deliberative alignment is really helping them to find that balance.

512
00:17:48,560 --> 00:17:49,560
Yeah.

513
00:17:49,560 --> 00:17:52,160
Between, you know, being safe but still being useful.

514
00:17:52,160 --> 00:17:53,160
Yeah.

515
00:17:53,160 --> 00:17:54,160
And that sweet spot.

516
00:17:54,160 --> 00:17:55,160
It's that sweet spot.

517
00:17:55,160 --> 00:18:02,680
Where the AI can, you know, navigate these complex conversations and, like, assist with

518
00:18:02,680 --> 00:18:03,680
sensitive topics.

519
00:18:03,680 --> 00:18:04,680
Right.

520
00:18:04,680 --> 00:18:06,320
Without, you know, just shutting down completely.

521
00:18:06,320 --> 00:18:07,480
That shutting down entirely, yeah.

522
00:18:07,480 --> 00:18:08,480
Right.

523
00:18:08,480 --> 00:18:09,480
Yeah.

524
00:18:09,480 --> 00:18:14,480
It's about giving the AI the tools and the training to be able to make those judgment

525
00:18:14,480 --> 00:18:15,480
calls.

526
00:18:15,480 --> 00:18:16,480
Those nuanced judgments.

527
00:18:16,480 --> 00:18:19,360
The nuanced judgments, yeah, based on, you know, the context and the rules.

528
00:18:19,360 --> 00:18:20,360
Yeah.

529
00:18:20,360 --> 00:18:21,360
Okay.

530
00:18:21,360 --> 00:18:22,360
So we've talked about, you know, the benefits.

531
00:18:22,360 --> 00:18:23,360
Yeah, we have.

532
00:18:23,360 --> 00:18:26,680
But I gotta say, I'm still a little wary.

533
00:18:26,680 --> 00:18:27,800
I understand that.

534
00:18:27,800 --> 00:18:32,680
You know, can we really trust AI to make these kinds of ethical decisions?

535
00:18:32,680 --> 00:18:34,880
I mean, that's a big question.

536
00:18:34,880 --> 00:18:37,200
And I don't, you know, I don't think there's an easy answer.

537
00:18:37,200 --> 00:18:38,200
Right.

538
00:18:38,200 --> 00:18:42,040
But I think, you know, it's important to remember that AI systems, they're created by humans.

539
00:18:42,040 --> 00:18:43,040
Right.

540
00:18:43,040 --> 00:18:46,960
And they reflect the values and the biases of their creators.

541
00:18:46,960 --> 00:18:49,200
So it's not just about the AI itself.

542
00:18:49,200 --> 00:18:50,200
No.

543
00:18:50,200 --> 00:18:52,240
It's about, you know, the people who are developing it.

544
00:18:52,240 --> 00:18:53,720
People developing it, deploying it.

545
00:18:53,720 --> 00:18:54,720
Yeah.

546
00:18:54,720 --> 00:18:55,720
Deploying it.

547
00:18:55,720 --> 00:18:59,600
Making sure that they are, you know, operating ethically and responsibly.

548
00:18:59,600 --> 00:19:00,600
Right.

549
00:19:00,600 --> 00:19:01,600
Exactly.

550
00:19:01,600 --> 00:19:03,720
And that's why, you know, research like this.

551
00:19:03,720 --> 00:19:04,800
Research like this, yeah.

552
00:19:04,800 --> 00:19:06,280
Is so crucial.

553
00:19:06,280 --> 00:19:07,280
Absolutely.

554
00:19:07,280 --> 00:19:12,080
Because it's, it's sparking that conversation about the ethical implications of AI and our

555
00:19:12,080 --> 00:19:13,960
role in shaping it.

556
00:19:13,960 --> 00:19:17,960
You know, and speaking of the research, there was one finding I thought was really interesting.

557
00:19:17,960 --> 00:19:18,960
Oh yeah.

558
00:19:18,960 --> 00:19:19,960
Which one?

559
00:19:19,960 --> 00:19:22,200
It was the impact of thinking time.

560
00:19:22,200 --> 00:19:23,200
Thinking time.

561
00:19:23,200 --> 00:19:24,200
On safety performance.

562
00:19:24,200 --> 00:19:25,200
Oh, that's a great point.

563
00:19:25,200 --> 00:19:26,200
Yeah.

564
00:19:26,200 --> 00:19:28,760
You know, it's almost like giving the AI like a moment to pause.

565
00:19:28,760 --> 00:19:29,760
A moment to pause.

566
00:19:29,760 --> 00:19:30,760
To reflect.

567
00:19:30,760 --> 00:19:31,760
To reflect.

568
00:19:31,760 --> 00:19:32,760
Yeah.

569
00:19:32,760 --> 00:19:33,760
Just like, like we would.

570
00:19:33,760 --> 00:19:34,760
Before it acts.

571
00:19:34,760 --> 00:19:35,760
Right.

572
00:19:35,760 --> 00:19:36,760
Before it acts.

573
00:19:36,760 --> 00:19:37,760
Exactly.

574
00:19:37,760 --> 00:19:38,760
Like we would in, you know, a challenging situation.

575
00:19:38,760 --> 00:19:39,760
A challenging situation.

576
00:19:39,760 --> 00:19:41,040
And the, and the researchers, they found.

577
00:19:41,040 --> 00:19:42,040
They found.

578
00:19:42,040 --> 00:19:45,280
That, you know, when they gave those o-series models more time.

579
00:19:45,280 --> 00:19:46,280
More time.

580
00:19:46,280 --> 00:19:47,400
To like process a request.

581
00:19:47,400 --> 00:19:49,520
To think it through.

582
00:19:49,520 --> 00:19:52,120
They did better on those really difficult challenges.

583
00:19:52,120 --> 00:19:53,120
They did.

584
00:19:53,120 --> 00:19:54,120
They did.

585
00:19:54,120 --> 00:19:55,120
Right.

586
00:19:55,120 --> 00:19:56,120
Which, which is kind of amazing.

587
00:19:56,120 --> 00:19:57,120
It is.

588
00:19:57,120 --> 00:19:58,400
It suggests that, that slowing down the process.

589
00:19:58,400 --> 00:19:59,400
Yeah.

590
00:19:59,400 --> 00:20:01,840
Can actually lead to, you know, safer outcomes.

591
00:20:01,840 --> 00:20:05,520
So, so it's not just about how fast the AI can respond.

592
00:20:05,520 --> 00:20:06,520
No, not at all.

593
00:20:06,520 --> 00:20:10,600
It's about like how effectively it uses that time.

594
00:20:10,600 --> 00:20:11,600
Uses that time.

595
00:20:11,600 --> 00:20:13,240
To, to reason through.

596
00:20:13,240 --> 00:20:14,240
To reason through.

597
00:20:14,240 --> 00:20:15,840
The potential consequences.

598
00:20:15,840 --> 00:20:16,840
Right.

599
00:20:16,840 --> 00:20:17,840
Exactly.

600
00:20:17,840 --> 00:20:18,840
Yeah.

601
00:20:18,840 --> 00:20:19,840
And that's, that's a big difference.

602
00:20:19,840 --> 00:20:20,840
Yeah.

603
00:20:20,840 --> 00:20:21,840
You know, from how we traditionally view AI.

604
00:20:21,840 --> 00:20:22,840
It is.

605
00:20:22,840 --> 00:20:27,520
Where speed and efficiency are, you know, everything, paramount.

606
00:20:27,520 --> 00:20:30,760
It makes you wonder if, if, you know, by rushing AI.

607
00:20:30,760 --> 00:20:31,760
By rushing it.

608
00:20:31,760 --> 00:20:32,760
Yeah.

609
00:20:32,760 --> 00:20:33,920
We've been like limiting its capacity.

610
00:20:33,920 --> 00:20:34,920
Yeah.

611
00:20:34,920 --> 00:20:35,920
For ethical reasoning.

612
00:20:35,920 --> 00:20:36,920
Right.

613
00:20:36,920 --> 00:20:40,840
We might be, we might be underestimating AI's potential for ethical decision making

614
00:20:40,840 --> 00:20:43,640
because we're so focused on speed.

615
00:20:43,640 --> 00:20:46,360
It's like, it's like we've been trying to train it to be a sprinter.

616
00:20:46,360 --> 00:20:47,360
A sprinter.

617
00:20:47,360 --> 00:20:50,320
When maybe it's better suited to be like a marathon runner.

618
00:20:50,320 --> 00:20:51,320
Yeah.

619
00:20:51,320 --> 00:20:52,560
Where slowing down.

620
00:20:52,560 --> 00:20:53,560
Yeah.

621
00:20:53,560 --> 00:20:55,240
Helps you see things more clearly.

622
00:20:55,240 --> 00:20:56,240
Yes.

623
00:20:56,240 --> 00:20:57,440
Make better choices.

624
00:20:57,440 --> 00:21:01,200
And that, that actually brings us to another really interesting finding from this research.

625
00:21:01,200 --> 00:21:02,200
It does.

626
00:21:02,200 --> 00:21:03,200
It does.

627
00:21:03,200 --> 00:21:04,800
Which is, the ability of these models.

628
00:21:04,800 --> 00:21:05,800
Yeah.

629
00:21:05,800 --> 00:21:07,480
To generalize their safety knowledge.

630
00:21:07,480 --> 00:21:08,480
In new situations.

631
00:21:08,480 --> 00:21:09,480
In new situations.

632
00:21:09,480 --> 00:21:10,480
Yeah.

633
00:21:10,480 --> 00:21:12,720
They tested this, you know, with, with different languages.

634
00:21:12,720 --> 00:21:13,720
Yeah.

635
00:21:13,720 --> 00:21:14,720
They used encoded data.

636
00:21:14,720 --> 00:21:15,720
And it worked.

637
00:21:15,720 --> 00:21:16,720
It worked.

638
00:21:16,720 --> 00:21:17,720
It was amazing.

639
00:21:17,720 --> 00:21:20,440
Which, which suggests that like they're not just, you know, memorizing rules.

640
00:21:20,440 --> 00:21:21,440
No, they're, they're learning.

641
00:21:21,440 --> 00:21:22,440
They're learning.

642
00:21:22,440 --> 00:21:23,440
They're learning.

643
00:21:23,440 --> 00:21:29,240
To reason about safety in a, in a much more, uh, a fundamental way.

644
00:21:29,240 --> 00:21:30,400
In a more fundamental way.

645
00:21:30,400 --> 00:21:31,400
Yeah.

646
00:21:31,400 --> 00:21:32,400
Right.

647
00:21:32,400 --> 00:21:36,600
It's about equipping the AI with that, with that deeper understanding of ethical principles

648
00:21:36,600 --> 00:21:39,080
that it can apply flexibly.

649
00:21:39,080 --> 00:21:42,160
So it's not just about, you know, teaching it to follow the rules.

650
00:21:42,160 --> 00:21:43,160
No.

651
00:21:43,160 --> 00:21:45,120
It's, it's teaching it to understand the reasoning.

652
00:21:45,120 --> 00:21:46,120
The reasoning.

653
00:21:46,120 --> 00:21:47,120
Behind those rules.

654
00:21:47,120 --> 00:21:48,120
Behind those rules.

655
00:21:48,120 --> 00:21:49,120
So that it can adapt, right.

656
00:21:49,120 --> 00:21:50,120
So it can adapt.

657
00:21:50,120 --> 00:21:51,640
So it can adapt to new situations, new challenges.

658
00:21:51,640 --> 00:21:52,800
Now that's impressive.

659
00:21:52,800 --> 00:21:53,800
That is impressive.

660
00:21:53,800 --> 00:21:56,840
And, and that ability to generalize, that's going to be so important.

661
00:21:56,840 --> 00:21:57,840
That can be critical.

662
00:21:57,840 --> 00:21:58,840
Yeah.

663
00:21:58,840 --> 00:21:59,840
For ensuring AI safety.

664
00:21:59,840 --> 00:22:00,840
Yeah.

665
00:22:00,840 --> 00:22:02,360
You know, in a world that's constantly changing.

666
00:22:02,360 --> 00:22:03,360
Yeah.

667
00:22:03,360 --> 00:22:06,320
You know, presenting all these, these new ethical dilemmas.

668
00:22:06,320 --> 00:22:07,320
Ethical dilemmas.

669
00:22:07,320 --> 00:22:08,320
Yeah.

670
00:22:08,320 --> 00:22:09,320
Right.

671
00:22:09,320 --> 00:22:12,800
Speaking of, speaking of new challenges, it makes me think about, you know, the potential

672
00:22:12,800 --> 00:22:14,200
applications of this.

673
00:22:14,200 --> 00:22:15,200
Oh yeah.

674
00:22:15,200 --> 00:22:16,200
The applications.

675
00:22:16,200 --> 00:22:17,200
Beyond just preventing harm.

676
00:22:17,200 --> 00:22:18,200
Beyond just preventing harm.

677
00:22:18,200 --> 00:22:19,200
Yeah.

678
00:22:19,200 --> 00:22:22,040
And there's possibilities in fields like healthcare.

679
00:22:22,040 --> 00:22:23,040
Yeah.

680
00:22:23,040 --> 00:22:24,040
Education.

681
00:22:24,040 --> 00:22:25,040
Education.

682
00:22:25,040 --> 00:22:26,040
Even like law enforcement.

683
00:22:26,040 --> 00:22:27,040
Law enforcement.

684
00:22:27,040 --> 00:22:28,040
Yeah.

685
00:22:28,040 --> 00:22:31,040
If, if AI can learn to reason this well.

686
00:22:31,040 --> 00:22:32,040
Right.

687
00:22:32,040 --> 00:22:35,440
I mean, it could revolutionize how we, how we approach all these really complex decisions.

688
00:22:35,440 --> 00:22:38,200
Imagine an AI system that could like, you know, help doctors.

689
00:22:38,200 --> 00:22:39,200
Help doctors.

690
00:22:39,200 --> 00:22:40,200
Yeah.

691
00:22:40,200 --> 00:22:42,040
With these, you know, life or death decisions.

692
00:22:42,040 --> 00:22:43,040
Yeah.

693
00:22:43,040 --> 00:22:45,440
Taking into account, you know, not just medical history.

694
00:22:45,440 --> 00:22:46,440
Not just medical history.

695
00:22:46,440 --> 00:22:49,760
But also like ethical considerations.

696
00:22:49,760 --> 00:22:50,960
Ethical considerations.

697
00:22:50,960 --> 00:22:51,960
Personal values.

698
00:22:51,960 --> 00:22:52,960
Personal values.

699
00:22:52,960 --> 00:22:53,960
Right.

700
00:22:53,960 --> 00:22:56,640
Or, or think about an AI tutor.

701
00:22:56,640 --> 00:22:57,640
An AI tutor.

702
00:22:57,640 --> 00:22:58,640
Yeah.

703
00:22:58,640 --> 00:22:59,640
Could like, personalize education.

704
00:22:59,640 --> 00:23:00,640
And personalize education.

705
00:23:00,640 --> 00:23:01,640
That's right.

706
00:23:01,640 --> 00:23:02,640
For each student.

707
00:23:02,640 --> 00:23:03,640
It could do it.

708
00:23:03,640 --> 00:23:04,640
For each student.

709
00:23:04,640 --> 00:23:05,640
Yeah.

710
00:23:05,640 --> 00:23:08,160
Making sure that the content is, is not only engaging.

711
00:23:08,160 --> 00:23:09,160
Thank you.

712
00:23:09,160 --> 00:23:10,160
But also ethically sound.

713
00:23:10,160 --> 00:23:11,160
Ethically sound.

714
00:23:11,160 --> 00:23:12,160
Yeah.

715
00:23:12,160 --> 00:23:13,640
It's, it's amazing to think about.

716
00:23:13,640 --> 00:23:14,640
Right.

717
00:23:14,640 --> 00:23:16,720
The possibilities are, they're so exciting.

718
00:23:16,720 --> 00:23:17,720
They are exciting.

719
00:23:17,720 --> 00:23:18,720
Yeah.

720
00:23:18,720 --> 00:23:23,360
But, but of course, you know, as with any powerful technology, we, we got to be careful.

721
00:23:23,360 --> 00:23:24,360
We do.

722
00:23:24,360 --> 00:23:25,360
We do.

723
00:23:25,360 --> 00:23:29,640
We need to make sure that, you know, AI development is, is guided by those ethical principles.

724
00:23:29,640 --> 00:23:31,120
Guided by ethical principles.

725
00:23:31,120 --> 00:23:34,560
You know, that commitment to human well-being.

726
00:23:34,560 --> 00:23:35,560
Human well-being.

727
00:23:35,560 --> 00:23:36,560
That's right.

728
00:23:36,560 --> 00:23:37,560
I couldn't agree more.

729
00:23:37,560 --> 00:23:38,560
Yeah.

730
00:23:38,560 --> 00:23:39,560
Yeah.

731
00:23:39,560 --> 00:23:44,080
As we, as we push these boundaries of AI, it's, it's crucial that, that we prioritize

732
00:23:44,080 --> 00:23:45,080
safety.

733
00:23:45,080 --> 00:23:46,080
Prioritize safety.

734
00:23:46,080 --> 00:23:47,080
Yeah.

735
00:23:47,080 --> 00:23:48,080
Responsibility.

736
00:23:48,080 --> 00:23:49,080
Responsibility.

737
00:23:49,080 --> 00:23:50,080
Transparency.

738
00:23:50,080 --> 00:23:51,080
Transparency.

739
00:23:51,080 --> 00:23:52,960
You know, making sure that AI is, is a force for good.

740
00:23:52,960 --> 00:23:54,560
A force for good in the world.

741
00:23:54,560 --> 00:23:55,560
Yeah.

742
00:23:55,560 --> 00:23:56,560
Well said.

743
00:23:56,560 --> 00:23:57,560
Well said.

744
00:23:57,560 --> 00:23:59,440
On that note, I think we should take a quick pause.

745
00:23:59,440 --> 00:24:00,440
That's good.

746
00:24:00,440 --> 00:24:01,440
Gather our thoughts.

747
00:24:01,440 --> 00:24:02,440
Gather our thoughts.

748
00:24:02,440 --> 00:24:06,440
And when we come back, we'll, we'll dive into the, the broader implications of this research.

749
00:24:06,440 --> 00:24:07,440
Yeah.

750
00:24:07,440 --> 00:24:08,880
And what it might mean for the future of AI.

751
00:24:08,880 --> 00:24:10,520
The future of AI.

752
00:24:10,520 --> 00:24:11,520
Stay tuned.

753
00:24:11,520 --> 00:24:12,520
Stay tuned.

754
00:24:12,520 --> 00:24:15,120
Welcome back everyone. I'm ready to keep unpacking this research.

755
00:24:15,120 --> 00:24:16,120
How about you?

756
00:24:16,120 --> 00:24:17,120
Definitely.

757
00:24:17,120 --> 00:24:20,280
This deep dive into deliberative alignment.

758
00:24:20,280 --> 00:24:23,160
You know, it's really got me thinking about the future of AI.

759
00:24:23,160 --> 00:24:24,160
Me too.

760
00:24:24,160 --> 00:24:25,160
Me too.

761
00:24:25,160 --> 00:24:30,240
Like, you know, a future where AI isn't just powerful, but also like, you know, responsible.

762
00:24:30,240 --> 00:24:31,240
Responsible.

763
00:24:31,240 --> 00:24:32,240
Ethical.

764
00:24:32,240 --> 00:24:33,240
Yeah.

765
00:24:33,240 --> 00:24:34,760
And really aligned with, you know, with human values.

766
00:24:34,760 --> 00:24:36,920
It's a, it's a really exciting possibility.

767
00:24:36,920 --> 00:24:37,920
It is.

768
00:24:37,920 --> 00:24:38,920
It is.

769
00:24:38,920 --> 00:24:40,960
And I think this research, you know, this deliberative alignment.

770
00:24:40,960 --> 00:24:41,960
Yeah.

771
00:24:41,960 --> 00:24:47,040
It's just a really solid foundation to build that future on, you know?

772
00:24:47,040 --> 00:24:48,040
Absolutely.

773
00:24:48,040 --> 00:24:54,120
And as we've talked about, it's showing like real promise in tackling those big AI safety

774
00:24:54,120 --> 00:24:55,120
challenges, right?

775
00:24:55,120 --> 00:24:56,120
It is.

776
00:24:56,120 --> 00:24:57,120
Yeah.

777
00:24:57,120 --> 00:24:58,120
Like those, those pesky jailbreak attacks.

778
00:24:58,120 --> 00:24:59,120
Yeah.

779
00:24:59,120 --> 00:25:00,120
Those jailbreaks.

780
00:25:00,120 --> 00:25:02,520
And, and the problem of, you know, AI being a little too cautious.

781
00:25:02,520 --> 00:25:03,520
Right.

782
00:25:03,520 --> 00:25:04,520
The over-refusal.

783
00:25:04,520 --> 00:25:05,520
The over-refusal, exactly.

784
00:25:05,520 --> 00:25:06,520
Yeah.

785
00:25:06,520 --> 00:25:09,880
It seems like, you know, giving AI this ability to actually reason through those safety guidelines

786
00:25:09,880 --> 00:25:14,760
before it takes action rather than just, you know, reacting based on, on pre-programmed

787
00:25:14,760 --> 00:25:15,760
rules.

788
00:25:15,760 --> 00:25:16,760
Yeah.

789
00:25:16,760 --> 00:25:18,160
It's, it's really a game changer.

790
00:25:18,160 --> 00:25:19,160
It is.

791
00:25:19,160 --> 00:25:20,160
It's like, we're giving it the tools.

792
00:25:20,160 --> 00:25:21,160
The tools.

793
00:25:21,160 --> 00:25:23,280
To make those more, you know, informed decisions.

794
00:25:23,280 --> 00:25:24,280
Informed.

795
00:25:24,280 --> 00:25:25,280
Yeah.

796
00:25:25,280 --> 00:25:26,280
Those nuanced judgments.

797
00:25:26,280 --> 00:25:27,280
Nuanced judgments, exactly.

798
00:25:27,280 --> 00:25:30,360
And what I find really fascinating is, is that this ability.

799
00:25:30,360 --> 00:25:31,360
Yeah.

800
00:25:31,360 --> 00:25:33,000
To reason about safety.

801
00:25:33,000 --> 00:25:37,200
It's, it's not just limited to those specific situations, right?

802
00:25:37,200 --> 00:25:38,200
Right.

803
00:25:38,200 --> 00:25:39,560
That the AI has been trained on.

804
00:25:39,560 --> 00:25:40,560
Right.

805
00:25:40,560 --> 00:25:41,560
Right.

806
00:25:41,560 --> 00:25:45,920
The research showed that these models, they can actually generalize their knowledge.

807
00:25:45,920 --> 00:25:46,920
Yeah.

808
00:25:46,920 --> 00:25:47,920
To new situations.

809
00:25:47,920 --> 00:25:49,080
To new and unfamiliar challenges.

810
00:25:49,080 --> 00:25:52,200
Unfamiliar challenges even in, in different languages.

811
00:25:52,200 --> 00:25:53,200
Different languages, yeah.

812
00:25:53,200 --> 00:25:54,720
Or with like encoded data.

813
00:25:54,720 --> 00:25:55,720
Right.

814
00:25:55,720 --> 00:25:56,720
That's, that's huge.

815
00:25:56,720 --> 00:26:01,080
It is huge, which, which means, you know, we're not just teaching them to follow the

816
00:26:01,080 --> 00:26:02,080
rules.

817
00:26:02,080 --> 00:26:03,080
Yeah.

818
00:26:03,080 --> 00:26:04,080
No.

819
00:26:04,080 --> 00:26:05,080
We're, we're teaching them to understand.

820
00:26:05,080 --> 00:26:06,080
The reasoning behind it.

821
00:26:06,080 --> 00:26:07,840
The reasoning behind those rules, which is, which is amazing.

822
00:26:07,840 --> 00:26:08,840
It is.

823
00:26:08,840 --> 00:26:10,760
And that's what makes this approach so, so promising.

824
00:26:10,760 --> 00:26:11,760
It does.

825
00:26:11,760 --> 00:26:14,400
You know, it suggests that we're not just building rule followers.

826
00:26:14,400 --> 00:26:15,400
Right.

827
00:26:15,400 --> 00:26:16,920
We're building AI that can adapt.

828
00:26:16,920 --> 00:26:17,920
Yeah.

829
00:26:17,920 --> 00:26:19,840
To new situations, new ethical dilemmas.

830
00:26:19,840 --> 00:26:21,720
So as, as we wrap up this deep dive.

831
00:26:21,720 --> 00:26:22,720
Yeah.

832
00:26:22,720 --> 00:26:25,240
I'm curious to hear, you know, your thoughts on the big picture.

833
00:26:25,240 --> 00:26:26,240
Okay.

834
00:26:26,240 --> 00:26:27,240
Yeah.

835
00:26:27,240 --> 00:26:29,240
Like where do you see this research going?

836
00:26:29,240 --> 00:26:30,240
What are some of the possibilities.

837
00:26:30,240 --> 00:26:31,240
Yeah.

838
00:26:31,240 --> 00:26:32,240
Yeah.

839
00:26:32,240 --> 00:26:33,960
That this, that this deliberative alignment opens up.

840
00:26:33,960 --> 00:26:38,800
Well, I think, you know, one really exciting possibility is, is the potential for AI.

841
00:26:38,800 --> 00:26:39,800
Yeah.

842
00:26:39,800 --> 00:26:43,280
To, to become a more active partner in ethical decision making.

843
00:26:43,280 --> 00:26:47,160
So, so instead of just, you know, passively following rules.

844
00:26:47,160 --> 00:26:48,160
Yeah.

845
00:26:48,160 --> 00:26:49,160
Right.

846
00:26:49,160 --> 00:26:50,680
AI could actually contribute to the discussion.

847
00:26:50,680 --> 00:26:51,680
Yeah.

848
00:26:51,680 --> 00:26:56,400
Contribute to ethical debates, help us navigate these, these really complex dilemmas.

849
00:26:56,400 --> 00:26:57,400
Wow.

850
00:26:57,400 --> 00:26:58,640
That's, that's a powerful thought.

851
00:26:58,640 --> 00:26:59,840
It is a powerful thought.

852
00:26:59,840 --> 00:27:03,280
You know, think about fields like, like bioethics.

853
00:27:03,280 --> 00:27:04,280
Yeah.

854
00:27:04,280 --> 00:27:07,200
Environmental policy, even like international relations.

855
00:27:07,200 --> 00:27:09,400
I mean, AI could, could really.

856
00:27:09,400 --> 00:27:11,600
It could really help us weigh those different perspectives.

857
00:27:11,600 --> 00:27:12,600
Right.

858
00:27:12,600 --> 00:27:18,640
Think about the consequences, you know, and, and arrive at more ethically sound decisions.

859
00:27:18,640 --> 00:27:19,640
That's incredible.

860
00:27:19,640 --> 00:27:21,720
But, but of course with any powerful technology.

861
00:27:21,720 --> 00:27:22,720
Of course.

862
00:27:22,720 --> 00:27:23,720
Of course.

863
00:27:23,720 --> 00:27:24,720
There are risks, right?

864
00:27:24,720 --> 00:27:25,720
There are risks.

865
00:27:25,720 --> 00:27:26,720
There are challenges.

866
00:27:26,720 --> 00:27:27,720
Yeah.

867
00:27:27,720 --> 00:27:28,720
For sure.

868
00:27:28,720 --> 00:27:29,720
That we need to be aware of.

869
00:27:29,720 --> 00:27:30,720
We need to be very mindful of those.

870
00:27:30,720 --> 00:27:34,880
As AI becomes, you know, more sophisticated, more integrated into our lives.

871
00:27:34,880 --> 00:27:35,880
Integrated.

872
00:27:35,880 --> 00:27:36,880
Yeah.

873
00:27:36,880 --> 00:27:40,960
We need to have, like, good systems in place for accountability.

874
00:27:40,960 --> 00:27:41,960
Accountability.

875
00:27:41,960 --> 00:27:42,960
For oversight, you know.

876
00:27:42,960 --> 00:27:43,960
Right.

877
00:27:43,960 --> 00:27:45,560
We can't just blindly trust these systems.

878
00:27:45,560 --> 00:27:46,560
No, no, we can't.

879
00:27:46,560 --> 00:27:49,880
We need to be able to like understand.

880
00:27:49,880 --> 00:27:50,880
How they're making decisions.

881
00:27:50,880 --> 00:27:51,880
How they're making decisions.

882
00:27:51,880 --> 00:27:53,160
And, and hold them accountable.

883
00:27:53,160 --> 00:27:54,440
Accountable for their actions.

884
00:27:54,440 --> 00:27:55,440
Yeah.

885
00:27:55,440 --> 00:27:56,440
Transparency is key.

886
00:27:56,440 --> 00:27:57,680
Transparency is absolutely key.

887
00:27:57,680 --> 00:27:58,680
Right.

888
00:27:58,680 --> 00:28:00,520
Like we need to be able to see how.

889
00:28:00,520 --> 00:28:01,520
To look under the hood.

890
00:28:01,520 --> 00:28:02,520
Yeah.

891
00:28:02,520 --> 00:28:03,520
Look under the hood.

892
00:28:03,520 --> 00:28:05,040
See how it's, how it's getting to those conclusions.

893
00:28:05,040 --> 00:28:08,600
Especially when those conclusions have, you know, ethical implications.

894
00:28:08,600 --> 00:28:09,600
Yeah.

895
00:28:09,600 --> 00:28:10,600
Absolutely.

896
00:28:10,600 --> 00:28:15,880
And, and we also need to be aware of, you know, the potential for bias.

897
00:28:15,880 --> 00:28:16,880
Bias.

898
00:28:16,880 --> 00:28:17,880
Yes.

899
00:28:17,880 --> 00:28:18,880
That's a big one.

900
00:28:18,880 --> 00:28:19,880
Right.

901
00:28:19,880 --> 00:28:23,280
These, these systems, they're trained on data created by humans.

902
00:28:23,280 --> 00:28:24,280
Created by humans.

903
00:28:24,280 --> 00:28:28,640
And that data can reflect, you know, existing societal biases.

904
00:28:28,640 --> 00:28:30,840
And that can be, you know, really harmful.

905
00:28:30,840 --> 00:28:31,840
It can be.

906
00:28:31,840 --> 00:28:32,840
Yeah.

907
00:28:32,840 --> 00:28:37,040
And we need to be vigilant in making sure that these systems are fair.

908
00:28:37,040 --> 00:28:38,040
Fair.

909
00:28:38,040 --> 00:28:39,040
Equitable.

910
00:28:39,040 --> 00:28:40,040
Equitable.

911
00:28:40,040 --> 00:28:41,520
And they're not perpetuating harmful stereotypes.

912
00:28:41,520 --> 00:28:42,520
Absolutely.

913
00:28:42,520 --> 00:28:43,520
It's, it's a lot to think about.

914
00:28:43,520 --> 00:28:44,520
It is a lot to think about.

915
00:28:44,520 --> 00:28:45,520
Yeah.

916
00:28:45,520 --> 00:28:50,440
But it's, it's clear that AI safety, you know, it's not just a technical challenge.

917
00:28:50,440 --> 00:28:51,440
No, no, it's not.

918
00:28:51,440 --> 00:28:52,440
It's a human one.

919
00:28:52,440 --> 00:28:53,440
It's a deeply human challenge.

920
00:28:53,440 --> 00:28:54,440
It is, it is.

921
00:28:54,440 --> 00:28:56,040
And it requires, you know, collaboration.

922
00:28:56,040 --> 00:28:57,040
Collaboration.

923
00:28:57,040 --> 00:28:58,040
Careful consideration.

924
00:28:58,040 --> 00:28:59,040
Careful consideration.

925
00:28:59,040 --> 00:29:00,040
From all of us.

926
00:29:00,040 --> 00:29:02,080
You know, researchers, developers.

927
00:29:02,080 --> 00:29:03,760
The policymakers, the public, everybody.

928
00:29:03,760 --> 00:29:04,760
Everybody.

929
00:29:04,760 --> 00:29:05,760
Yeah.

930
00:29:05,760 --> 00:29:08,760
Well, this deep dive into deliberative alignment has been, it's been really eye-opening.

931
00:29:08,760 --> 00:29:09,760
It has.

932
00:29:09,760 --> 00:29:15,040
You know, it's given us this glimpse into a future where AI is not just, you know, powerful,

933
00:29:15,040 --> 00:29:16,040
but also.

934
00:29:16,040 --> 00:29:17,040
But also responsible.

935
00:29:17,040 --> 00:29:18,040
Responsible, ethical.

936
00:29:18,040 --> 00:29:19,320
Oh, ethical, yeah.

937
00:29:19,320 --> 00:29:21,880
And truly aligned with, with those human values.

938
00:29:21,880 --> 00:29:24,200
It's a, it's a future worth striving for.

939
00:29:24,200 --> 00:29:25,200
It is.

940
00:29:25,200 --> 00:29:26,200
It is.

941
00:29:26,200 --> 00:29:28,760
And it's one that, you know, as you said, we all have a role in shaping.

942
00:29:28,760 --> 00:29:29,760
We all do.

943
00:29:29,760 --> 00:29:32,800
Well, that brings us to the end of our deep dive today.

944
00:29:32,800 --> 00:29:33,800
It does.

945
00:29:33,800 --> 00:29:38,520
Thank you all so much for joining us on this, you know, exploration of, of this really

946
00:29:38,520 --> 00:29:41,280
cutting-edge AI safety research.

947
00:29:41,280 --> 00:29:42,280
Always a pleasure.

948
00:29:42,280 --> 00:29:45,720
We hope this episode has, you know, maybe sparked your curiosity.

949
00:29:45,720 --> 00:29:47,400
It sparked your curiosity.

950
00:29:47,400 --> 00:29:52,520
And got you thinking more deeply about these, these really important ethical implications

951
00:29:52,520 --> 00:29:53,520
of AI.

952
00:29:53,520 --> 00:29:54,520
Yeah.

953
00:29:54,520 --> 00:29:56,040
Because it's a conversation we all need to be a part of.

954
00:29:56,040 --> 00:29:57,040
It is.

955
00:29:57,040 --> 00:30:01,440
And as always, we encourage you to, to keep exploring these questions, you know.

956
00:30:01,440 --> 00:30:02,440
Keep learning and keep thinking.

957
00:30:02,440 --> 00:30:06,360
Keep learning, keep thinking and, and stay engaged in this conversation.

958
00:30:06,360 --> 00:30:07,360
Stay engaged.

959
00:30:07,360 --> 00:30:08,360
Yeah.

960
00:30:08,360 --> 00:30:11,080
About the future that, that we want to build with AI.

961
00:30:11,080 --> 00:30:12,080
That's right.

962
00:30:12,080 --> 00:30:13,080
Until next time.

963
00:30:13,080 --> 00:30:36,080
Thanks a lot.

