1
00:00:00,000 --> 00:00:02,820
Hey everyone, and welcome back. Ready for another deep dive?

2
00:00:02,820 --> 00:00:03,740
Always.

3
00:00:03,740 --> 00:00:04,580
Awesome.

4
00:00:04,580 --> 00:00:07,500
So today we're looking into something pretty amazing

5
00:00:07,500 --> 00:00:08,660
in the world of AI.

6
00:00:09,660 --> 00:00:11,860
You know how we always hear about bigger and better AI models?

7
00:00:11,860 --> 00:00:12,700
Right.

8
00:00:12,700 --> 00:00:17,700
Well, what if I told you there's this new law emerging

9
00:00:19,060 --> 00:00:22,820
that's all about making those models smarter and smaller.

10
00:00:22,820 --> 00:00:24,020
Interesting.

11
00:00:24,020 --> 00:00:25,220
At the same time.

12
00:00:25,220 --> 00:00:26,060
Yeah.

13
00:00:26,060 --> 00:00:27,860
It's called the Densing Law.

14
00:00:27,860 --> 00:00:31,500
And it's gonna like change how we think about AI,

15
00:00:31,500 --> 00:00:33,140
especially when we're talking about using it

16
00:00:33,140 --> 00:00:35,300
on everyday devices like our phones.

17
00:00:35,300 --> 00:00:36,140
Right.

18
00:00:36,140 --> 00:00:37,340
So pretty interesting stuff.

19
00:00:37,340 --> 00:00:38,500
We're gonna be checking out the paper,

20
00:00:38,500 --> 00:00:41,260
Densing Law of LLMs today.

21
00:00:41,260 --> 00:00:42,940
Sounds good.

22
00:00:42,940 --> 00:00:46,300
So this paper really tackles that tension

23
00:00:46,300 --> 00:00:49,660
between like wanting really powerful AI

24
00:00:49,660 --> 00:00:51,340
but needing it to run efficiently.

25
00:00:51,340 --> 00:00:52,180
Yeah.

26
00:00:52,180 --> 00:00:53,020
On something like a smartphone.

27
00:00:53,020 --> 00:00:53,860
Exactly.

28
00:00:53,860 --> 00:00:55,740
We're talking about a potential future

29
00:00:55,740 --> 00:00:57,980
where your phone has the brain power

30
00:00:57,980 --> 00:01:00,740
of some of today's most advanced AI systems

31
00:01:00,740 --> 00:01:03,820
without needing like a supercomputer to back it up.

32
00:01:03,820 --> 00:01:04,660
Exactly.

33
00:01:04,660 --> 00:01:05,500
It's crazy.

34
00:01:05,500 --> 00:01:06,820
So okay, so I think we need to break down

35
00:01:06,820 --> 00:01:08,060
this whole Densing Law thing.

36
00:01:08,060 --> 00:01:12,580
I read the paper and it's a little dense even for me.

37
00:01:12,580 --> 00:01:13,900
Where do we even begin?

38
00:01:13,900 --> 00:01:15,500
Well, I think the best place to start

39
00:01:15,500 --> 00:01:17,300
is with capacity density.

40
00:01:17,300 --> 00:01:20,580
Imagine you have like this giant toolbox

41
00:01:20,580 --> 00:01:22,540
overflowing with tools,

42
00:01:22,540 --> 00:01:24,980
but you only really need like a handful

43
00:01:24,980 --> 00:01:26,380
to build something amazing.

44
00:01:26,380 --> 00:01:27,220
Okay.

45
00:01:27,220 --> 00:01:29,260
That's kind of what's happening with these AI models.

46
00:01:29,260 --> 00:01:30,100
Yeah.

47
00:01:30,100 --> 00:01:31,780
They have tons of parameters

48
00:01:31,780 --> 00:01:34,060
which are like the tools in the toolbox,

49
00:01:34,060 --> 00:01:36,900
but they might not be using all of them effectively.

50
00:01:36,900 --> 00:01:40,140
Oh, so it's not just about having a huge AI model.

51
00:01:40,140 --> 00:01:43,740
It's how efficiently that model uses its resources.

52
00:01:43,740 --> 00:01:44,860
Yes, exactly.

53
00:01:44,860 --> 00:01:45,940
Interesting.

54
00:01:45,940 --> 00:01:48,780
So does that mean that a smaller model

55
00:01:48,780 --> 00:01:50,740
with a high capacity density

56
00:01:50,740 --> 00:01:53,060
could actually be more powerful than

57
00:01:53,060 --> 00:01:54,900
potentially a much larger model

58
00:01:54,900 --> 00:01:55,860
that's less efficient?

59
00:01:55,860 --> 00:01:56,700
You got it.

60
00:01:56,700 --> 00:01:59,220
And that's what this Densing Law is pointing to.
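A minimal sketch of the idea, assuming capacity density is the ratio between the parameter count a reference model would need to match a model's performance (its "effective" size) and the parameter count the model actually has. That ratio is our reading of the paper's definition; the function name and all numbers below are hypothetical and only for illustration.
```python
def capacity_density(effective_params: float, actual_params: float) -> float:
    """Ratio of effective parameter count to actual parameter count."""
    if actual_params <= 0:
        raise ValueError("actual_params must be positive")
    return effective_params / actual_params
# Hypothetical: a 2B-parameter model that performs like a 6B reference model.
small = capacity_density(effective_params=6e9, actual_params=2e9)
# Hypothetical: a 10B-parameter model that performs like a 5B reference model.
large = capacity_density(effective_params=5e9, actual_params=10e9)
# The smaller model is the denser one in this made-up comparison.
print(small, large)
```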

61
00:01:59,220 --> 00:02:00,060
Okay.

62
00:02:00,060 --> 00:02:03,580
The researchers found that the maximum capacity density

63
00:02:03,580 --> 00:02:05,580
of open source LLMs,

64
00:02:05,580 --> 00:02:07,540
those are the AI that power things

65
00:02:07,540 --> 00:02:10,220
like chatbots and text generators,

66
00:02:10,220 --> 00:02:11,980
is doubling about every three months.

67
00:02:11,980 --> 00:02:12,900
Every three months.

68
00:02:12,900 --> 00:02:14,860
Hold on, that's insanely fast.

69
00:02:14,860 --> 00:02:15,700
It is.

70
00:02:15,700 --> 00:02:18,180
What does it even mean for like someone like me

71
00:02:18,180 --> 00:02:19,820
who uses AI every day?

72
00:02:19,820 --> 00:02:20,740
Think of it this way.

73
00:02:20,740 --> 00:02:21,580
Okay.

74
00:02:21,580 --> 00:02:22,620
Remember Moore's Law.

75
00:02:22,620 --> 00:02:23,700
That's how many transistors

76
00:02:24,540 --> 00:02:25,380
Yes.

77
00:02:25,380 --> 00:02:26,220
fit on a chip.

78
00:02:25,380 --> 00:02:26,220
Exactly.

79
00:02:26,220 --> 00:02:27,860
It's about squeezing more and more transistors

80
00:02:27,860 --> 00:02:30,380
onto a chip, making computers more powerful.

81
00:02:30,380 --> 00:02:31,220
Right.

82
00:02:31,220 --> 00:02:32,660
And the Densing Law,

83
00:02:32,660 --> 00:02:34,940
it's doing something similar for AI.

84
00:02:34,940 --> 00:02:35,780
Wow.

85
00:02:35,780 --> 00:02:37,300
We're squeezing more intelligence

86
00:02:37,300 --> 00:02:39,260
into a smaller package.

87
00:02:39,260 --> 00:02:40,700
So you're saying in a few months,

88
00:02:40,700 --> 00:02:42,940
we could have the same performance

89
00:02:42,940 --> 00:02:45,180
as a large AI today.

90
00:02:45,180 --> 00:02:46,020
Yeah.

91
00:02:46,020 --> 00:02:47,620
But with a model half its size.

92
00:02:47,620 --> 00:02:49,180
Yeah, potentially.
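As a rough sketch of that arithmetic: if density doubles every three months (the rounded figure quoted in this conversation), the parameter count needed for the same performance halves over the same period. This assumes the trend holds exactly, which is an idealization; the function name and the 8B starting size are hypothetical.
```python
def params_for_same_performance(params_today: float, months: float,
                                doubling_months: float = 3.0) -> float:
    """Parameters needed later for equal performance if density keeps doubling."""
    return params_today / 2 ** (months / doubling_months)
# A hypothetical 8B-parameter model today, revisited after 3 and 12 months.
after_3 = params_for_same_performance(8e9, months=3)
after_12 = params_for_same_performance(8e9, months=12)
print(after_3, after_12)
```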

93
00:02:49,180 --> 00:02:51,420
Wait, so that means my phone could be running

94
00:02:51,420 --> 00:02:53,420
something as good as ChatGPT.

95
00:02:53,420 --> 00:02:54,260
Uh-huh.

96
00:02:54,260 --> 00:02:55,100
In the near future,

97
00:02:55,100 --> 00:02:55,920
Yeah.

98
00:02:55,920 --> 00:02:56,860
without, you know, killing my battery.

99
00:02:56,860 --> 00:02:57,980
That's the idea.

100
00:02:57,980 --> 00:02:58,940
That's a game changer.

101
00:02:58,940 --> 00:02:59,780
Yeah.

102
00:02:59,780 --> 00:03:01,340
This is the kind of future the research is hinting at.

103
00:03:01,340 --> 00:03:02,180
Yeah.

104
00:03:02,180 --> 00:03:03,340
And they have some pretty good evidence.

105
00:03:03,340 --> 00:03:07,300
Like this model called MiniCPM-1-2.4B

106
00:03:07,300 --> 00:03:09,420
released back in February.

107
00:03:09,420 --> 00:03:10,980
And it managed to do about the same

108
00:03:10,980 --> 00:03:13,380
as the Mistral 7B model.

109
00:03:13,380 --> 00:03:14,220
Which came out.

110
00:03:14,220 --> 00:03:16,140
Which came out in September of last year,

111
00:03:16,140 --> 00:03:19,420
but using only 35% of the parameters.

112
00:03:19,420 --> 00:03:20,260
Wow.

113
00:03:20,260 --> 00:03:22,100
That's a huge difference in size.
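A quick check of that comparison, using the nominal sizes in the two model names (2.4B and 7B) as stand-ins for exact parameter counts, which is an assumption. The ratio comes out near 34%, in line with the rounded 35% quoted here, and it corresponds to roughly one and a half doublings of density.
```python
import math
mini_params = 2.4e9     # nominal size from the model name, not an exact count
mistral_params = 7.0e9  # nominal size from the model name, not an exact count
fraction = mini_params / mistral_params
# How many times density would have to double to close that size gap.
doublings = math.log2(mistral_params / mini_params)
print(f"parameter fraction: {fraction:.0%}")
print(f"implied density doublings: {doublings:.2f}")
```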

114
00:03:22,100 --> 00:03:24,620
So the size of the model still matters.

115
00:03:24,620 --> 00:03:25,460
Of course.

116
00:03:25,460 --> 00:03:27,900
But this Densing Law is like suggesting

117
00:03:27,900 --> 00:03:29,780
that we could be entering a world

118
00:03:29,780 --> 00:03:32,420
where those massive models.

119
00:03:32,420 --> 00:03:33,260
Yeah.

120
00:03:33,260 --> 00:03:34,100
The power hungry ones.

121
00:03:34,100 --> 00:03:34,940
Right.

122
00:03:34,940 --> 00:03:35,780
Become a thing of the past.

123
00:03:35,780 --> 00:03:36,620
Exactly.

124
00:03:36,620 --> 00:03:38,300
It's not about raw size anymore.

125
00:03:38,300 --> 00:03:39,140
Yeah.

126
00:03:39,140 --> 00:03:40,060
It's about using your resources well.

127
00:03:40,060 --> 00:03:41,340
So smart AI.

128
00:03:41,340 --> 00:03:42,340
Yes.

129
00:03:42,340 --> 00:03:43,460
And you know what?

130
00:03:43,460 --> 00:03:45,620
This actually has some pretty big implications

131
00:03:45,620 --> 00:03:47,540
for the cost of running these AI models too.

132
00:03:47,540 --> 00:03:48,380
Oh really?

133
00:03:48,380 --> 00:03:49,220
Yeah.

134
00:03:49,220 --> 00:03:50,060
So we're talking more powerful, smaller,

135
00:03:50,060 --> 00:03:51,300
more efficient.

136
00:03:51,300 --> 00:03:52,460
And cheaper to run.

137
00:03:52,460 --> 00:03:53,300
Yeah.

138
00:03:53,300 --> 00:03:55,540
It's all thanks to this Densing Law.

139
00:03:55,540 --> 00:03:57,020
That almost sounds too good to be true.

140
00:03:57,020 --> 00:03:58,100
I know, right?

141
00:03:58,100 --> 00:03:59,100
What's the catch?

142
00:03:59,100 --> 00:04:00,500
Well, there are definitely some challenges

143
00:04:00,500 --> 00:04:02,180
that researchers are still working through.

144
00:04:02,180 --> 00:04:03,020
Like what?

145
00:04:03,020 --> 00:04:04,580
One of the big ones is figuring out

146
00:04:04,580 --> 00:04:06,420
how to measure AI model performance

147
00:04:06,420 --> 00:04:08,140
across different tasks.

148
00:04:08,140 --> 00:04:09,700
In a way that's fair.

149
00:04:09,700 --> 00:04:10,540
Okay.

150
00:04:10,540 --> 00:04:11,380
And comprehensive.

151
00:04:11,380 --> 00:04:13,740
So we gotta make sure we're comparing apples to apples.

152
00:04:13,740 --> 00:04:14,580
Exactly.

153
00:04:14,580 --> 00:04:15,980
When we talk about capacity density.

154
00:04:15,980 --> 00:04:16,820
Yes.

155
00:04:16,820 --> 00:04:17,660
That makes sense.

156
00:04:17,660 --> 00:04:18,500
Uh huh.

157
00:04:18,500 --> 00:04:20,940
So how is this Densing Law actually like playing out

158
00:04:20,940 --> 00:04:22,100
in the real world?

159
00:04:22,100 --> 00:04:25,660
Are there any examples besides MiniCPM?

160
00:04:25,660 --> 00:04:26,500
Oh yeah.

161
00:04:26,500 --> 00:04:27,900
There are actually quite a few.

162
00:04:27,900 --> 00:04:30,780
And researchers are finding that even in things like

163
00:04:30,780 --> 00:04:33,700
image recognition and natural language processing.

164
00:04:33,700 --> 00:04:34,540
Okay.

165
00:04:34,540 --> 00:04:36,460
Smaller models with high capacity density

166
00:04:36,460 --> 00:04:38,860
are starting to do as well

167
00:04:38,860 --> 00:04:40,300
as their larger counterparts.

168
00:04:40,300 --> 00:04:41,140
Oh wow.

169
00:04:41,140 --> 00:04:43,500
It's becoming pretty clear that this Densing Law

170
00:04:43,500 --> 00:04:44,980
isn't just some fluke.

171
00:04:44,980 --> 00:04:45,820
Yeah.

172
00:04:45,820 --> 00:04:48,340
It's a trend that could like fundamentally change

173
00:04:48,340 --> 00:04:50,420
how we design and develop AI.

174
00:04:50,420 --> 00:04:51,340
That's huge.

175
00:04:51,340 --> 00:04:55,420
But how are researchers actually getting these improvements

176
00:04:55,420 --> 00:04:56,460
in capacity density?

177
00:04:56,460 --> 00:04:57,500
It sounds like magic.

178
00:04:57,500 --> 00:04:58,660
It's not magic.

179
00:04:58,660 --> 00:05:00,220
It's really clever engineering.

180
00:05:00,220 --> 00:05:01,060
Okay.

181
00:05:01,060 --> 00:05:02,540
One of the things that is helping

182
00:05:02,540 --> 00:05:05,380
is the increasing scale and quality of the data

183
00:05:05,380 --> 00:05:07,220
used to train the models.

184
00:05:07,220 --> 00:05:10,100
So like if you wanna teach a kid a new language,

185
00:05:10,100 --> 00:05:11,980
you give them lots of books and conversations

186
00:05:11,980 --> 00:05:13,020
and experiences, right?

187
00:05:13,020 --> 00:05:13,860
Exactly.

188
00:05:13,860 --> 00:05:15,220
And it's the same with AI.

189
00:05:15,220 --> 00:05:17,140
The more high quality data you feed it,

190
00:05:17,140 --> 00:05:18,380
the better it learns.

191
00:05:18,380 --> 00:05:21,020
So it's like we're giving these AI models

192
00:05:21,020 --> 00:05:23,580
a bigger and better library to learn from.

193
00:05:23,580 --> 00:05:24,820
That's a great way to put it.

194
00:05:24,820 --> 00:05:25,660
Cool.

195
00:05:25,660 --> 00:05:27,100
And along with better data,

196
00:05:27,100 --> 00:05:29,300
we're also seeing some incredible innovations

197
00:05:29,300 --> 00:05:32,140
in the algorithms and model architectures.

198
00:05:32,140 --> 00:05:32,980
And those are...

199
00:05:32,980 --> 00:05:34,940
Like the brains of the AI system.

200
00:05:34,940 --> 00:05:35,780
Okay.

201
00:05:35,780 --> 00:05:37,260
Researchers are finding new ways

202
00:05:37,260 --> 00:05:39,220
to streamline these models.

203
00:05:39,220 --> 00:05:40,060
Okay.

204
00:05:40,060 --> 00:05:41,780
You know, pruning away unnecessary connections

205
00:05:41,780 --> 00:05:43,300
and optimizing how they think.

206
00:05:43,300 --> 00:05:44,780
So they can do more with less?

207
00:05:44,780 --> 00:05:45,620
Exactly.
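To make "pruning away unnecessary connections" concrete, here is a minimal sketch of magnitude pruning, one common approach. The conversation does not say which pruning method is meant, so this choice, the function name, and the toy weights are all assumptions for illustration.
```python
def magnitude_prune(weights: list[float], keep_fraction: float) -> list[float]:
    """Zero out all but the largest-magnitude weights."""
    if not 0.0 <= keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be between 0 and 1")
    keep = round(len(weights) * keep_fraction)
    # Indices ordered from largest to smallest absolute value.
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    kept = set(ranked[:keep])
    return [w if i in kept else 0.0 for i, w in enumerate(weights)]
# Hypothetical weights for a single tiny layer; keep the strongest half.
layer = [0.9, -0.05, 0.4, 0.01, -0.7, 0.02]
pruned = magnitude_prune(layer, keep_fraction=0.5)
print(pruned)
```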

208
00:05:45,620 --> 00:05:47,100
So it's smarter learning.

209
00:05:47,100 --> 00:05:48,220
Thanks to better data.

210
00:05:48,220 --> 00:05:49,060
Yes.

211
00:05:49,060 --> 00:05:50,060
And then more efficient thinking.

212
00:05:50,060 --> 00:05:50,900
Uh-huh.

213
00:05:50,900 --> 00:05:52,260
Because of better algorithms and architectures.

214
00:05:52,260 --> 00:05:53,100
Yeah, yeah.

215
00:05:53,100 --> 00:05:56,540
So they're like optimizing every aspect of the AI engine.

216
00:05:56,540 --> 00:05:57,460
Yes.

217
00:05:57,460 --> 00:05:59,700
And this is leading to a big shift

218
00:05:59,700 --> 00:06:02,100
in how we think about developing AI.

219
00:06:02,100 --> 00:06:02,940
In what way?

220
00:06:02,940 --> 00:06:05,100
The paper argues that we should move

221
00:06:05,100 --> 00:06:08,060
from a performance-centric approach.

222
00:06:08,060 --> 00:06:08,900
Okay.

223
00:06:08,900 --> 00:06:10,700
To a density-centric one.

224
00:06:10,700 --> 00:06:13,540
Okay, so instead of just chasing higher scores

225
00:06:13,540 --> 00:06:14,380
on benchmarks.

226
00:06:14,380 --> 00:06:15,220
Right.

227
00:06:15,220 --> 00:06:16,460
Which usually means bigger models.

228
00:06:16,460 --> 00:06:17,300
Exactly.

229
00:06:17,300 --> 00:06:21,420
To prioritize building models that hit those scores.

230
00:06:21,420 --> 00:06:22,260
Yeah.

231
00:06:22,260 --> 00:06:23,820
But use fewer resources.

232
00:06:23,820 --> 00:06:24,660
That's the idea.

233
00:06:24,660 --> 00:06:25,500
Makes sense.

234
00:06:25,500 --> 00:06:27,580
Because what's the point of super powerful AI

235
00:06:27,580 --> 00:06:30,340
if it's too expensive or uses too much energy to run?

236
00:06:30,340 --> 00:06:31,180
Exactly.

237
00:06:31,180 --> 00:06:33,260
It's like having a car that's super fast.

238
00:06:33,260 --> 00:06:34,100
Right.

239
00:06:34,100 --> 00:06:35,620
But it only gets like one mile per gallon.

240
00:06:35,620 --> 00:06:36,460
Yeah.

241
00:06:36,460 --> 00:06:37,940
It's about finding the right balance

242
00:06:37,940 --> 00:06:39,620
between performance and efficiency.

243
00:06:39,620 --> 00:06:40,460
Right.

244
00:06:40,460 --> 00:06:41,300
And it's interesting.

245
00:06:41,300 --> 00:06:42,380
The paper actually makes a connection

246
00:06:42,380 --> 00:06:45,660
between this trend in AI and Moore's Law.

247
00:06:45,660 --> 00:06:46,500
Moore's Law.

248
00:06:46,500 --> 00:06:47,820
Remind me, that's how many transistors

249
00:06:47,820 --> 00:06:48,820
you can fit on a chip, right?

250
00:06:48,820 --> 00:06:49,780
Yes.

251
00:06:49,780 --> 00:06:51,900
I mean, you can fit on a limited area.

252
00:06:51,900 --> 00:06:54,500
But how does that relate to AI?

253
00:06:54,500 --> 00:06:56,140
Well, think of it this way.

254
00:06:56,140 --> 00:06:59,260
Just as chip makers are focused on squeezing

255
00:06:59,260 --> 00:07:01,220
more transistors onto a chip.

256
00:07:01,220 --> 00:07:03,140
To make computers more powerful.

257
00:07:03,140 --> 00:07:06,460
AI developers are now trying to get more capacity density

258
00:07:06,460 --> 00:07:07,500
out of their models.

259
00:07:07,500 --> 00:07:10,380
So it's all about getting more bang for your buck

260
00:07:10,380 --> 00:07:12,180
in terms of computing resources.

261
00:07:12,180 --> 00:07:13,020
Yeah.

262
00:07:13,020 --> 00:07:15,380
So it's not just about making AI models bigger.

263
00:07:15,380 --> 00:07:16,220
Right.

264
00:07:16,220 --> 00:07:18,860
It's about making them smarter and more efficient.

265
00:07:18,860 --> 00:07:19,740
Exactly.

266
00:07:19,740 --> 00:07:21,260
Just like with computer chips.

267
00:07:21,260 --> 00:07:22,260
Wow.

268
00:07:22,260 --> 00:07:25,620
And this trend could make AI much more accessible.

269
00:07:25,620 --> 00:07:26,340
Yes.

270
00:07:26,340 --> 00:07:26,700
Absolutely.

271
00:07:26,700 --> 00:07:28,020
You mean like democratize it?

272
00:07:28,020 --> 00:07:29,260
That's exactly what I mean.

273
00:07:29,260 --> 00:07:32,780
If we can build smaller, more efficient models that

274
00:07:32,780 --> 00:07:37,140
still perform well, then AI tools

275
00:07:37,140 --> 00:07:39,220
could be available to way more people.

276
00:07:39,220 --> 00:07:39,940
Oh, wow.

277
00:07:39,940 --> 00:07:42,420
Think about students in developing countries

278
00:07:42,420 --> 00:07:45,380
using advanced AI tutors right on their phones.

279
00:07:45,380 --> 00:07:45,820
Right.

280
00:07:45,820 --> 00:07:48,420
Or small businesses using AI analytics

281
00:07:48,420 --> 00:07:50,220
to compete with bigger companies.

282
00:07:50,220 --> 00:07:51,860
That's a powerful vision.

283
00:07:51,860 --> 00:07:53,380
It's like leveling the playing field.

284
00:07:53,380 --> 00:07:53,740
Yeah.

285
00:07:53,740 --> 00:07:57,260
For anyone who wants to use AI, no matter what resources

286
00:07:57,260 --> 00:07:57,860
they have.

287
00:07:57,860 --> 00:07:59,100
Exactly.

288
00:07:59,100 --> 00:08:01,060
But are there any like limitations?

289
00:08:01,060 --> 00:08:02,100
That's a good question.

290
00:08:02,100 --> 00:08:04,620
Can we really make it that accessible

291
00:08:04,620 --> 00:08:07,140
without sacrificing how well it works.

292
00:08:07,140 --> 00:08:08,220
Well, that's the challenge.

293
00:08:08,220 --> 00:08:08,540
OK.

294
00:08:08,540 --> 00:08:11,740
One concern is making sure that these smaller models don't

295
00:08:11,740 --> 00:08:14,420
lose accuracy or the ability to generalize.

296
00:08:14,420 --> 00:08:16,260
Right, we don't want AI that's fast and cheap.

297
00:08:16,260 --> 00:08:16,740
Right.

298
00:08:16,740 --> 00:08:18,580
But can't handle difficult tasks.

299
00:08:18,580 --> 00:08:20,420
Yeah, or adapt to new situations.

300
00:08:20,420 --> 00:08:21,460
It's all about balance.

301
00:08:21,460 --> 00:08:22,100
It is.

302
00:08:22,100 --> 00:08:24,660
The paper mentioned some interesting future directions.

303
00:08:24,660 --> 00:08:25,460
Oh, yeah.

304
00:08:25,460 --> 00:08:27,780
Like this multimodal Densing Law.

305
00:08:27,780 --> 00:08:28,580
What is that?

306
00:08:28,580 --> 00:08:29,660
That's where it gets interesting.

307
00:08:29,660 --> 00:08:30,060
OK.

308
00:08:30,060 --> 00:08:33,820
Right now, this Densing Law is mainly about LLMs.

309
00:08:33,820 --> 00:08:35,100
Which are all about text.

310
00:08:35,100 --> 00:08:35,700
Yes.

311
00:08:35,700 --> 00:08:36,100
OK.

312
00:08:36,100 --> 00:08:39,060
But what if we could do the same thing for other AI models?

313
00:08:39,060 --> 00:08:39,460
OK.

314
00:08:39,460 --> 00:08:41,580
The ones that deal with images and sound.

315
00:08:41,580 --> 00:08:42,220
Oh, wow.

316
00:08:42,220 --> 00:08:46,100
Imagine AI that can see and hear and interact with the world.

317
00:08:46,100 --> 00:08:47,020
Almost like a human.

318
00:08:47,020 --> 00:08:47,540
Yeah.

319
00:08:47,540 --> 00:08:48,380
Yeah.

320
00:08:48,380 --> 00:08:52,260
Think about AI assistants that can understand you,

321
00:08:52,260 --> 00:08:55,420
but can also read your expressions.

322
00:08:55,420 --> 00:08:55,860
Oh.

323
00:08:55,860 --> 00:08:59,700
Or robots that can navigate using vision, touch, and sound.

324
00:08:59,700 --> 00:09:00,580
That's incredible.

325
00:09:00,580 --> 00:09:02,460
The possibilities are endless.

326
00:09:02,460 --> 00:09:04,540
Wow, that's mind-blowing.

327
00:09:04,540 --> 00:09:07,140
What about this other concept, the inference Densing Law?

328
00:09:07,140 --> 00:09:07,940
Oh, yeah.

329
00:09:07,940 --> 00:09:08,900
What's that about?

330
00:09:08,900 --> 00:09:11,180
That's another cool research area.

331
00:09:11,180 --> 00:09:15,100
It suggests that as AI gets better at reasoning

332
00:09:15,100 --> 00:09:19,900
and problem solving, it might need fewer thinking steps.

333
00:09:19,900 --> 00:09:21,260
So more efficient thinking.

334
00:09:21,260 --> 00:09:21,740
Exactly.

335
00:09:21,740 --> 00:09:24,660
Not just smaller size and energy usage,

336
00:09:24,660 --> 00:09:26,140
but actually thinking better.

337
00:09:26,140 --> 00:09:27,420
That's the idea.

338
00:09:27,420 --> 00:09:31,460
Imagine AI that can solve complex problems in an instant.

339
00:09:31,460 --> 00:09:32,500
Using minimal energy.

340
00:09:32,500 --> 00:09:32,980
Yes.

341
00:09:32,980 --> 00:09:34,780
Wow, that'd be a game changer.

342
00:09:34,780 --> 00:09:36,700
Like in scientific discovery or medicine.

343
00:09:36,700 --> 00:09:36,980
Yeah.

344
00:09:36,980 --> 00:09:38,740
Or even everyday decisions.

345
00:09:38,740 --> 00:09:40,660
We could have AI that's powerful.

346
00:09:40,660 --> 00:09:42,580
But also fast and efficient.

347
00:09:42,580 --> 00:09:43,780
Exactly.

348
00:09:43,780 --> 00:09:46,620
And while this is still pretty new,

349
00:09:46,620 --> 00:09:50,100
it could change how we think about AI completely.

350
00:09:50,100 --> 00:09:52,220
It does sound like we're on the edge of something big.

351
00:09:52,220 --> 00:09:53,300
Definitely.

352
00:09:53,300 --> 00:09:56,780
But with all this focus on efficiency and smaller models,

353
00:09:56,780 --> 00:10:00,340
does that mean the end for those huge AI systems?

354
00:10:00,340 --> 00:10:01,660
That's a good question.

355
00:10:01,660 --> 00:10:03,660
While smaller, more efficient models

356
00:10:03,660 --> 00:10:05,660
are definitely becoming more popular,

357
00:10:05,660 --> 00:10:09,420
there will always be a need for those large, powerful models

358
00:10:09,420 --> 00:10:10,340
for some things.

359
00:10:10,340 --> 00:10:12,140
So it's about picking the right tool for the job.

360
00:10:12,140 --> 00:10:13,580
Yes, exactly.

361
00:10:13,580 --> 00:10:16,060
Not just assuming smaller is always better.

362
00:10:16,060 --> 00:10:16,620
Right.

363
00:10:16,620 --> 00:10:18,140
And you know the world of AI development

364
00:10:18,140 --> 00:10:19,380
is getting more diverse.

365
00:10:19,380 --> 00:10:19,580
OK.

366
00:10:19,580 --> 00:10:22,500
With more models being made for specific purposes.

367
00:10:22,500 --> 00:10:24,100
So not a one size fits all anymore.

368
00:10:24,100 --> 00:10:24,740
Exactly.

369
00:10:24,740 --> 00:10:25,540
That makes sense.

370
00:10:25,540 --> 00:10:26,220
Yeah.

371
00:10:26,220 --> 00:10:29,140
So as we move towards this future of AI

372
00:10:29,140 --> 00:10:33,220
that's more accessible and efficient,

373
00:10:33,220 --> 00:10:36,060
are there things we should be thinking about as a society?

374
00:10:36,060 --> 00:10:37,140
Oh, that's a great question.

375
00:10:37,140 --> 00:10:39,140
Are there downsides to this trend?

376
00:10:39,140 --> 00:10:42,380
Well, one concern is that these smaller models

377
00:10:42,380 --> 00:10:44,260
could be misused.

378
00:10:44,260 --> 00:10:45,100
In what way?

379
00:10:45,100 --> 00:10:46,900
What if someone with bad intentions

380
00:10:46,900 --> 00:10:51,700
got a hold of a powerful AI that could run on a phone or laptop?

381
00:10:51,700 --> 00:10:52,580
Yeah, that's scary.

382
00:10:52,580 --> 00:10:53,620
That could be really bad.

383
00:10:53,620 --> 00:10:54,940
So how do we stop that?

384
00:10:54,940 --> 00:10:56,740
Can we even control this technology?

385
00:10:56,740 --> 00:11:00,020
It's tough, but it's something that researchers, policymakers,

386
00:11:00,020 --> 00:11:02,380
and industry leaders are all working on.

387
00:11:02,380 --> 00:11:02,820
OK.

388
00:11:02,820 --> 00:11:06,100
One way is to develop strong ethical guidelines

389
00:11:06,100 --> 00:11:08,860
and regulations for how AI is developed and used.

390
00:11:08,860 --> 00:11:09,260
Right.

391
00:11:09,260 --> 00:11:11,980
Another is to educate people about the potential risks

392
00:11:11,980 --> 00:11:12,620
and benefits.

393
00:11:12,620 --> 00:11:14,060
So they can make good choices.

394
00:11:14,060 --> 00:11:16,300
It sounds like this needs to be an ongoing conversation.

395
00:11:16,300 --> 00:11:17,340
Oh, definitely.

396
00:11:17,340 --> 00:11:19,100
AI is always changing.

397
00:11:19,100 --> 00:11:22,300
And we have to adapt as new challenges come up.

398
00:11:22,300 --> 00:11:24,220
Well, this has given me a lot to think about.

399
00:11:24,220 --> 00:11:24,740
Me too.

400
00:11:24,740 --> 00:11:26,660
And hopefully our listeners as well.

401
00:11:26,660 --> 00:11:30,420
It's clear that AI is moving towards efficiency

402
00:11:30,420 --> 00:11:31,460
and accessibility.

403
00:11:31,460 --> 00:11:32,260
Yes.

404
00:11:32,260 --> 00:11:35,700
But like any powerful technology,

405
00:11:35,700 --> 00:11:37,940
we need to be careful and make sure it's used for good.

406
00:11:37,940 --> 00:11:38,700
I agree.

407
00:11:38,700 --> 00:11:40,900
And its benefits are shared by all.

408
00:11:40,900 --> 00:11:41,500
Definitely.

409
00:11:41,500 --> 00:11:43,460
Definitely something to keep an eye on.

410
00:11:43,460 --> 00:11:45,460
And to our listeners, we encourage

411
00:11:45,460 --> 00:11:49,300
you to check out the paper Densing Law of LLMs.

412
00:11:49,300 --> 00:11:50,580
Yes.

413
00:11:50,580 --> 00:11:51,300
Good read.

414
00:11:51,300 --> 00:11:52,300
And learn more.

415
00:11:52,300 --> 00:11:53,260
It is.

416
00:11:53,260 --> 00:11:55,180
Thanks for joining us for another deep dive.

417
00:11:55,180 --> 00:12:09,340
See you.