1
00:00:00,000 --> 00:00:01,920
All right, get ready.

2
00:00:01,920 --> 00:00:05,120
Because today, we're tackling some serious math.

3
00:00:05,120 --> 00:00:06,640
Oh yeah, it's gonna be a fun one.

4
00:00:06,640 --> 00:00:09,040
You wanted to know if AI can solve

5
00:00:09,040 --> 00:00:10,720
really hard math problems.

6
00:00:10,720 --> 00:00:11,760
Yeah.

7
00:00:11,760 --> 00:00:13,080
And to answer that,

8
00:00:13,080 --> 00:00:15,320
we're doing a deep dive into FrontierMath.

9
00:00:15,320 --> 00:00:16,160
Okay.

10
00:00:16,160 --> 00:00:20,240
A brand new benchmark designed to test AI's limits.

11
00:00:20,240 --> 00:00:23,920
So like, how good is AI really at doing math?

12
00:00:23,920 --> 00:00:26,800
Yeah, think of it like the SATs for AI,

13
00:00:26,800 --> 00:00:27,880
but way, way harder.

14
00:00:27,880 --> 00:00:29,400
Like way harder.

15
00:00:29,400 --> 00:00:30,720
So what's really interesting here is that

16
00:00:30,720 --> 00:00:32,800
we're not just talking about like your typical

17
00:00:32,800 --> 00:00:34,000
high school algebra.

18
00:00:34,000 --> 00:00:36,200
No, no, no, this is the big leagues.

19
00:00:36,200 --> 00:00:38,400
FrontierMath is all about pushing AI

20
00:00:38,400 --> 00:00:40,680
beyond its comfort zone with problems

21
00:00:40,680 --> 00:00:43,400
that even experienced mathematicians find challenging.

22
00:00:43,400 --> 00:00:46,840
Yeah, stuff that even makes like seasoned math pros

23
00:00:46,840 --> 00:00:47,840
sweat a little.

24
00:00:47,840 --> 00:00:49,840
So it's kind of like giving a calculator

25
00:00:49,840 --> 00:00:51,400
a PhD level exam.

26
00:00:51,400 --> 00:00:52,520
That's a great way to put it.

27
00:00:52,520 --> 00:00:53,440
Is that a good analogy?

28
00:00:53,440 --> 00:00:54,280
It really is.

29
00:00:54,280 --> 00:00:55,120
Okay.

30
00:00:55,120 --> 00:00:57,320
Because the thing is, current AI benchmarks,

31
00:00:57,320 --> 00:00:58,720
they're getting kind of easy.

32
00:00:58,720 --> 00:00:59,360
Okay.

33
00:00:59,360 --> 00:01:03,560
AI is hitting near-perfect scores on tests like GSM8K

34
00:01:03,560 --> 00:01:05,920
or MATH, which means we need a new challenge, right?

35
00:01:05,920 --> 00:01:07,320
We need to push the limits.

36
00:01:07,320 --> 00:01:08,160
Yeah, yeah.

37
00:01:08,160 --> 00:01:10,560
So like AI's aced all the beginner levels

38
00:01:10,560 --> 00:01:12,320
and now we need the boss fight.

39
00:01:12,320 --> 00:01:13,440
Right, the boss fight, yeah.

40
00:01:13,440 --> 00:01:14,280
Exactly.

41
00:01:14,280 --> 00:01:15,120
Okay, that makes sense.

42
00:01:15,120 --> 00:01:17,280
And plus there's this whole thing about contamination.

43
00:01:17,280 --> 00:01:18,120
Contamination.

44
00:01:18,120 --> 00:01:21,520
Yeah, it means like AI might have seen similar problems

45
00:01:21,520 --> 00:01:23,560
when it was being trained, right?

46
00:01:23,560 --> 00:01:26,560
So it's not really solving them from scratch.

47
00:01:27,520 --> 00:01:28,440
FrontierMath,

48
00:01:28,440 --> 00:01:31,280
they try to avoid this by using all new problems.

49
00:01:31,280 --> 00:01:32,120
Yeah.

50
00:01:32,120 --> 00:01:33,840
Stuff that AI has never encountered before.

51
00:01:33,840 --> 00:01:36,440
So it's like making sure the AI isn't just like

52
00:01:36,440 --> 00:01:38,200
spitting out memorized answers.

53
00:01:38,200 --> 00:01:39,720
You got it.

54
00:01:39,720 --> 00:01:42,440
It's got to think deeply and creatively.

55
00:01:42,440 --> 00:01:44,680
Just like a human mathematician would.

56
00:01:44,680 --> 00:01:46,880
Okay, so how did they come up with these

57
00:01:46,880 --> 00:01:48,720
brain melting math problems?

58
00:01:48,720 --> 00:01:49,560
Yeah.

59
00:01:49,560 --> 00:01:52,080
Did they like lock a bunch of mathematicians

60
00:01:52,080 --> 00:01:54,120
in a room with no pizza until they came up

61
00:01:54,120 --> 00:01:56,360
with like truly evil problems?

62
00:01:56,360 --> 00:01:58,280
It wasn't quite that dramatic.

63
00:01:58,280 --> 00:02:01,840
But they did assemble this team of over 60 mathematicians.

64
00:02:01,840 --> 00:02:04,560
And this ranged from like grad students

65
00:02:04,560 --> 00:02:07,200
all the way up to some of the biggest names in math,

66
00:02:07,200 --> 00:02:08,040
you know?

67
00:02:08,040 --> 00:02:10,200
People who've won something called the Fields Medal.

68
00:02:10,200 --> 00:02:12,920
The Fields Medal, is that like the Nobel Prize of math?

69
00:02:12,920 --> 00:02:14,400
You could say that, yeah.

70
00:02:14,400 --> 00:02:16,960
It's one of the highest honors a mathematician can receive.

71
00:02:16,960 --> 00:02:19,920
So they've got like the best of the best.

72
00:02:19,920 --> 00:02:20,760
Absolutely.

73
00:02:20,760 --> 00:02:21,600
Coming up with these problems.

74
00:02:21,600 --> 00:02:23,120
And having that range of expertise,

75
00:02:23,120 --> 00:02:24,160
that was important, right?

76
00:02:24,160 --> 00:02:27,080
Because it made sure that FrontierMath covers

77
00:02:27,080 --> 00:02:29,680
like a huge spectrum of math.

78
00:02:29,680 --> 00:02:31,280
We're talking number theory,

79
00:02:31,280 --> 00:02:35,560
combinatorics, algebraic geometry, you name it.

80
00:02:35,560 --> 00:02:40,560
Wow, okay, so we've got these insanely hard problems

81
00:02:41,080 --> 00:02:44,400
created by like the smartest mathematicians in the world.

82
00:02:44,400 --> 00:02:46,120
But can you give us a taste?

83
00:02:46,120 --> 00:02:48,160
Like what do these problems actually look like?

84
00:02:48,160 --> 00:02:51,280
Absolutely, let's look at some examples from the paper.

85
00:02:51,280 --> 00:02:54,760
One problem, which is rated as high difficulty,

86
00:02:54,760 --> 00:02:57,800
involves figuring out how often a certain pattern

87
00:02:57,800 --> 00:02:59,440
shows up with prime numbers.

88
00:02:59,440 --> 00:03:00,280
Oh, okay.

89
00:03:00,280 --> 00:03:02,080
It sounds simple, but it actually connects

90
00:03:02,080 --> 00:03:04,800
to some of the biggest unsolved mysteries in math.

91
00:03:04,800 --> 00:03:07,440
Like something called Artin's conjecture.

92
00:03:07,440 --> 00:03:10,360
Wait, so even humans haven't figured these problems out?

93
00:03:10,360 --> 00:03:11,200
That's right.

94
00:03:11,200 --> 00:03:13,840
And that's part of what makes FrontierMath so cool, right?

95
00:03:13,840 --> 00:03:14,680
Yeah.

96
00:03:14,680 --> 00:03:16,440
It's not just about testing AI.

97
00:03:16,440 --> 00:03:18,960
It's about pushing the boundaries of like

98
00:03:18,960 --> 00:03:20,560
what we know as humans.

99
00:03:20,560 --> 00:03:21,520
That's pretty awesome.

100
00:03:21,520 --> 00:03:22,360
It is.

101
00:03:22,360 --> 00:03:24,000
Okay, so that's next level.

102
00:03:24,000 --> 00:03:25,400
What about another example?

103
00:03:25,400 --> 00:03:26,920
All right, another one.

104
00:03:26,920 --> 00:03:29,520
This is rated high medium difficulty.

105
00:03:29,520 --> 00:03:30,360
Okay.

106
00:03:30,360 --> 00:03:32,960
It asks AI to design a specific type

107
00:03:32,960 --> 00:03:34,640
of polynomial equation.

108
00:03:34,640 --> 00:03:35,480
Okay.

109
00:03:35,480 --> 00:03:38,720
And the solution, you have to visualize these like

110
00:03:38,720 --> 00:03:43,720
twisting and looping paths in a really abstract way.

111
00:03:43,800 --> 00:03:44,640
Whoa.

112
00:03:44,640 --> 00:03:46,680
It's not the kind of math you'd find in a textbook.

113
00:03:46,680 --> 00:03:48,920
No, so it's not just about crunching numbers.

114
00:03:48,920 --> 00:03:49,760
No.

115
00:03:49,760 --> 00:03:51,840
It's about like real creative thinking.

116
00:03:51,840 --> 00:03:52,840
Exactly.

117
00:03:52,840 --> 00:03:53,840
And there's more.

118
00:03:53,840 --> 00:03:54,680
Okay.

119
00:03:54,680 --> 00:03:57,200
Another problem uses something called p-adic numbers.

120
00:03:57,200 --> 00:03:58,720
P-adic numbers, huh?

121
00:03:58,720 --> 00:04:01,000
Which is a totally different way of thinking about numbers

122
00:04:01,000 --> 00:04:01,840
than what we're used to.

123
00:04:01,840 --> 00:04:04,080
It's like, uh, a whole new world of math.

124
00:04:04,080 --> 00:04:06,400
Okay, my brain is starting to hurt a little bit.

125
00:04:06,400 --> 00:04:07,880
But this is cool.

126
00:04:07,880 --> 00:04:11,560
So the big question is, how did AI actually do on these

127
00:04:11,560 --> 00:04:12,760
FrontierMath challenges?

128
00:04:12,760 --> 00:04:16,280
Well, let's just say it wasn't exactly a stellar performance.

129
00:04:16,280 --> 00:04:18,600
Oh no, did they completely crash and burn?

130
00:04:18,600 --> 00:04:21,360
None of the AI models they tested could solve more than

131
00:04:21,360 --> 00:04:23,320
2% of the problems.

132
00:04:23,320 --> 00:04:24,320
Wow.

133
00:04:24,320 --> 00:04:25,600
It was a real wake-up call.

134
00:04:25,600 --> 00:04:28,560
It shows that AI still has a long way to go before it can

135
00:04:28,560 --> 00:04:32,440
match, you know, real human mathematicians at this kind

136
00:04:32,440 --> 00:04:33,720
of hardcore math.

137
00:04:33,720 --> 00:04:35,240
Yeah, wow, that's a reality check.

138
00:04:35,240 --> 00:04:38,520
Yeah, it wasn't quite the AI takeover some people

139
00:04:38,520 --> 00:04:39,640
were maybe predicting.

140
00:04:39,640 --> 00:04:41,120
So a total wipeout.

141
00:04:41,120 --> 00:04:43,680
But I'm curious, which AI models were even brave enough

142
00:04:43,680 --> 00:04:45,560
to step into the ring on this one?

143
00:04:45,560 --> 00:04:49,240
Oh, they tested some of the heavy hitters like GPT-4,

144
00:04:49,240 --> 00:04:53,040
Claude, a couple different versions of OpenAI's

145
00:04:53,040 --> 00:04:58,040
o1 model, Grok 2, even Google DeepMind's Gemini.

146
00:04:58,560 --> 00:05:00,480
Whoa, so the all-star team of AI?

147
00:05:00,480 --> 00:05:04,200
Yeah, these are like the most advanced AI systems we've got.

148
00:05:04,200 --> 00:05:06,400
And they still got like completely stumped.

149
00:05:06,400 --> 00:05:09,120
Pretty much, yeah, FrontierMath was a whole other ball game.

150
00:05:09,120 --> 00:05:13,680
Okay, so no AI came close to solving like a decent chunk

151
00:05:13,680 --> 00:05:14,520
of the problems.

152
00:05:14,520 --> 00:05:17,800
But I'm curious, were there any like interesting patterns

153
00:05:17,800 --> 00:05:18,880
in how they approached it?

154
00:05:18,880 --> 00:05:21,840
Yeah, actually there were some interesting differences.

155
00:05:21,840 --> 00:05:25,320
Some AI like o1-preview and Gemini,

156
00:05:25,320 --> 00:05:28,160
they kind of just jumped right into answering

157
00:05:28,160 --> 00:05:31,480
without a whole lot of like exploration or experimenting.

158
00:05:31,480 --> 00:05:33,480
Yeah, they were like skimming the textbook

159
00:05:33,480 --> 00:05:35,880
and thinking they were ready for the exam

160
00:05:35,880 --> 00:05:37,760
without really understanding the material.

161
00:05:37,760 --> 00:05:39,760
Yeah, they seem to rely a lot

162
00:05:39,760 --> 00:05:43,720
on pattern recognition, quick calculations,

163
00:05:43,720 --> 00:05:47,920
but maybe not so much on that deep conceptual understanding.

164
00:05:47,920 --> 00:05:49,080
Okay, that makes sense.

165
00:05:49,080 --> 00:05:50,020
What about the others?

166
00:05:50,020 --> 00:05:52,200
Did any of them take a different approach?

167
00:05:52,200 --> 00:05:54,320
Yeah, what about the other AI models?

168
00:05:54,320 --> 00:05:57,840
Some models like Grok 2 seem to be more methodical.

169
00:05:57,840 --> 00:06:00,080
They ran more code experiments.

170
00:06:00,080 --> 00:06:00,920
Interesting.

171
00:06:00,920 --> 00:06:03,920
Tried different approaches like reflecting on the results

172
00:06:03,920 --> 00:06:05,480
before they gave a final answer.

173
00:06:05,480 --> 00:06:07,960
It's more like the student who's carefully working

174
00:06:07,960 --> 00:06:09,960
through the practice problems.

175
00:06:09,960 --> 00:06:10,800
You got it.

176
00:06:10,800 --> 00:06:11,880
Double checking their work.

177
00:06:11,880 --> 00:06:12,720
Exactly.

178
00:06:12,720 --> 00:06:15,720
Okay, so that's reflected in how much they actually used

179
00:06:15,720 --> 00:06:17,120
like computationally.

180
00:06:17,120 --> 00:06:17,960
It is, yeah.

181
00:06:17,960 --> 00:06:21,960
Interesting, like is one approach better than the other?

182
00:06:21,960 --> 00:06:22,920
That's a great question.

183
00:06:22,920 --> 00:06:26,560
And honestly, the research doesn't say for sure.

184
00:06:26,560 --> 00:06:29,200
It just points out this difference in strategy.

185
00:06:29,200 --> 00:06:32,280
Like this is something to pay attention to.

186
00:06:32,280 --> 00:06:33,120
Yeah.

187
00:06:33,120 --> 00:06:34,560
It's something to keep researching.

188
00:06:34,560 --> 00:06:36,760
It makes me wonder though,

189
00:06:36,760 --> 00:06:40,560
is it like how we're training these AIs,

190
00:06:40,560 --> 00:06:43,760
are we accidentally rewarding speed

191
00:06:43,760 --> 00:06:45,400
over that deeper thinking?

192
00:06:45,400 --> 00:06:46,360
That's a really good point.

193
00:06:46,360 --> 00:06:48,160
Like is that built into how we're making them?

194
00:06:48,160 --> 00:06:50,440
It leads to an even bigger question, right?

195
00:06:50,440 --> 00:06:53,720
Like how do we evaluate AI in the first place?

196
00:06:53,720 --> 00:06:54,680
Oh, interesting.

197
00:06:54,680 --> 00:06:57,440
Because as these systems get more complex,

198
00:06:57,440 --> 00:07:00,300
maybe just looking at did they get the right answer,

199
00:07:00,300 --> 00:07:01,680
that's not enough anymore.

200
00:07:01,680 --> 00:07:02,520
Right, right.

201
00:07:02,520 --> 00:07:05,040
We gotta look at the process, the reasoning, the strategy.

202
00:07:05,040 --> 00:07:07,800
So it's not just like did they get an A on the test?

203
00:07:07,800 --> 00:07:09,320
It's like how are they learning?

204
00:07:09,320 --> 00:07:10,160
How are they thinking?

205
00:07:10,160 --> 00:07:10,980
Exactly.

206
00:07:10,980 --> 00:07:13,280
Are they learning to think like really deeply

207
00:07:13,280 --> 00:07:15,500
and creatively like a mathematician?

208
00:07:15,500 --> 00:07:16,880
Or are they just getting really good

209
00:07:16,880 --> 00:07:20,240
at finding shortcuts and mimicking patterns

210
00:07:20,240 --> 00:07:22,200
without truly understanding?

211
00:07:22,200 --> 00:07:23,560
Right, without the deeper knowledge.

212
00:07:23,560 --> 00:07:25,600
And those are big questions, right?

213
00:07:25,600 --> 00:07:29,400
And FrontierMath, it really forces us to think about that.

214
00:07:29,400 --> 00:07:31,400
So it's not just a test for AI.

215
00:07:31,400 --> 00:07:32,240
No.

216
00:07:32,240 --> 00:07:33,040
It's a test for us,

217
00:07:33,040 --> 00:07:34,840
for how we're even approaching this whole thing.

218
00:07:34,840 --> 00:07:35,680
I'd say so, yeah,

219
00:07:35,680 --> 00:07:37,760
it's a challenge for the whole AI field.

220
00:07:37,760 --> 00:07:40,240
You know, we've been talking about the AI struggles,

221
00:07:40,240 --> 00:07:42,080
but I'm kinda curious,

222
00:07:42,080 --> 00:07:45,280
did the mathematicians who designed these problems,

223
00:07:45,280 --> 00:07:46,800
did they find them hard too?

224
00:07:46,800 --> 00:07:48,160
Oh, absolutely.

225
00:07:48,160 --> 00:07:51,760
They interviewed several Fields Medalists.

226
00:07:51,760 --> 00:07:54,440
Oh, remember, those are the top mathematicians in the world.

227
00:07:54,440 --> 00:07:55,480
Yeah, the best of the best.

228
00:07:55,480 --> 00:07:56,560
And even a top coach

229
00:07:56,560 --> 00:07:59,240
for the International Mathematical Olympiad.

230
00:07:59,240 --> 00:08:00,480
Wow, okay, so this is-

231
00:08:00,480 --> 00:08:02,480
Like the Olympics of math competitions

232
00:08:02,480 --> 00:08:03,880
for high school students.

233
00:08:03,880 --> 00:08:04,720
Next level.

234
00:08:04,720 --> 00:08:05,540
Yeah.

235
00:08:05,540 --> 00:08:06,880
What did they think about this?

236
00:08:06,880 --> 00:08:10,680
They all agreed the problems were really hard,

237
00:08:10,680 --> 00:08:11,840
even for them.

238
00:08:11,840 --> 00:08:13,280
Okay, good, I feel a little better

239
00:08:13,280 --> 00:08:14,720
about my own math skills now.

240
00:08:14,720 --> 00:08:16,960
Huh, that's reassuring.

241
00:08:16,960 --> 00:08:18,360
It wasn't just the concepts,

242
00:08:18,360 --> 00:08:21,240
it was the creative thinking, the problem solving.

243
00:08:21,240 --> 00:08:23,120
Like that was the tricky part.

244
00:08:23,120 --> 00:08:27,480
They did point out that the way FrontierMath is set up,

245
00:08:27,480 --> 00:08:29,680
you know, giving numerical answers

246
00:08:29,680 --> 00:08:32,280
instead of formal proofs,

247
00:08:32,280 --> 00:08:35,480
that's not really typical for math research.

248
00:08:35,480 --> 00:08:38,000
But it still needs that deep understanding,

249
00:08:38,000 --> 00:08:40,920
that ability to connect different ideas.

250
00:08:40,920 --> 00:08:42,280
So it's still real math,

251
00:08:42,280 --> 00:08:43,880
it's just a different way of showing it.

252
00:08:43,880 --> 00:08:45,160
Yeah, exactly.

253
00:08:45,160 --> 00:08:47,560
So then they asked these mathematicians,

254
00:08:47,560 --> 00:08:51,960
when do you think AI will be able to crack FrontierMath?

255
00:08:51,960 --> 00:08:52,800
Okay.

256
00:08:52,800 --> 00:08:54,720
And their estimates were,

257
00:08:54,720 --> 00:08:56,800
well, let's just say they weren't holding their breath.

258
00:08:56,800 --> 00:08:59,200
Like years, decades?

259
00:08:59,200 --> 00:09:00,960
Yeah, that seemed to be the consensus.

260
00:09:00,960 --> 00:09:02,400
Wow, that's a long time.

261
00:09:02,400 --> 00:09:04,960
It is, but here's where things get interesting.

262
00:09:04,960 --> 00:09:05,800
Okay.

263
00:09:05,800 --> 00:09:07,160
They were much more optimistic

264
00:09:07,160 --> 00:09:09,040
about human-AI collaboration.

265
00:09:09,040 --> 00:09:11,560
Oh, tell me more about that, what did they picture?

266
00:09:11,560 --> 00:09:15,000
They were thinking, you know, AI could be a really powerful

267
00:09:15,000 --> 00:09:16,520
research assistant.

268
00:09:16,520 --> 00:09:19,200
Helping mathematicians explore ideas,

269
00:09:19,200 --> 00:09:21,800
test things out, verify calculations.

270
00:09:21,800 --> 00:09:23,400
But like a souped up calculator.

271
00:09:23,400 --> 00:09:24,240
Yeah.

272
00:09:24,240 --> 00:09:26,080
But that can also like brainstorm with you.

273
00:09:26,080 --> 00:09:26,920
Exactly.

274
00:09:26,920 --> 00:09:29,400
And they thought that kind of team humans

275
00:09:29,400 --> 00:09:31,560
and AI working together,

276
00:09:31,560 --> 00:09:35,560
they could make progress on FrontierMath way sooner

277
00:09:35,560 --> 00:09:36,400
Oh, interesting.

278
00:09:36,400 --> 00:09:38,520
Than AI could on its own.

279
00:09:38,520 --> 00:09:39,960
So we don't have to wait for AI

280
00:09:39,960 --> 00:09:41,920
to become some kind of math genius.

281
00:09:41,920 --> 00:09:42,760
Right.

282
00:09:42,760 --> 00:09:43,760
Before it can actually be useful,

283
00:09:43,760 --> 00:09:45,240
we can start working with it now.

284
00:09:45,240 --> 00:09:46,120
Exactly.

285
00:09:46,120 --> 00:09:46,960
That's cool.

286
00:09:46,960 --> 00:09:49,440
It is, but for this to actually work,

287
00:09:49,440 --> 00:09:51,720
we need the right tools, the right way for them

288
00:09:51,720 --> 00:09:52,920
to talk to each other.

289
00:09:52,920 --> 00:09:53,760
Yeah.

290
00:09:53,760 --> 00:09:56,360
AI needs to understand human instructions.

291
00:09:56,360 --> 00:09:59,160
And humans need to be able to guide the AI's thinking.

292
00:09:59,160 --> 00:10:01,160
Like learning to speak each other's language.

293
00:10:01,160 --> 00:10:02,720
I like that, yeah, building a bridge

294
00:10:02,720 --> 00:10:03,880
between those two minds.

295
00:10:03,880 --> 00:10:04,720
Yeah.

296
00:10:04,720 --> 00:10:05,560
And that's a challenge in itself.

297
00:10:05,560 --> 00:10:06,400
It is.

298
00:10:06,400 --> 00:10:08,080
But I like the idea of like teamwork.

299
00:10:08,080 --> 00:10:09,120
Yeah.

300
00:10:09,120 --> 00:10:10,200
Like powerful stuff.

301
00:10:10,200 --> 00:10:12,840
So I'm sold on this team idea.

302
00:10:12,840 --> 00:10:13,680
Yeah.

303
00:10:13,680 --> 00:10:16,000
But what are some of the like practical hurdles

304
00:10:16,000 --> 00:10:18,280
we've got to clear to actually make this happen?

305
00:10:18,280 --> 00:10:20,520
Well, for one thing, AI models right now,

306
00:10:20,520 --> 00:10:22,160
they're still pretty limited.

307
00:10:22,160 --> 00:10:23,000
Okay.

308
00:10:23,000 --> 00:10:25,040
When it comes to this really hardcore math,

309
00:10:25,040 --> 00:10:26,640
even with human guidance.

310
00:10:26,640 --> 00:10:27,480
Right.

311
00:10:27,480 --> 00:10:29,520
It might be tough for them to make a lot of progress.

312
00:10:29,520 --> 00:10:31,960
So it's like having a brilliant,

313
00:10:31,960 --> 00:10:33,720
but inexperienced assistant.

314
00:10:33,720 --> 00:10:34,560
Exactly.

315
00:10:34,560 --> 00:10:36,920
Tons of potential, but still needs a lot of training.

316
00:10:36,920 --> 00:10:39,080
And that training, like we talked about before,

317
00:10:39,080 --> 00:10:41,800
it's hard because there's not enough data.

318
00:10:41,800 --> 00:10:42,640
Right.

319
00:10:42,640 --> 00:10:44,400
In these really specific areas of math.

320
00:10:44,400 --> 00:10:46,480
So we need more data, is that what you're saying?

321
00:10:46,480 --> 00:10:47,320
We do.

322
00:10:47,320 --> 00:10:51,480
And we also need better ways to create synthetic data.

323
00:10:51,480 --> 00:10:52,320
Okay.

324
00:10:52,320 --> 00:10:53,920
That actually captures how complex

325
00:10:53,920 --> 00:10:55,520
these math problems are.

326
00:10:55,520 --> 00:10:56,360
Okay.

327
00:10:56,360 --> 00:10:57,280
So data is one thing,

328
00:10:57,280 --> 00:10:59,120
but what about the computing power, right?

329
00:10:59,120 --> 00:10:59,960
Yeah.

330
00:10:59,960 --> 00:11:00,800
We mentioned earlier,

331
00:11:00,800 --> 00:11:04,920
it could take days on Google servers to solve one problem.

332
00:11:04,920 --> 00:11:07,400
It's not really practical if you're trying to do research.

333
00:11:07,400 --> 00:11:08,240
You're right.

334
00:11:08,240 --> 00:11:10,000
We need more efficient algorithms,

335
00:11:10,000 --> 00:11:11,880
better hardware that can handle

336
00:11:11,880 --> 00:11:13,960
these really intense calculations

337
00:11:13,960 --> 00:11:16,200
without breaking the bank.

338
00:11:16,200 --> 00:11:17,040
Right.

339
00:11:17,040 --> 00:11:17,880
Without taking forever.

340
00:11:17,880 --> 00:11:18,720
Exactly.

341
00:11:18,720 --> 00:11:19,960
So it's a lot of different challenges all at the same time.

342
00:11:19,960 --> 00:11:21,680
It is, but the good news is

343
00:11:21,680 --> 00:11:24,400
there's a ton of research happening in all these areas.

344
00:11:24,400 --> 00:11:25,240
Okay. That's good.

345
00:11:25,240 --> 00:11:26,080
There's some hope then.

346
00:11:26,080 --> 00:11:26,920
Yeah, I think so.

347
00:11:26,920 --> 00:11:28,680
It seems like FrontierMath has been

348
00:11:28,680 --> 00:11:31,960
like a shakeup for the AI world.

349
00:11:31,960 --> 00:11:32,800
It has.

350
00:11:32,800 --> 00:11:36,600
It's forcing everyone to rethink how they're doing things.

351
00:11:36,600 --> 00:11:39,040
It's a wake-up call and a call to action.

352
00:11:39,040 --> 00:11:41,800
It reminds us that true AI intelligence,

353
00:11:41,800 --> 00:11:43,520
we're not there yet,

354
00:11:43,520 --> 00:11:46,000
but it shows us the way forward.

355
00:11:46,000 --> 00:11:49,000
How to get to that future where humans and AI

356
00:11:49,000 --> 00:11:53,320
are working together to solve these amazing math problems.

357
00:11:53,320 --> 00:11:55,360
Cool. This has been a really interesting journey.

358
00:11:55,360 --> 00:11:57,520
We started with a pretty simple question.

359
00:11:57,520 --> 00:11:59,520
Can AI solve really hard math?

360
00:11:59,520 --> 00:12:00,360
Yeah.

361
00:12:00,360 --> 00:12:04,400
And we've ended up exploring so much more.

362
00:12:04,400 --> 00:12:05,240
We have.

363
00:12:05,240 --> 00:12:06,760
The limits of AI, human-AI teams.

364
00:12:06,760 --> 00:12:08,600
What even is AI intelligence?

365
00:12:08,600 --> 00:12:09,440
Yeah.

366
00:12:09,440 --> 00:12:10,280
It's been a lot.

367
00:12:10,280 --> 00:12:12,360
And we need more data, more computing power.

368
00:12:12,360 --> 00:12:13,360
It's a big challenge.

369
00:12:13,360 --> 00:12:14,800
Yeah, it really is.

370
00:12:14,800 --> 00:12:18,000
But what's amazing to me is the sense of possibility.

371
00:12:18,000 --> 00:12:18,840
Yeah.

372
00:12:18,840 --> 00:12:19,680
AI is still so,

373
00:12:19,680 --> 00:12:22,400
we're just starting to understand what it can do.

374
00:12:22,400 --> 00:12:23,240
Right.

375
00:12:23,240 --> 00:12:25,280
Who knows what crazy discoveries are waiting for us

376
00:12:25,280 --> 00:12:27,160
as AI gets smarter.

377
00:12:27,160 --> 00:12:31,240
Maybe AI will not only solve the problems in FrontierMath,

378
00:12:31,240 --> 00:12:34,200
but also help us discover new math concepts

379
00:12:34,200 --> 00:12:35,560
that we haven't even thought of yet.

380
00:12:35,560 --> 00:12:36,880
I wouldn't bet against it.

381
00:12:36,880 --> 00:12:37,720
That would be amazing.

382
00:12:37,720 --> 00:12:40,200
It would be a sign that AI isn't just copying us.

383
00:12:40,200 --> 00:12:41,040
Right.

384
00:12:41,040 --> 00:12:43,280
It's coming up with its own understanding of math.

385
00:12:43,280 --> 00:12:46,520
Okay, well, sadly, that's all the time we have

386
00:12:46,520 --> 00:12:47,720
for today's deep dive.

387
00:12:47,720 --> 00:12:48,800
Oh!

388
00:12:48,800 --> 00:12:49,840
Thanks for coming along with us

389
00:12:49,840 --> 00:12:51,920
on this journey into AI and math.

390
00:12:51,920 --> 00:12:53,000
It's been fun.

391
00:12:53,000 --> 00:12:54,080
It really has.

392
00:12:54,080 --> 00:12:56,520
Until next time, keep those minds curious

393
00:12:56,520 --> 00:12:58,680
and keep on diving deep.

394
00:12:58,680 --> 00:13:01,440
Welcome back for the final part of our deep dive

395
00:13:01,440 --> 00:13:02,880
into FrontierMath.

396
00:13:02,880 --> 00:13:04,080
It's been a wild ride.

397
00:13:04,080 --> 00:13:04,920
It has.

398
00:13:04,920 --> 00:13:07,800
We started off realizing that AI isn't quite ready

399
00:13:07,800 --> 00:13:09,440
to take over the math world,

400
00:13:09,440 --> 00:13:12,480
but then we saw the potential of these human-AI teams

401
00:13:12,480 --> 00:13:14,240
pushing the limits of what we know.

402
00:13:14,240 --> 00:13:15,400
And we also talked about,

403
00:13:15,400 --> 00:13:17,480
what even is AI intelligence?

404
00:13:17,480 --> 00:13:18,320
Yeah.

405
00:13:18,320 --> 00:13:20,040
It's not just about getting the right answer anymore.

406
00:13:20,040 --> 00:13:22,600
No, it's about the process, the reasoning,

407
00:13:22,600 --> 00:13:24,480
the creativity behind it all.

408
00:13:24,480 --> 00:13:26,360
So as we wrap things up, I'm curious.

409
00:13:26,360 --> 00:13:27,200
Yeah.

410
00:13:27,200 --> 00:13:29,520
What do you think are the bigger implications of all this?

411
00:13:29,520 --> 00:13:30,640
It's a good question.

412
00:13:30,640 --> 00:13:34,160
What does FrontierMath tell us about AI as a whole

413
00:13:34,160 --> 00:13:35,720
and where it's going?

414
00:13:35,720 --> 00:13:37,680
Well, I think it's both a reality check

415
00:13:37,680 --> 00:13:39,360
and a roadmap, you know what I mean?

416
00:13:39,360 --> 00:13:41,800
Like on the one hand, it shows us the limits

417
00:13:41,800 --> 00:13:43,600
of what AI can do right now,

418
00:13:43,600 --> 00:13:45,680
especially with that abstract reasoning,

419
00:13:45,680 --> 00:13:48,880
creative problem solving, which is important for math,

420
00:13:48,880 --> 00:13:50,880
but also for so many other fields too.

421
00:13:50,880 --> 00:13:53,000
So it's telling us that we haven't reached

422
00:13:53,000 --> 00:13:56,760
that like true artificial general intelligence yet.

423
00:13:56,760 --> 00:13:58,760
Right, that AI that can think like a human

424
00:13:58,760 --> 00:14:00,240
across all these different areas.

425
00:14:00,240 --> 00:14:01,320
Okay, that makes sense.

426
00:14:01,320 --> 00:14:04,520
But on the other hand, FrontierMath gives us clues

427
00:14:04,520 --> 00:14:06,760
about where AI needs to get better.

428
00:14:06,760 --> 00:14:08,000
Okay, I like that.

429
00:14:08,000 --> 00:14:10,800
By seeing how AI struggled with these problems,

430
00:14:10,800 --> 00:14:13,440
researchers can figure out the weak spots, right?

431
00:14:13,440 --> 00:14:15,800
And focus on making new algorithms,

432
00:14:15,800 --> 00:14:17,800
new ways of training these systems.

433
00:14:17,800 --> 00:14:22,400
So it's like a diagnostic tool for the whole field of AI.

434
00:14:22,400 --> 00:14:25,400
Okay, and this isn't just about math problems then, right?

435
00:14:25,400 --> 00:14:26,480
No, it's bigger than that.

436
00:14:26,480 --> 00:14:28,120
Because the skills we're talking about,

437
00:14:28,120 --> 00:14:31,760
abstract reasoning, logic, creativity,

438
00:14:31,760 --> 00:14:33,560
those are important for everything.

439
00:14:33,560 --> 00:14:36,160
Absolutely, if AI can master that stuff,

440
00:14:36,160 --> 00:14:38,120
it can change so many things.

441
00:14:38,120 --> 00:14:42,320
So the lessons from FrontierMath could ripple out

442
00:14:42,320 --> 00:14:44,440
and affect all kinds of AI research.

443
00:14:44,440 --> 00:14:45,400
I think so, yeah.

444
00:14:45,400 --> 00:14:48,960
From science and engineering to maybe even art.

445
00:14:48,960 --> 00:14:52,160
It's pushing the boundaries of what AI can do.

446
00:14:52,160 --> 00:14:54,160
And it's making everyone aim higher.

447
00:14:54,160 --> 00:14:55,000
I love that.

448
00:14:55,000 --> 00:14:57,440
It's not just about making AI good at math,

449
00:14:57,440 --> 00:15:00,920
it's about using math to create a whole new kind of AI.

450
00:15:00,920 --> 00:15:01,760
That's so cool.

451
00:15:01,760 --> 00:15:04,960
One that's more powerful, more flexible, more like us.

452
00:15:04,960 --> 00:15:07,200
Well, this deep dive has been mind-blowing.

453
00:15:07,200 --> 00:15:08,080
I've enjoyed it.

454
00:15:08,080 --> 00:15:10,000
We went from a simple question,

455
00:15:10,000 --> 00:15:12,520
can AI solve hard math?

456
00:15:12,520 --> 00:15:13,360
Yeah.

457
00:15:13,360 --> 00:15:15,000
To talking about the future of AI.

458
00:15:15,000 --> 00:15:16,280
It's all connected.

459
00:15:16,280 --> 00:15:17,120
It really is.

460
00:15:17,120 --> 00:15:19,040
It's been awesome sharing this journey with you

461
00:15:19,040 --> 00:15:20,080
and with everyone listening.

462
00:15:20,080 --> 00:15:20,920
It has.

463
00:15:20,920 --> 00:15:22,600
I hope it's made everyone a little more curious

464
00:15:22,600 --> 00:15:23,440
about this stuff.

465
00:15:23,440 --> 00:15:24,280
Me too.

466
00:15:24,280 --> 00:15:26,040
And I think that's a perfect note to end on.

467
00:15:26,040 --> 00:15:26,880
Yeah.

468
00:15:26,880 --> 00:15:29,840
Thanks for joining us on this exploration of AI and math.

469
00:15:29,840 --> 00:15:30,720
It's been a pleasure.

470
00:15:30,720 --> 00:15:31,680
It really has.

471
00:15:31,680 --> 00:15:34,720
Until next time, keep those minds questioning,

472
00:15:34,720 --> 00:15:37,600
keep exploring, and keep diving deep

473
00:15:37,600 --> 00:15:39,600
into this amazing world of knowledge.

474
00:15:39,600 --> 00:15:58,600
Couldn't have said it better myself.

