WEBVTT

00:00:00.000 --> 00:00:02.020
So here's something kind of paradoxical to chew

00:00:02.020 --> 00:00:05.360
on. Imagine running an AI model. It knows as

00:00:05.360 --> 00:00:07.320
much as like a massive one trillion parameter

00:00:07.320 --> 00:00:11.019
machine, but it runs with the speed, the latency,

00:00:11.140 --> 00:00:14.380
the cost of a model that's, what, 30 times smaller?

00:00:14.560 --> 00:00:17.980
Only 32 billion parameters active? That kind

00:00:17.980 --> 00:00:20.359
of efficiency. Right. It's genuinely stunning

00:00:20.359 --> 00:00:22.120
from an engineering perspective. Right. And this

00:00:22.120 --> 00:00:24.440
isn't vaporware, not some research paper dream.

00:00:24.559 --> 00:00:28.600
It's Kimi K2 Thinking, built by Moonshot AI. And

00:00:28.600 --> 00:00:30.980
it's the open source challenger that's, well,

00:00:31.079 --> 00:00:33.420
it's really shaking things up because independent

00:00:33.420 --> 00:00:35.719
leaderboards, they now show it sitting globally

00:00:35.719 --> 00:00:39.100
at number two, just right behind GPT-5. Welcome

00:00:39.100 --> 00:00:41.140
to the deep dive. Today we're taking a close

00:00:41.140 --> 00:00:42.679
look, a measured look, at the source material

00:00:42.679 --> 00:00:45.520
we have on Kimi K2 and, you know, what this really

00:00:45.520 --> 00:00:47.679
represents for the whole AI landscape. It feels

00:00:47.679 --> 00:00:49.939
like a pretty big shift. Yeah, our mission today

00:00:49.939 --> 00:00:52.039
is kind of threefold. First, we're going to really

00:00:52.039 --> 00:00:55.340
dig into the architecture, the engine under the

00:00:55.340 --> 00:00:58.179
hood, right? What makes Kimi K2 so powerful but

00:00:58.179 --> 00:01:01.000
also so efficient, so affordable? Second, we'll

00:01:01.000 --> 00:01:03.560
look at the benchmarks. The proof, essentially,

00:01:03.740 --> 00:01:05.939
that open source isn't just catching up anymore.

00:01:06.060 --> 00:01:08.140
In some ways, it's actually pulling ahead of

00:01:08.140 --> 00:01:10.900
the big closed source players. And finally, we

00:01:10.900 --> 00:01:13.040
need to unpack what this means, the implications

00:01:13.040 --> 00:01:17.239
for things like data privacy, cost control, corporate

00:01:17.239 --> 00:01:19.519
independence even. It's pretty revolutionary

00:01:19.519 --> 00:01:21.420
stuff. Okay, let's start with that core tech

00:01:21.420 --> 00:01:24.019
then. The big question is how. How does Moonshot

00:01:24.019 --> 00:01:27.819
AI... get this like massive power, but keep the

00:01:27.819 --> 00:01:30.299
efficiency so incredibly high. It really comes

00:01:30.299 --> 00:01:31.959
down to the architecture. They're using what's

00:01:31.959 --> 00:01:34.500
called a Mixture of Experts model, an MoE. So

00:01:34.500 --> 00:01:36.819
instead of one giant brain that like lights up

00:01:36.819 --> 00:01:38.599
everywhere for every query, think of it more

00:01:38.599 --> 00:01:40.620
like a huge team, a team of very specialized

00:01:40.620 --> 00:01:43.599
experts. Ah, okay. So if I ask a really specific

00:01:43.599 --> 00:01:46.299
complex question, maybe something technical about,

00:01:46.379 --> 00:01:48.519
I don't know, protein folding, the system is

00:01:48.519 --> 00:01:50.840
smart enough to route my query only to the few

00:01:50.840 --> 00:01:53.340
experts who actually know about protein folding.

00:01:54.299 --> 00:01:56.739
And the rest kind of stay quiet. Exactly. That's

00:01:56.739 --> 00:01:59.849
the core idea. In simple terms, Moe is a vast

00:01:59.849 --> 00:02:02.390
network that only lights up key sections. It

00:02:02.390 --> 00:02:04.930
achieves what they call sparsity. And that sparsity,

00:02:05.030 --> 00:02:06.569
the fact that most of the network isn't being

00:02:06.569 --> 00:02:09.469
used for any given task, that's the key to the

00:02:09.469 --> 00:02:11.409
efficiency numbers. So the total model size,

00:02:11.669 --> 00:02:15.349
the whole knowledge base is indeed that massive

00:02:15.349 --> 00:02:17.969
1 trillion parameters. Right up there with the

00:02:17.969 --> 00:02:19.770
biggest models out there. But, and this is the

00:02:19.770 --> 00:02:22.310
kicker: for any single request you make, the activated

00:02:22.310 --> 00:02:24.469
parameters, the parts that actually spin up and

00:02:24.469 --> 00:02:28.520
generate the answer, only 32 billion. That's

00:02:28.520 --> 00:02:31.080
just 3.2% of the network active. So you're getting this like incredibly

00:02:31.080 --> 00:02:34.819
deep knowledge base, but the speed, the cost,

00:02:34.960 --> 00:02:37.000
it feels like you're running a much, much smaller

00:02:37.000 --> 00:02:39.620
model. That's a huge strategic advantage, isn't

00:02:39.620 --> 00:02:41.539
it? Just from an engineering and cost perspective,

00:02:41.659 --> 00:02:43.419
it completely changes the deployment economics.

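To make the sparsity arithmetic the hosts describe concrete, here is a minimal back-of-the-envelope sketch. The 1 trillion total and 32 billion active figures come straight from the discussion; the assumption that per-query compute scales with active parameters is a simplification for illustration, not Moonshot's published cost model.

```python
# Back-of-the-envelope sketch of MoE inference economics.
# Figures from the discussion: ~1T total parameters, ~32B active per query.
# Assumption (simplification): per-query compute scales with active parameters.

TOTAL_PARAMS = 1_000_000_000_000   # 1 trillion parameters in the full model
ACTIVE_PARAMS = 32_000_000_000     # ~32 billion activated for any single query

active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
compute_saving = TOTAL_PARAMS / ACTIVE_PARAMS

print(f"Active per query: {active_fraction:.1%}")   # -> Active per query: 3.2%
print(f"~{compute_saving:.0f}x less compute per query than a dense 1T model")
```

That ratio is where the "runs like a model roughly 30 times smaller" framing in the episode comes from.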
00:02:43.840 --> 00:02:46.439
Yeah. If you only need to power up 3.2% of

00:02:46.439 --> 00:02:49.599
the network, you're drastically cutting down

00:02:49.599 --> 00:02:52.319
the compute needed, the energy draw, the chip

00:02:52.319 --> 00:02:54.240
requirements for inference. Oh, absolutely. Think

00:02:54.240 --> 00:02:57.330
about traditional dense models. Every single

00:02:57.330 --> 00:03:00.389
parameter is involved every single time. That

00:03:00.389 --> 00:03:03.969
takes immense continuous power. But Kimi K2's

00:03:03.969 --> 00:03:06.330
MoE design, it means companies can scale up how

00:03:06.330 --> 00:03:09.669
many queries they handle without needing exponentially

00:03:09.669 --> 00:03:12.789
more hardware. It makes top-tier AI suddenly

00:03:12.789 --> 00:03:15.110
much more accessible, runnable on, let's say,

00:03:15.210 --> 00:03:17.770
less extreme hardware clusters. And beyond just

00:03:17.770 --> 00:03:19.930
the efficiency, the capabilities themselves sound

00:03:19.930 --> 00:03:22.590
pretty formidable. Context length is 256,000

00:03:22.590 --> 00:03:25.030
tokens. That's definitely big enough to handle

00:03:25.030 --> 00:03:27.509
whole code bases. Right. Or long legal docs,

00:03:27.650 --> 00:03:29.830
years of financial reports, maybe all in one

00:03:29.830 --> 00:03:31.800
go. For sure. And then there's this other metric,

00:03:31.900 --> 00:03:35.539
the agentic capability one, sequential tool calls.

00:03:35.759 --> 00:03:39.379
The sources say Kimi K2 can handle 200, maybe

00:03:39.379 --> 00:03:42.319
300 plus tool calls in a row without a human

00:03:42.319 --> 00:03:44.560
stepping in. That means you can give it a really

00:03:44.560 --> 00:03:47.199
complex multi-step plan, something like analyze

00:03:47.199 --> 00:03:49.819
sentiment for these five stocks, pull relevant

00:03:49.819 --> 00:03:52.659
news from the last quarter, draft a summary email,

00:03:52.879 --> 00:03:55.520
schedule the meeting, and it can just go and

00:03:55.520 --> 00:03:58.319
execute almost that entire workflow autonomously.

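The sequential tool-calling loop described here can be sketched in a few lines. Everything below is hypothetical: the `call_model` stub and the tool names stand in for whatever API a real agent harness around Kimi K2 would expose; this shows the shape of the loop, not Moonshot's implementation.

```python
# Toy sketch of a sequential tool-calling agent loop (hypothetical API).
# A real harness would call the model; here call_model is a stub that
# requests one tool and then finishes, just to show the control flow.

def call_model(history):
    """Stub for an LLM call: returns a tool request or a final answer."""
    if len(history) == 1:
        return {"type": "tool", "name": "fetch_news", "args": "five stocks"}
    return {"type": "final", "content": "summary drafted"}

TOOLS = {
    "fetch_news": lambda query: f"headlines about {query}",
    "draft_email": lambda body: f"email draft: {body}",
}

def run_agent(task, max_steps=300):
    """Feed each tool result back to the model until it answers or the budget runs out."""
    history = [{"role": "user", "content": task}]
    for step in range(max_steps):  # e.g. the 200-300 sequential calls discussed
        action = call_model(history)
        if action["type"] == "final":
            return action["content"], step
        result = TOOLS[action["name"]](action["args"])  # execute the requested tool
        history.append({"role": "tool", "content": result})
    raise RuntimeError("agent exceeded its step budget")

answer, tool_steps = run_agent("analyze sentiment for these five stocks")
print(answer)  # -> summary drafted
```

The step budget is the knob that corresponds to the "200, maybe 300 plus tool calls in a row" figure quoted in the episode.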
00:03:58.460 --> 00:04:00.900
Okay, that's impressive. 300 sequential steps

00:04:00.900 --> 00:04:04.620
without intervention. So given that capability,

00:04:04.919 --> 00:04:07.460
what's the biggest hurdle people actually face

00:04:07.460 --> 00:04:09.259
when they try to use this for really complex

00:04:09.259 --> 00:04:12.159
autonomous agent tasks? Where does it still struggle?

00:04:12.460 --> 00:04:14.460
Yeah, that's a good question, honestly. I still

00:04:14.460 --> 00:04:16.800
wrestle with prompt drift myself sometimes, keeping

00:04:16.800 --> 00:04:19.160
agents locked onto the original goal across really

00:04:19.160 --> 00:04:21.000
long workflows. It's kind of like, you know,

00:04:21.019 --> 00:04:23.600
giving someone a 10-step task. By step 7 or

00:04:23.600 --> 00:04:25.180
8, they might just slightly misunderstand the

00:04:25.180 --> 00:04:27.000
original intent because of tiny errors building

00:04:27.000 --> 00:04:29.319
up. Mm-hmm, that makes sense. A bit of cumulative

00:04:29.319 --> 00:04:31.339
error. Okay, that vulnerability is important

00:04:31.339 --> 00:04:34.519
context. Now let's shift to the proof. The numbers,

00:04:34.699 --> 00:04:37.540
the benchmarks that seem to back up this claim

00:04:37.540 --> 00:04:40.120
that open source is really competing at the top

00:04:40.120 --> 00:04:41.600
level now. Right. This is where the story gets

00:04:41.600 --> 00:04:44.060
really interesting. So for general reasoning,

00:04:44.160 --> 00:04:46.319
and especially in these agentic benchmarks testing

00:04:46.319 --> 00:04:48.759
how well it uses tools and follows multi-step

00:04:48.759 --> 00:04:51.600
plans, Kimi K2 is consistently beating several

00:04:51.600 --> 00:04:54.540
top closed source models. It just seems better

00:04:54.540 --> 00:04:56.420
at actually applying its knowledge through tools.

00:04:56.579 --> 00:04:59.519
And coding. The results there sound pretty clear

00:04:59.519 --> 00:05:01.839
cut. The sources mentioned competitive programming

00:05:01.839 --> 00:05:04.360
challenges. And Kimi K2 apparently got the highest

00:05:04.360 --> 00:05:08.459
score, decisively beating Claude 4.5. That points

00:05:08.459 --> 00:05:10.800
to really strong logic and code generation. Yeah,

00:05:10.819 --> 00:05:13.839
really strong. And even in high-level academic

00:05:13.839 --> 00:05:17.800
stuff. They use this tough benchmark, GPQA Diamond,

00:05:18.019 --> 00:05:20.740
graduate -level science questions. Kimi K2 is

00:05:20.740 --> 00:05:23.100
a beast there, too. Beats Claude 4.5, again,

00:05:23.180 --> 00:05:25.000
pretty handily. It's only slightly behind Grok

00:05:25.000 --> 00:05:28.300
4, scoring like 87.5%. Really impressive. But

00:05:28.300 --> 00:05:30.259
the big headline, the thing that really feels

00:05:30.259 --> 00:05:33.319
like a turning point, is that independent global

00:05:33.319 --> 00:05:37.680
leaderboard ranking. For so long... The top spots,

00:05:37.899 --> 00:05:40.420
maybe the top five, were all proprietary models,

00:05:40.639 --> 00:05:43.819
closed systems. Now you look at the list, and

00:05:43.819 --> 00:05:46.040
okay, GPT-5 High is number one, yes, proprietary.

00:05:46.319 --> 00:05:48.779
But right there at number two, globally, it's

00:05:48.779 --> 00:05:51.699
Kimi K2 Thinking. An open source model, sitting

00:05:51.699 --> 00:05:54.180
higher than Grok 4, higher than Claude 4.5,

00:05:54.240 --> 00:05:57.560
higher than Gemini 2.5 Pro. It's huge validation

00:05:57.560 --> 00:06:00.420
for this whole MoE approach. And, you know, if

00:06:00.420 --> 00:06:03.420
performing at that level wasn't enough, we absolutely

00:06:03.420 --> 00:06:05.500
have to talk about the cost. Right. Because Kimi

00:06:05.500 --> 00:06:07.660
K2 is apparently way cheaper than the other top

00:06:07.660 --> 00:06:09.800
models. The numbers suggest it's roughly three

00:06:09.800 --> 00:06:12.600
times cheaper than GPT-5. Yeah, around 3x cheaper

00:06:12.600 --> 00:06:14.939
than GPT-5. But the gap gets even wider compared

00:06:14.939 --> 00:06:16.620
to some others. We're talking like a six-fold

00:06:16.620 --> 00:06:19.899
cost advantage over Claude 4.5 and Grok 4. Six

00:06:19.899 --> 00:06:21.939
times cheaper. And remember, you can potentially

00:06:21.939 --> 00:06:23.699
run it on less expensive hardware because of

00:06:23.699 --> 00:06:26.180
that sparsity. So the TCO, the total cost of

00:06:26.180 --> 00:06:28.819
ownership for a business, just plummets. So that

00:06:28.819 --> 00:06:31.079
leads to the practical question. Does that...

00:06:31.449 --> 00:06:35.170
huge cost saving, maybe six times cheaper. Does

00:06:35.170 --> 00:06:38.149
that outweigh the perhaps very small performance

00:06:38.149 --> 00:06:41.129
difference compared to the absolute number one

00:06:41.129 --> 00:06:45.529
GPT-5 for most everyday business uses? I think

00:06:45.529 --> 00:06:48.350
for most professional use cases, yes, the massive

00:06:48.350 --> 00:06:50.829
cost savings are an easy choice. The value proposition

00:06:50.829 --> 00:06:53.889
is just incredibly strong. Okay. This leads us

00:06:53.889 --> 00:06:56.170
nicely into the more philosophical side, which

00:06:56.170 --> 00:06:58.050
might actually be the most critical part of this

00:06:58.050 --> 00:07:00.769
whole story. Moonshot AI didn't just build this

00:07:00.769 --> 00:07:03.269
incredibly powerful model. They decided to release

00:07:03.269 --> 00:07:05.589
it with open weights. We should probably clarify

00:07:05.589 --> 00:07:07.509
what that means exactly. Right. Open weights

00:07:07.509 --> 00:07:10.480
means the core of the model, the actual trained

00:07:10.480 --> 00:07:12.860
parameters, that one trillion number is freely

00:07:12.860 --> 00:07:14.779
downloadable. Anyone can grab it and run it.

00:07:14.839 --> 00:07:17.000
It's not quite fully open source, like, say,

00:07:17.100 --> 00:07:19.139
Linux, where maybe all the training data and

00:07:19.139 --> 00:07:21.399
code are also open. But the model's intelligence,

00:07:21.639 --> 00:07:24.000
its brain, is out there. And we should probably

00:07:24.000 --> 00:07:25.899
stress who this is really for. This isn't something

00:07:25.899 --> 00:07:27.899
you just, you know, download to your laptop on

00:07:27.899 --> 00:07:30.139
a whim. That model file is apparently around

00:07:30.139 --> 00:07:33.920
600 gigabytes. You need a serious GPU cluster.

00:07:33.959 --> 00:07:36.220
Right. We're talking potentially hundreds of

00:07:36.220 --> 00:07:38.420
thousands of dollars in hardware to run it well.

00:07:38.579 --> 00:07:41.980
Exactly. It's aimed squarely at companies, serious

00:07:41.980 --> 00:07:44.540
research labs, developers building significant

00:07:44.540 --> 00:07:47.459
applications. And it's a very strategic move,

00:07:47.540 --> 00:07:50.480
really. It's positioned as a direct answer to

00:07:50.480 --> 00:07:52.519
what some people call the closed source problem

00:07:52.519 --> 00:07:56.540
that's kind of dominated high-end AI until now.

00:07:56.680 --> 00:07:59.480
And that problem? That basically means being

00:07:59.480 --> 00:08:02.600
totally reliant on a few big, mostly American

00:08:02.600 --> 00:08:06.019
AI labs, which creates some real strategic vulnerabilities

00:08:06.019 --> 00:08:08.379
for businesses in other countries. Absolutely.

00:08:08.439 --> 00:08:11.000
You've got serious vendor lock -in. Your entire

00:08:11.000 --> 00:08:13.980
AI capability could depend on one company's pricing

00:08:13.980 --> 00:08:16.699
whims or sudden policy shifts. And critically,

00:08:16.920 --> 00:08:19.600
the data privacy issue is huge. To use those

00:08:19.600 --> 00:08:21.480
closed models, you have to constantly send your

00:08:21.480 --> 00:08:23.720
sensitive proprietary data out to their servers

00:08:23.720 --> 00:08:26.259
for processing. There's often very little transparency

00:08:26.259 --> 00:08:29.170
into how it's used or secured. It's complete dependence.

00:08:29.569 --> 00:08:31.550
So the open source approach, particularly with

00:08:31.550 --> 00:08:34.330
a powerful model like Kimi K2 available with

00:08:34.330 --> 00:08:37.250
open weights, it just flips that whole dynamic.

00:08:37.269 --> 00:08:39.470
It offers data sovereignty. Total data control.

00:08:39.610 --> 00:08:42.070
Because you download the model, you run it on

00:08:42.070 --> 00:08:44.370
your hardware inside your secure perimeter, your

00:08:44.370 --> 00:08:46.690
confidential health data, your financial projections.

00:08:47.419 --> 00:08:50.340
They never leave your infrastructure. Plus, you

00:08:50.340 --> 00:08:53.179
get full control over customization. You can

00:08:53.179 --> 00:08:55.460
fine tune it deeply for your specific industry

00:08:55.460 --> 00:08:58.639
or tasks. And crucially, you gain independence

00:08:58.639 --> 00:09:01.860
from external pricing, external rules, external

00:09:01.860 --> 00:09:04.279
regulators. It must have taken a massive commitment,

00:09:04.320 --> 00:09:07.159
though, for Moonshot AI to spend what must have

00:09:07.159 --> 00:09:09.980
been astronomical sums developing something this

00:09:09.980 --> 00:09:12.820
close to GPT-5 level. And then essentially just...

00:09:13.370 --> 00:09:15.490
give the engine away for free. It really is like

00:09:15.490 --> 00:09:17.750
a gift to the global research and development

00:09:17.750 --> 00:09:20.250
community. It just turbocharges innovation everywhere

00:09:20.250 --> 00:09:22.710
because now everyone can access and build on

00:09:22.710 --> 00:09:24.509
a truly state-of-the-art foundation model.

00:09:24.830 --> 00:09:28.169
Whoa. Imagine scaling to a billion queries without

00:09:28.169 --> 00:09:31.330
reliance on external companies. That's just pure

00:09:31.330 --> 00:09:34.269
innovation fuel for countless startups, for corporate

00:09:34.269 --> 00:09:36.919
R&D labs everywhere. And, you know, it puts

00:09:36.919 --> 00:09:39.580
immediate, intense pressure back on the closed

00:09:39.580 --> 00:09:43.159
labs. They now have to really justify those high

00:09:43.159 --> 00:09:45.799
subscription fees when something this good is

00:09:45.799 --> 00:09:48.320
available for free, provided you have the hardware.

00:09:48.519 --> 00:09:50.460
So putting aside the competitive angle for a

00:09:50.460 --> 00:09:52.860
second, what do you think is the single biggest

00:09:52.860 --> 00:09:55.919
benefit for global research when a foundational

00:09:55.919 --> 00:09:59.990
model this powerful is made open like this? Accelerated

00:09:59.990 --> 00:10:02.169
research happens because everyone can now build

00:10:02.169 --> 00:10:03.690
directly on a state-of-the-art foundation.

00:10:03.929 --> 00:10:06.590
It just raises the baseline for the entire field

00:10:06.590 --> 00:10:09.009
almost overnight. Okay, so let's get practical.

00:10:09.529 --> 00:10:11.889
How can someone listening right now actually

00:10:11.889 --> 00:10:14.049
access this power? Sounds like there are basically

00:10:14.049 --> 00:10:16.649
two main ways. Option one, the simplest path

00:10:16.649 --> 00:10:18.950
is just using their online interface, right?

00:10:20.279 --> 00:10:22.480
Yeah, but the key there is you have to remember

00:10:22.480 --> 00:10:25.139
to enable thinking mode in the settings to actually

00:10:25.139 --> 00:10:28.320
access the Kimi K2 power. And that online version

00:10:28.320 --> 00:10:30.679
already has some pretty capable agent modes built

00:10:30.679 --> 00:10:32.960
in, like OK Computer for coding, which sounds

00:10:32.960 --> 00:10:35.419
quite autonomous, and Researcher for digging

00:10:35.419 --> 00:10:37.519
through and summarizing data. And then option

00:10:37.519 --> 00:10:40.700
two is for the, let's say, power users, the enterprises,

00:10:41.000 --> 00:10:43.879
the researchers needing maximum control. That's

00:10:43.879 --> 00:10:46.220
the local on -premise deployment, downloading

00:10:46.220 --> 00:10:48.379
the weights from Hugging Face. That's the road

00:10:48.379 --> 00:10:50.200
if you're dealing with highly sensitive data

00:10:50.200 --> 00:10:53.289
like health or finance and need absolute

00:10:53.289 --> 00:10:56.889
100% control. Exactly. So let's kind of sum up where

00:10:56.889 --> 00:11:00.149
Kimi K2 really shines. Its superpowers, if you

00:11:00.149 --> 00:11:03.169
will. First, definitely coding and software development.

00:11:03.389 --> 00:11:06.070
The reasoning ability for code seems top -notch.

00:11:06.129 --> 00:11:09.429
Second, building AI agents. That high number

00:11:09.429 --> 00:11:12.210
of sequential tool calls makes it ideal for complex

00:11:12.210 --> 00:11:15.419
autonomous tasks. Third, probably high-level

00:11:15.419 --> 00:11:18.200
scientific and financial research. It seems exceptionally

00:11:18.200 --> 00:11:20.379
good at pulling together and reasoning over complex

00:11:20.379 --> 00:11:22.779
technical information. And of course, anytime

00:11:22.779 --> 00:11:25.399
cost is a major factor or when data privacy is

00:11:25.399 --> 00:11:27.440
absolutely mandatory. But we should be fair and

00:11:27.440 --> 00:11:29.919
note where the alternatives might still have

00:11:29.919 --> 00:11:32.279
a slight advantage. GPT-5, for instance. Yeah.

00:11:32.360 --> 00:11:34.379
The sources still suggest it might be a bit better

00:11:34.379 --> 00:11:36.740
at extremely complex, maybe more creative or

00:11:36.740 --> 00:11:38.720
abstract tasks. Right. They mentioned things

00:11:38.720 --> 00:11:41.240
like the beehive simulation example, scenarios

00:11:41.240 --> 00:11:43.460
needing really complex, maybe edge-case physics

00:11:43.500 --> 00:11:46.659
understanding. Kimi K2 apparently struggled a

00:11:46.659 --> 00:11:49.360
bit more there. So it suggests that while the

00:11:49.360 --> 00:11:52.139
general knowledge is vast, maybe the absolute

00:11:52.139 --> 00:11:55.360
peak of abstract complex reasoning isn't quite

00:11:55.360 --> 00:11:58.909
at the GPT-5 level yet. Minor gaps. And context

00:11:58.909 --> 00:12:01.309
window length is another one. If you absolutely

00:12:01.309 --> 00:12:04.429
need the biggest possible window, say you're

00:12:04.429 --> 00:12:06.870
feeding it a massive 900-page book or dozens

00:12:06.870 --> 00:12:11.070
of dense PDFs at once, Gemini 2.5 Pro's 1 million

00:12:11.070 --> 00:12:13.490
token window is still the leader there, right?

00:12:13.629 --> 00:12:15.789
Correct. So, yeah, the tradeoff seems pretty

00:12:15.789 --> 00:12:18.730
clear. You might accept a tiny, maybe negligible

00:12:18.730 --> 00:12:21.389
hit in absolute top-end, edge-case quality. But

00:12:21.389 --> 00:12:23.350
in return, you get enormous gains in openness,

00:12:23.629 --> 00:12:25.789
total data privacy if you run it locally and

00:12:25.789 --> 00:12:27.889
that, you know, potentially six-fold cost efficiency.

00:12:28.129 --> 00:12:30.389
The final question on readiness. Considering

00:12:30.389 --> 00:12:32.409
those known weaknesses like edge-case physics,

00:12:32.509 --> 00:12:34.870
some minor layout bugs mentioned, is Kimi K2

00:12:34.870 --> 00:12:36.769
truly ready for mission-critical, high-stakes

00:12:36.769 --> 00:12:39.590
business use today? I'd say yes, especially for

00:12:39.590 --> 00:12:42.809
privacy critical work. The ability to have complete

00:12:42.809 --> 00:12:45.610
control over your data often outweighs those

00:12:45.610 --> 00:12:49.269
known minor bugs. That independence is frequently

00:12:49.269 --> 00:12:52.220
the deciding factor. Okay, let's try and bring

00:12:52.220 --> 00:12:54.480
this all together then, the core thesis of our

00:12:54.480 --> 00:12:57.340
deep dive today. It seems to me that Kimi K2's

00:12:57.340 --> 00:13:01.000
success really proves that open source AI isn't

00:13:01.000 --> 00:13:03.100
just playing catch -up anymore. It's actually

00:13:03.100 --> 00:13:06.080
leading in some really important areas. And it's

00:13:06.080 --> 00:13:09.100
firmly established itself as a peer, performance

00:13:09.100 --> 00:13:12.179
-wise, to the big proprietary models. It really

00:13:12.179 --> 00:13:14.559
feels like a historic shift. Think back just

00:13:14.559 --> 00:13:17.419
a few months. The top five models on those leaderboards,

00:13:17.600 --> 00:13:20.399
all closed systems. Innovation locked behind

00:13:20.490 --> 00:13:23.289
paywalls. Now, the number two spot globally is

00:13:23.289 --> 00:13:25.129
held by an open source model that puts incredible

00:13:25.129 --> 00:13:27.429
pressure on big tech. They have to innovate faster,

00:13:27.590 --> 00:13:29.809
sure, but they also probably have to bring prices

00:13:29.809 --> 00:13:32.389
down for everyone needing AI. So for you, the

00:13:32.389 --> 00:13:34.629
listener, the takeaway is that you now have a

00:13:34.629 --> 00:13:37.850
real meaningful choice. It's not just about convenience

00:13:37.850 --> 00:13:39.850
versus nothing. It's convenience versus this

00:13:39.850 --> 00:13:42.470
potent combination of power, privacy, and cost

00:13:42.470 --> 00:13:45.220
control if you go the open-weight route. Yeah. So

00:13:45.220 --> 00:13:47.000
maybe here's a final thought to leave you with

00:13:47.000 --> 00:13:50.100
something to mull over. If a top tier open source

00:13:50.100 --> 00:13:53.159
model can basically match performance in most

00:13:53.159 --> 00:13:56.259
areas while potentially costing six times less

00:13:56.259 --> 00:13:59.679
to operate, what really is the long term sustainable

00:13:59.679 --> 00:14:02.519
value proposition for the closed models? How

00:14:02.519 --> 00:14:04.419
are they going to justify that significant price

00:14:04.419 --> 00:14:06.620
premium going forward? Definitely something to

00:14:06.620 --> 00:14:08.279
think about. We encourage you to check out the

00:14:08.279 --> 00:14:10.179
sources, look at the leaderboards and consider

00:14:10.179 --> 00:14:13.259
what this shift means for your own work or research.

00:14:13.799 --> 00:14:15.179
Thanks for joining us on this deep dive.
