WEBVTT

00:00:00.000 --> 00:00:03.899
Imagine this. You take a sentence and about 60

00:00:03.899 --> 00:00:06.540
seconds later, you have a full website, not just

00:00:06.540 --> 00:00:10.400
text, but like playable video, user logins, the

00:00:10.400 --> 00:00:12.679
works. Yeah. Or creating, you know, cinematic

00:00:12.679 --> 00:00:14.660
video clips just like that. And then imagine

00:00:14.660 --> 00:00:18.120
getting all of that for free. That's really the

00:00:18.120 --> 00:00:20.460
heart of it. That's the quiet revolution happening

00:00:20.460 --> 00:00:23.899
in AI right now. Welcome to the Deep Dive. We're

00:00:23.899 --> 00:00:27.059
here to unpack complex topics and really pull

00:00:27.059 --> 00:00:28.980
out the key insights. And today, yeah, we're

00:00:28.980 --> 00:00:31.719
diving into what feels like a huge shift in artificial

00:00:31.719 --> 00:00:34.259
intelligence. For years, it felt like the best

00:00:34.259 --> 00:00:37.439
AI tools were locked away, right? Expensive subscriptions,

00:00:37.939 --> 00:00:40.619
complicated access. Totally, like an exclusive

00:00:40.619 --> 00:00:44.079
club. But now, especially with some big moves

00:00:44.079 --> 00:00:46.719
from Chinese tech firms, that whole model is

00:00:46.719 --> 00:00:48.880
getting flipped on its head. Flipped how? They're

00:00:48.880 --> 00:00:51.359
basically giving away truly world-class AI.

00:00:52.030 --> 00:00:54.810
For everyone. OK, so our mission today is to

00:00:54.810 --> 00:00:57.070
really get under the hood of this change. We

00:00:57.070 --> 00:01:00.270
need to explore this great AI divide, these like

00:01:00.270 --> 00:01:02.390
fundamentally different philosophies. Right.

00:01:02.469 --> 00:01:04.590
And we'll look at three specific examples that

00:01:04.590 --> 00:01:07.329
are already making waves. MiniMax-M1 for websites,

00:01:07.650 --> 00:01:13.250
Seedance 1.0 for video, and Kimi-Dev-72B for code

00:01:13.250 --> 00:01:16.319
fixing. A superhuman code doctor, I think you

00:01:16.319 --> 00:01:18.760
called it. Yeah, something like that. It's about

00:01:18.760 --> 00:01:22.560
showing how these tools give, well, superpowers

00:01:22.560 --> 00:01:25.379
to individuals, to small businesses. Leveling

00:01:25.379 --> 00:01:28.000
the playing field. Exactly. In ways that

00:01:28.000 --> 00:01:30.200
honestly seemed impossible just a little while

00:01:30.200 --> 00:01:32.680
ago. All right, let's unpack that core idea first.

00:01:32.900 --> 00:01:36.200
Why? Why give away something that costs, presumably,

00:01:36.200 --> 00:01:38.180
hundreds of millions to develop? It just seems

00:01:38.180 --> 00:01:40.840
to fly in the face of, you know, normal business

00:01:40.840 --> 00:01:43.120
strategy. It does seem counterintuitive from

00:01:43.120 --> 00:01:45.420
a Western perspective. Yeah. That model usually

00:01:45.420 --> 00:01:48.480
focuses on high value users, subscription fees,

00:01:48.719 --> 00:01:50.900
maximizing revenue from each customer willing

00:01:50.900 --> 00:01:53.099
to pay top dollar. Like you said, the exclusive

00:01:53.099 --> 00:01:55.819
club. Exactly. But the Eastern model, especially

00:01:55.819 --> 00:01:58.810
these big Chinese tech companies. It's playing

00:01:58.810 --> 00:02:01.790
a different game entirely. Their goal is domination

00:02:01.790 --> 00:02:04.209
through democratization. Domination through democratization.

00:02:04.390 --> 00:02:06.709
Okay. So they make these advanced tools free.

00:02:06.890 --> 00:02:10.210
Why? To get everyone using them. Massive adoption.

00:02:10.590 --> 00:02:12.770
Huge market penetration. And that feeds back

00:02:12.770 --> 00:02:15.210
into their systems. Precisely. It creates this

00:02:15.210 --> 00:02:17.650
enormous data feedback loop. They learn faster.

00:02:17.770 --> 00:02:19.530
Their models improve quicker because they have

00:02:19.530 --> 00:02:22.349
so much more usage data coming in. The end game

00:02:22.349 --> 00:02:25.289
isn't immediate profit per user. It's ecosystem

00:02:25.289 --> 00:02:28.770
control. Yes. Become the foundation, the free

00:02:28.770 --> 00:02:32.430
default, and often the best choice. Then you essentially

00:02:32.430 --> 00:02:35.710
own the ecosystem. So one sells access, the other

00:02:35.710 --> 00:02:38.509
builds market share by giving power away. That's

00:02:38.509 --> 00:02:40.310
a great way to put it. Control through distribution,

00:02:40.310 --> 00:02:42.909
essentially. Okay, that makes sense. And it really

00:02:42.909 --> 00:02:45.229
sets the stage for these tools. It's not theoretical

00:02:45.229 --> 00:02:48.379
anymore. Let's talk MiniMax-M1, the instant website

00:02:48.379 --> 00:02:50.560
architect. Yeah, this one is a perfect example.

00:02:50.780 --> 00:02:53.259
I read this example that just blew my mind. Someone

00:02:53.259 --> 00:02:56.460
prompted it: "Create a clone of the Netflix website

00:02:56.460 --> 00:02:59.240
complete with playable video trailers." And it

00:02:59.240 --> 00:03:01.300
wasn't just a mock-up. Nope. Fully interactive.

00:03:02.000 --> 00:03:04.340
Navigation, profiles, even the video trailers

00:03:04.340 --> 00:03:06.159
playing when you hover over them. Just like the

00:03:06.159 --> 00:03:08.379
real site. In about a minute. And this isn't

00:03:08.379 --> 00:03:10.500
just a slightly smarter chatbot spitting out

00:03:10.500 --> 00:03:13.139
HTML. This comes from what they call an open-weight,

00:03:13.139 --> 00:03:16.020
large-scale, hybrid reasoning model.

00:03:16.219 --> 00:03:19.639
Okay. Open weight, hybrid reasoning. Break that

00:03:19.639 --> 00:03:21.319
down a bit. What does that actually mean for

00:03:21.319 --> 00:03:24.560
someone using it? Well, open weight means the

00:03:24.560 --> 00:03:27.659
model's parameters, sort of its brain, are publicly

00:03:27.659 --> 00:03:30.180
available. Anyone can use and build on it. Yeah.

00:03:30.280 --> 00:03:33.280
Under a very permissive license, Apache 2.0,

00:03:33.360 --> 00:03:35.560
which is key for commercial use. Okay. So it's

00:03:35.560 --> 00:03:39.419
truly open. Right. And hybrid reasoning. Think

00:03:39.419 --> 00:03:40.960
of it like having different specialized tools.

00:03:41.139 --> 00:03:43.900
It combines different AI techniques. Plus, it

00:03:43.900 --> 00:03:46.360
has a massive context window, up to a million

00:03:46.360 --> 00:03:49.259
tokens. A million tokens. It's like processing

00:03:49.259 --> 00:03:51.340
multiple novels worth of information at once.

00:03:51.439 --> 00:03:53.219
Right. Pretty much. A token is just a small piece

00:03:53.219 --> 00:03:55.409
of text or code. So it can understand really

00:03:55.409 --> 00:03:57.830
complex requests like an entire website structure

00:03:57.830 --> 00:04:00.349
and content all in one go. And it's efficient,

00:04:00.449 --> 00:04:02.469
too, you mentioned. Extremely. It has something

00:04:02.469 --> 00:04:05.150
like 450 billion parameters total, which sounds

00:04:05.150 --> 00:04:08.270
huge. But for any given task, it only activates

00:04:08.270 --> 00:04:10.969
the relevant ones, maybe 45 billion. Like calling

00:04:10.969 --> 00:04:13.129
in only the specialists you need for a specific

00:04:13.129 --> 00:04:16.069
job. Exactly. Saves a ton of computing power.

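NOTE
Editor's sketch, not from the episode: a toy illustration of activating only a few "specialists" per request, as described in the conversation.
The top-k routing scheme, the function names, and the scores are assumptions for illustration; the episode does not say how the model selects parameters.
The 450 and 45 billion figures are the speakers' rough numbers.
```python
def select_experts(scores, k):
    """Return the indices of the k highest-scoring experts, in index order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])
def active_fraction(total_params, active_params):
    """Share of parameters used for a single request."""
    return active_params / total_params
# Hypothetical router scores for four experts; only the top two would run.
chosen = select_experts([0.1, 0.9, 0.3, 0.7], k=2)
share = active_fraction(450e9, 45e9)
print(chosen, f"{share:.0%} of parameters active")
```
Under these assumptions about a tenth of the model does the work on any one request, which is where the compute savings would come from.
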
00:04:16.230 --> 00:04:18.790
Makes it incredibly fast. So that efficiency

00:04:18.790 --> 00:04:21.180
and the open model. That's how they can offer

00:04:21.180 --> 00:04:23.720
it. How do people actually start using MiniMax-M1

00:04:23.720 --> 00:04:27.019
today? It's pretty simple. Go to their website,

00:04:27.160 --> 00:04:29.000
sign up for a free account. They usually give

00:04:29.000 --> 00:04:30.939
you starting credits, maybe a thousand or so.

00:04:31.079 --> 00:04:34.000
Okay. I'd say start simple. Like create a portfolio

00:04:34.000 --> 00:04:37.120
website for a photographer, see what it does.

00:04:37.259 --> 00:04:39.660
Then you can iterate, maybe ask for changes.

00:04:39.899 --> 00:04:42.040
And then move to more complex stuff. Yeah. Then

00:04:42.040 --> 00:04:43.939
try something like that "interactive dashboard

00:04:43.939 --> 00:04:47.000
for project deadlines" example. Just build up

00:04:47.000 --> 00:04:49.439
complexity. What's the biggest hurdle people

00:04:49.439 --> 00:04:53.339
face when they first try it? Honestly, just believing

00:04:53.339 --> 00:04:55.860
it. Overcoming that initial thought of, this

00:04:55.860 --> 00:04:58.279
sounds way too good to be true. Huh, right. The

00:04:58.279 --> 00:05:00.600
psychological barrier. Because we're so used

00:05:00.600 --> 00:05:03.579
to powerful tech costing a lot. Exactly. There's

00:05:03.579 --> 00:05:06.920
this ingrained skepticism about free when it

00:05:06.920 --> 00:05:09.699
comes to high-end AI. It's about shifting your

00:05:09.699 --> 00:05:12.779
mindset from scarcity to, well, abundance in

00:05:12.779 --> 00:05:15.879
this space. That's a profound shift. Okay, let's

00:05:15.879 --> 00:05:20.329
talk video. For years, AI video felt clunky,

00:05:20.449 --> 00:05:23.889
short clips, weird motion. Yeah, definitely hit

00:05:23.889 --> 00:05:26.550
or miss, often more miss. But then ByteDance,

00:05:26.589 --> 00:05:30.009
the TikTok people, quietly dropped Seedance 1.0,

00:05:30.069 --> 00:05:32.689
and now it's apparently topping the leaderboards.

00:05:32.910 --> 00:05:36.069
It really is. The speed and quality are just

00:05:36.069 --> 00:05:39.610
remarkable. You can get a really nice 5-second

00:05:39.610 --> 00:05:42.910
1080p video clip in about 40 seconds on just

00:05:42.910 --> 00:05:45.709
one regular graphics card. That's fast. But what

00:05:45.709 --> 00:05:48.069
makes it stand out beyond speed? The cinematic

00:05:48.069 --> 00:05:50.810
understanding. It can generate multi-shot sequences

00:05:50.810 --> 00:05:53.250
from a single prompt. You know, like start with

00:05:53.250 --> 00:05:55.350
a wide shot, then cut to a close-up, maybe add

00:05:55.350 --> 00:05:57.389
a tracking shot. It gets it. How does it learn

00:05:57.389 --> 00:05:59.269
that? That feels like a director's skill. It's

00:05:59.269 --> 00:06:01.449
down to the training. They use this multi-dimensional

00:06:01.449 --> 00:06:03.370
reward system, so it wasn't just learning "follow

00:06:03.370 --> 00:06:05.290
the prompt" or "make it look pretty." It learned

00:06:05.290 --> 00:06:07.829
all those things together. Visuals, motion, following

00:06:07.829 --> 00:06:10.509
instructions, even scene composition. All at

00:06:10.509 --> 00:06:13.000
once. That seems to be the key difference. It

00:06:13.000 --> 00:06:15.360
understands how shots should connect, how motion

00:06:15.360 --> 00:06:17.300
should look natural within a scene. It's not

00:06:17.300 --> 00:06:19.800
just stitching images. Okay, so for someone wanting

00:06:19.800 --> 00:06:23.339
to use this, maybe a small business making ads

00:06:23.339 --> 00:06:26.240
or a creator, how do they access Seedance? You'd

00:06:26.240 --> 00:06:28.019
usually go through a platform that hosts it,

00:06:28.060 --> 00:06:30.220
like WaveSpeedAI is one example. And how would

00:06:30.220 --> 00:06:32.860
you start? Again, start simple. Text to video.

00:06:33.019 --> 00:06:35.040
A sharp knife cutting through a copper gear.

00:06:35.199 --> 00:06:37.860
Something clear. Okay. Then maybe try image to

00:06:37.860 --> 00:06:45.720
video. Upload a picture of, say, Gotcha. And

00:06:45.720 --> 00:06:47.699
the multi-shot? Yeah. Then describe a sequence.

00:06:48.920 --> 00:06:52.860
Scene one. Detective enters a dark room. Scene

00:06:52.860 --> 00:06:55.899
two. Close-up on clues on a table. Scene three.

00:06:56.139 --> 00:06:58.600
Detective picks up an object. Looks thoughtful.

00:06:58.759 --> 00:07:00.959
It understands that structure. What is it fundamentally

00:07:00.959 --> 00:07:03.439
doing better to make the video look so real?

00:07:03.680 --> 00:07:06.420
It seems to grasp motion and physics better.

00:07:06.829 --> 00:07:08.910
It's not just generating plausible frames. It's

00:07:08.910 --> 00:07:10.889
generating plausible movement between frames.

00:07:11.069 --> 00:07:13.670
That makes sense. Physics is hard. Okay, on to

00:07:13.670 --> 00:07:17.310
the third one. Kimi-Dev-72B, the superhuman code

00:07:17.310 --> 00:07:20.110
doctor. This hits close to home for anyone who

00:07:20.110 --> 00:07:23.550
codes. Oh, yeah. Debugging is the worst sometimes.

00:07:23.810 --> 00:07:27.029
Tell me about it. I still wrestle with prompt

00:07:27.029 --> 00:07:29.689
drift myself when I'm debugging my own small

00:07:29.689 --> 00:07:32.790
projects, trying one thing, then another, getting

00:07:32.790 --> 00:07:36.170
lost. The idea of an AI cutting through that.

00:07:36.740 --> 00:07:40.800
Huge. And Kimi-Dev from Moonshot AI seems to do

00:07:40.800 --> 00:07:43.879
just that. It's free, and it's outperforming

00:07:43.879 --> 00:07:47.079
paid tools on standard tests. How good is it?

00:07:47.139 --> 00:07:48.399
Like, what are the numbers? Well, there's this

00:07:48.399 --> 00:07:51.439
benchmark called SWE-bench. It uses real-world

00:07:51.439 --> 00:07:54.420
broken code from GitHub projects. The previous

00:07:54.420 --> 00:07:57.699
best free models scored around 40%. Yeah. Top

00:07:57.699 --> 00:08:00.800
paid ones were maybe 50%, 55%. Okay. And Kimi-Dev?

00:08:00.920 --> 00:08:03.339
Scored 60.4%. That's not just a little better.

00:08:03.420 --> 00:08:05.980
That's a massive jump. Like a 50% improvement

00:08:05.980 --> 00:08:08.480
over the prior best free model. Wow. Okay. How

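NOTE
Editor's check of the arithmetic in the conversation, using the speakers' approximate figures (about 40% for the prior best free model, 60.4% here).
The function name is hypothetical.
```python
def relative_improvement(new_score, old_score):
    """Relative gain of new_score over old_score, as a fraction."""
    return (new_score - old_score) / old_score
gain = relative_improvement(60.4, 40.0)
points = 60.4 - 40.0
print(f"{points:.1f} percentage points, {gain:.0%} relative improvement")
```
The gap is about 20 percentage points in absolute terms; the "50% improvement" quoted in the conversation is the relative figure.
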
00:08:08.480 --> 00:08:10.279
does it actually work? What's the magic? They

00:08:10.279 --> 00:08:12.259
call it a two-brain approach. Yeah. Kind of

00:08:12.259 --> 00:08:14.680
cool. One brain is the bug fixer. It analyzes

00:08:14.680 --> 00:08:17.060
the code, pinpoints the problem with real precision.

00:08:17.160 --> 00:08:18.779
It finds the needle in the haystack. Pretty much.

00:08:18.899 --> 00:08:21.399
Then the second brain is a test writer. It automatically

00:08:21.399 --> 00:08:23.040
writes new tests to make sure the fix actually

00:08:23.040 --> 00:08:24.920
worked and didn't break something else. So it

00:08:24.920 --> 00:08:27.199
fixes the bug and writes the unit test to prove

00:08:27.199 --> 00:08:29.939
it. Exactly. Built-in quality check. That's

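NOTE
Editor's sketch, not from the episode: one way the fix-then-verify idea could be wired together, with one role proposing a patch and the other supplying a check.
Every name here is hypothetical, and the toy fixer and test writer only stand in for the model's two roles; the episode does not describe the real interfaces.
```python
from typing import Callable, Optional
def fix_with_verification(
    code: str,
    bug_report: str,
    fixer: Callable[[str, str], str],
    test_writer: Callable[[str], Callable[[str], bool]],
    max_attempts: int = 3,
) -> Optional[str]:
    """Return the first candidate patch that passes the generated check, else None."""
    check = test_writer(bug_report)
    for _ in range(max_attempts):
        candidate = fixer(code, bug_report)
        if check(candidate):
            return candidate
    return None
# Toy stand-ins for the two roles (hypothetical, for illustration only).
def toy_fixer(code: str, bug_report: str) -> str:
    return code.replace("a - b", "a + b")
def toy_test_writer(bug_report: str) -> Callable[[str], bool]:
    def check(candidate: str) -> bool:
        namespace: dict = {}
        exec(candidate, namespace)  # run the candidate in its own namespace
        return namespace["add"](2, 3) == 5
    return check
buggy = "def add(a, b):\n    return a - b\n"
patched = fix_with_verification(buggy, "add() subtracts", toy_fixer, toy_test_writer)
print(patched)
```
A patch is only returned once the check passes, and a human would still review both the patch and the check before merging.
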
00:08:29.939 --> 00:08:32.440
really clever. How do developers actually use

00:08:32.440 --> 00:08:34.419
this? Can you just point it at your messy code?

00:08:34.720 --> 00:08:36.919
You can often run it locally, which is a big

00:08:36.919 --> 00:08:38.860
plus for privacy. You'd find it on places like

00:08:38.860 --> 00:08:42.159
GitHub or Hugging Face. Right, keep your proprietary

00:08:42.159 --> 00:08:44.799
code on your machine. Yeah, you give it the bug

00:08:44.799 --> 00:08:47.840
report, the code base, it'll propose a fix, and

00:08:47.840 --> 00:08:50.279
the tests. Definitely start with simpler bugs

00:08:50.279 --> 00:08:52.559
first. And you still need to review it, obviously.

00:08:52.899 --> 00:08:54.539
Well, absolutely. It's an incredibly powerful

00:08:54.539 --> 00:08:57.440
assistant, but human oversight is still crucial.

00:08:57.870 --> 00:09:00.629
You review the fix, review the tests, then merge

00:09:00.629 --> 00:09:03.029
it. So Kimi-Dev isn't replacing developers tomorrow.

00:09:03.289 --> 00:09:05.649
No, no, it's augmenting them, making them way

00:09:05.649 --> 00:09:07.409
more efficient at one of the most time-consuming

00:09:07.409 --> 00:09:10.129
parts of the job. Okay, let's pull back

00:09:10.129 --> 00:09:12.049
a bit. We've looked at MiniMax for websites,

00:09:12.509 --> 00:09:16.210
Seedance for video, Kimi-Dev for code. What's the

00:09:16.210 --> 00:09:18.769
big picture here? What does this all really mean?

00:09:19.090 --> 00:09:21.409
I think it signals a really fundamental shift,

00:09:21.649 --> 00:09:25.269
maybe even the end of an era for expensive, closed-off

00:09:25.269 --> 00:09:28.679
AI tools. These capabilities are becoming

00:09:28.679 --> 00:09:31.379
accessible to everyone, often for free. Which

00:09:31.379 --> 00:09:34.200
is huge for me. Individuals, small businesses,

00:09:34.750 --> 00:09:36.990
startups, anyone who previously couldn't afford

00:09:36.990 --> 00:09:41.149
access to this level of AI power. Suddenly, you

00:09:41.149 --> 00:09:43.750
have these incredible tools at your fingertips.

00:09:44.049 --> 00:09:46.610
The playing field really is leveling out. Completely.

00:09:46.830 --> 00:09:48.789
Yeah. It's less about your budget now and more

00:09:48.789 --> 00:09:51.289
about your ideas, your creativity, your speed

00:09:51.289 --> 00:09:54.350
in using these tools. And what about the companies

00:09:54.350 --> 00:09:56.029
that have been charging those high subscription

00:09:56.029 --> 00:09:58.149
fees? Well, the pressure is definitely on them

00:09:58.149 --> 00:10:01.210
now. It's much harder to justify charging a lot

00:10:01.210 --> 00:10:03.389
when there are free alternatives that are just

00:10:03.389 --> 00:10:05.710
as good or sometimes even better. So they'll

00:10:05.710 --> 00:10:08.389
need to find new ways to add value. Exactly.

00:10:08.809 --> 00:10:11.129
Maybe through better user experience, specialized

00:10:11.129 --> 00:10:13.990
industry models, integrations, enterprise-level

00:10:13.990 --> 00:10:17.429
support, things beyond just the raw AI capability

00:10:17.429 --> 00:10:20.110
itself. Looking across all three tools, there's

00:10:20.110 --> 00:10:22.289
a common thread that strikes me. Speed. Absolutely.

00:10:22.850 --> 00:10:25.330
Websites in minutes, videos in seconds, code

00:10:25.330 --> 00:10:27.470
fixed almost instantly. That changes the whole

00:10:27.470 --> 00:10:29.710
pace of creation and innovation, doesn't it?

00:10:29.929 --> 00:10:32.370
It creates a massive competitive advantage. If

00:10:32.370 --> 00:10:34.549
you can iterate, experiment and deploy that much

00:10:34.549 --> 00:10:38.269
faster, you can learn and adapt far quicker than

00:10:38.269 --> 00:10:40.789
competitors stuck on slower cycles. So the takeaway

00:10:40.789 --> 00:10:43.950
seems to be this AI revolution isn't some future

00:10:43.950 --> 00:10:45.909
thing we're waiting for. Not at all. It's here

00:10:45.909 --> 00:10:49.090
now. And it's being driven by a different philosophy,

00:10:49.309 --> 00:10:52.190
this democratization idea. And it's available

00:10:52.190 --> 00:10:55.830
to you today for free. Take these insights. Think

00:10:55.830 --> 00:10:58.309
about how they could change how you work, what

00:10:58.309 --> 00:11:00.600
you create. Yeah, it's not just about doing the

00:11:00.600 --> 00:11:03.159
same things faster. It's about what entirely

00:11:03.159 --> 00:11:05.639
new things become possible now. The only real

00:11:05.639 --> 00:11:07.080
question left is what are you going to build

00:11:07.080 --> 00:11:09.779
with it? Thanks for joining us on this deep dive.

00:11:09.980 --> 00:11:12.279
We hope this leaves you feeling truly well informed.
