WEBVTT

00:00:00.000 --> 00:00:04.599
A fully finished 23-second video intro, generated

00:00:04.599 --> 00:00:07.639
from a single slash goal prompt. Wow. There was

00:00:07.639 --> 00:00:10.099
no step-by-step babysitting involved at all.

00:00:10.199 --> 00:00:12.480
It took one hour and 15 minutes to complete.

00:00:12.660 --> 00:00:14.960
Yeah, that is wild. And it didn't even max out

00:00:14.960 --> 00:00:17.679
the one-million-token context window. Right. And it

00:00:17.679 --> 00:00:19.460
used 357,000 tokens

00:00:19.460 --> 00:00:22.579
to do it. This completely shifts how we have

00:00:22.579 --> 00:00:25.019
to look at open source capabilities today. Welcome

00:00:25.019 --> 00:00:27.399
to the deep dive. We are very glad you are here

00:00:27.399 --> 00:00:30.460
with us. Absolutely. Today, our mission is to

00:00:30.460 --> 00:00:34.200
unpack a really rigorous hands-on guide. We

00:00:34.200 --> 00:00:38.460
are testing the open source GLM 5.2 model, and

00:00:38.460 --> 00:00:40.460
we're doing this directly inside Claude Code.

00:00:40.640 --> 00:00:42.719
We are going to walk through some actual real

00:00:42.719 --> 00:00:45.780
world tests here. We pit GLM 5.2 against the

00:00:45.780 --> 00:00:48.579
heavyweight Opus 4.8. Right. We will outline

00:00:48.579 --> 00:00:51.119
a fascinating 80-20 workflow split for your

00:00:51.119 --> 00:00:53.780
daily tasks. We will also break down the surprisingly

00:00:53.780 --> 00:00:56.359
simple non-technical setup. Finally, we will

00:00:56.359 --> 00:00:58.939
reveal why this specific shift makes open source

00:00:58.939 --> 00:01:02.200
AI impossible to ignore. But to really

00:01:02.200 --> 00:01:03.899
understand why this matters, we have to look

00:01:03.899 --> 00:01:06.590
at the data. Yeah, what actually happened when

00:01:06.590 --> 00:01:08.730
these two models went head-to-head in the wild?

00:01:08.930 --> 00:01:10.730
That is the most important question. There were

00:01:10.730 --> 00:01:14.129
several highly practical tests run here. Right.

00:01:14.129 --> 00:01:17.150
We wanted to see if GLM 5.2 could genuinely

00:01:17.150 --> 00:01:20.189
compete on real work. Or did it just feel good

00:01:20.189 --> 00:01:22.629
because it was the cheaper option? Well, the

00:01:22.629 --> 00:01:25.969
first test was a one-shot web design task. The

00:01:25.969 --> 00:01:29.250
exact same prompt was used for both models. They

00:01:29.250 --> 00:01:31.290
both had to build a fully functional landing

00:01:31.290 --> 00:01:34.459
page. GLM finished the job in 3 minutes and 59

00:01:34.459 --> 00:01:38.060
seconds. Wow. And Opus took 14 minutes and 59

00:01:38.060 --> 00:01:40.560
seconds to finish. That is an absolutely massive

00:01:40.560 --> 00:01:43.420
difference in time. And GLM was around five times

00:01:43.420 --> 00:01:45.959
cheaper to run. Yeah. But the really fascinating

00:01:45.959 --> 00:01:48.700
part to me is the actual quality. Exactly. These

00:01:48.700 --> 00:01:51.859
were not clunky early 2000s wireframes. Right.

00:01:52.480 --> 00:01:55.019
Both results were highly polished and visually

00:01:55.019 --> 00:01:58.140
impressive. The pages had movement built into

00:01:58.140 --> 00:02:00.519
the design. They had properly structured sections

00:02:00.519 --> 00:02:02.959
and very clear calls to action. This was not

00:02:02.959 --> 00:02:04.959
some flimsy budget result you would have to rewrite

00:02:04.959 --> 00:02:08.099
anyway. No. It was a completely viable starting

00:02:08.099 --> 00:02:10.840
point for a real project. Looking at them side

00:02:10.840 --> 00:02:13.280
by side, you would have a hard time justifying

00:02:13.280 --> 00:02:15.819
that massive price tag for Opus. When the final

00:02:15.819 --> 00:02:18.520
result is that close, the price starts to matter

00:02:18.520 --> 00:02:21.400
heavily. It really does. But we didn't stop at

00:02:21.400 --> 00:02:24.919
visual design. Next up was hard coding. So GLM

00:02:24.919 --> 00:02:28.259
crushes basic web layout, which is great. But

00:02:28.259 --> 00:02:30.879
visual spacing is one thing. Logical reasoning

00:02:30.879 --> 00:02:33.639
is another entirely. Absolutely. That naturally

00:02:33.639 --> 00:02:37.860
makes you wonder what happened when the task was less forgiving.

00:02:38.039 --> 00:02:40.800
A complex coding assignment was evaluated

00:02:40.800 --> 00:02:43.699
using Codex. Using Codex kept the evaluation

00:02:43.699 --> 00:02:47.219
entirely neutral and fair. Exactly. And GLM 5.2

00:02:47.219 --> 00:02:50.159
actually did very well here. It handled the

00:02:50.159 --> 00:02:52.530
bulk of the task perfectly. But Opus caught

00:02:52.530 --> 00:02:55.569
one very subtle edge case that GLM missed entirely.

00:02:57.610 --> 00:02:59.849
With tricky database values, it caught the subtle

00:02:59.849 --> 00:03:03.009
difference between true and 1, or 1 and

00:03:03.009 --> 00:03:04.810
1.0. Oh, wow. If you have ever stared at

00:03:04.810 --> 00:03:06.729
a broken database at two in the morning, you

00:03:06.729 --> 00:03:09.169
know exactly why that matters. Yeah, bugs always

00:03:09.169 --> 00:03:11.490
live in those tiny details. It feels like a very

00:03:11.490 --> 00:03:14.050
natural division of labor. GLM is kind of like

00:03:14.050 --> 00:03:16.569
the fast junior developer. It writes the bulk

00:03:16.569 --> 00:03:20.110
of the code very quickly. And Opus is the senior

00:03:20.110 --> 00:03:22.590
architect catching the tricky database bugs before

00:03:22.590 --> 00:03:25.500
they launch. That is a perfect analogy for how

00:03:25.500 --> 00:03:29.379
this works. GLM is fantastic for fast implementation

00:03:29.379 --> 00:03:32.780
and initial scaffolding. Right. And Opus is necessary

00:03:32.780 --> 00:03:36.039
for careful reasoning and tricky edge case handling.

00:03:36.280 --> 00:03:39.139
Exactly. I have to be honest here. I still wrestle

00:03:39.139 --> 00:03:42.629
with prompt drift myself. You know, you

00:03:42.629 --> 00:03:45.530
give an AI a long list of instructions. And by

00:03:45.530 --> 00:03:48.289
step three, it completely forgets step one. So

00:03:48.289 --> 00:03:50.110
watching an open source model hold its focus

00:03:50.110 --> 00:03:52.370
over complex instructions is deeply impressive

00:03:52.370 --> 00:03:54.409
to me. It really is an incredible leap forward.

00:03:54.469 --> 00:03:56.689
Right. But there's a specific speed quirk we

00:03:56.689 --> 00:03:58.969
definitely should mention. Right. GLM is not

00:03:58.969 --> 00:04:01.569
always the faster model. Yeah. On a highly creative

00:04:01.569 --> 00:04:05.830
HTML task, GLM took 35 minutes. Opus finished

00:04:05.830 --> 00:04:08.610
that exact same task in only 11 minutes. The

00:04:08.610 --> 00:04:10.870
rule here seems to revolve entirely around reasoning.

00:04:11.050 --> 00:04:13.349
Exactly. The more reasoning required by the prompt,

00:04:13.569 --> 00:04:16.569
the slower GLM feels in practice. Execution-heavy

00:04:16.569 --> 00:04:19.750
tasks are incredibly fast. Planning and creative

00:04:19.750 --> 00:04:22.629
taste just take much longer. They both succeeded

00:04:22.629 --> 00:04:26.170
on the creative task, though. GLM built an interactive

00:04:26.170 --> 00:04:28.449
page called The Anatomy of Attention. Right.

00:04:28.529 --> 00:04:30.689
It featured moving background elements and token

00:04:30.689 --> 00:04:34.750
visuals. And Opus built The Life of a Death Star.

00:04:35.050 --> 00:04:37.430
It was a beautifully structured timeline-style

00:04:37.430 --> 00:04:40.149
page. Both were excellent one-shot results.

00:04:40.689 --> 00:04:43.930
The final test here was deep research. This utilized

00:04:43.930 --> 00:04:46.910
a STORM-style workflow. For anyone unfamiliar,

00:04:47.069 --> 00:04:49.250
that just means multiple sub-agents work together

00:04:49.250 --> 00:04:51.720
using different personas. Right. They compiled

00:04:51.720 --> 00:04:55.420
a very rich, highly detailed HTML report. It

00:04:55.420 --> 00:04:58.120
combined different expert lenses and actively

00:04:58.120 --> 00:05:01.139
challenged its own assumptions. And GLM managed

00:05:01.139 --> 00:05:04.339
this complex agent workflow beautifully. The

00:05:04.339 --> 00:05:06.800
significantly lower cost changes the math entirely

00:05:06.800 --> 00:05:09.240
for us here. Running multiple autonomous agents

00:05:09.240 --> 00:05:11.759
suddenly becomes justifiable for daily work.

00:05:11.959 --> 00:05:13.579
Yeah, you can test 10 different angles without

00:05:13.579 --> 00:05:16.100
burning through your cash. But wait, if GLM struggles

00:05:16.100 --> 00:05:17.879
with heavy reasoning, doesn't that make it too

00:05:17.879 --> 00:05:20.790
risky for professional work? Well, not necessarily.

00:05:21.410 --> 00:05:23.629
It is really about matching the model to the

00:05:23.629 --> 00:05:25.589
risk profile. You have to match the specific

00:05:25.589 --> 00:05:28.269
tool to the specific task. Right. Scaffolding

00:05:28.269 --> 00:05:30.870
a basic website layout carries very low risk.

00:05:31.170 --> 00:05:33.949
Exactly. But migrating a production database

00:05:33.949 --> 00:05:37.029
carries extremely high risk. You always use the

00:05:37.029 --> 00:05:38.829
heavy reasoning model for the high risk work.

00:05:38.949 --> 00:05:41.149
So you don't compare final answers, you compare

00:05:41.149 --> 00:05:43.569
your acceptable risk for the task. Precisely.

00:05:43.910 --> 00:05:46.470
Which leads us to a much broader mindset shift

00:05:46.470 --> 00:05:49.319
we need to discuss. Yeah. The real skill today

00:05:49.319 --> 00:05:53.459
is knowing exactly when to use GLM 5.2. AI work

00:05:53.459 --> 00:05:55.980
is not a simple contest anymore. It is not about

00:05:55.980 --> 00:05:59.519
the "best model wins" mentality. No, it is a

00:05:59.519 --> 00:06:03.120
multi-step, highly iterative process. Most work involves

00:06:03.120 --> 00:06:05.339
researching, drafting, and continuous testing.

00:06:05.699 --> 00:06:07.839
Then you edit, make decisions, and finally ship

00:06:07.839 --> 00:06:10.500
the product. Each of those steps demands a completely

00:06:10.500 --> 00:06:12.819
different level of intelligence. Exactly. Let's

00:06:12.819 --> 00:06:15.379
break down the 80% work first. This is the natural

00:06:15.379 --> 00:06:19.060
domain of GLM 5.2. It easily handles first drafts

00:06:19.060 --> 00:06:21.480
and gathering initial research. It does basic

00:06:21.480 --> 00:06:24.000
web design and cleans up your messy notes. Right.

00:06:24.060 --> 00:06:26.360
It generates your initial options for a project.

00:06:27.300 --> 00:06:29.899
Cost matters immensely here because you are doing

00:06:29.899 --> 00:06:32.300
constant iterative testing. You are exploring

00:06:32.300 --> 00:06:34.540
different ideas. You don't want maximum pricing

00:06:34.540 --> 00:06:37.019
when you are just sketching a rough draft. Exactly.

00:06:37.459 --> 00:06:39.459
Then we have the remaining 10 to 20 percent work.

00:06:39.819 --> 00:06:42.740
This is the domain of Opus 4.8. This is the

00:06:42.740 --> 00:06:45.720
heavy thinking. Final reasoning, edge case review,

00:06:45.939 --> 00:06:48.800
and high-risk coding tasks. The system context

00:06:48.800 --> 00:06:51.420
is crucial here, too. The harness really matters.

00:06:51.579 --> 00:06:53.839
Right. For clarity, a harness is just a digital

00:06:53.839 --> 00:06:56.839
workspace where the AI can use tools. Claude Code

00:06:56.839 --> 00:06:59.040
provides that powerful harness for us. It lets

00:06:59.040 --> 00:07:01.939
the model read local files and actively run terminal

00:07:01.939 --> 00:07:05.680
commands. So GLM is not just a cheap Opus replacement.

00:07:05.980 --> 00:07:08.540
It is a cheaper worker operating inside the exact

00:07:08.540 --> 00:07:11.680
same system. Mm-hmm. But aren't we just

00:07:11.680 --> 00:07:13.660
complicating things by juggling multiple workers

00:07:13.660 --> 00:07:16.019
for a single project? It might seem that way

00:07:16.019 --> 00:07:18.629
at first glance. But using one expensive model

00:07:18.629 --> 00:07:21.629
for basic sorting tasks burns cash needlessly.

00:07:22.269 --> 00:07:24.029
Right. Splitting the work optimizes both your

00:07:24.029 --> 00:07:26.810
budget and the applied brain power. You get significantly

00:07:26.810 --> 00:07:29.149
better efficiency across the board. Let the cheap

00:07:29.149 --> 00:07:31.730
model gather the lumber. Let the expensive model

00:07:31.730 --> 00:07:34.129
build the house. Exactly. It is about working

00:07:34.129 --> 00:07:36.709
much smarter within the environment you already

00:07:36.709 --> 00:07:39.459
use every day.

00:07:42.680 --> 00:07:45.620
Hearing about those massive price

00:07:45.620 --> 00:07:47.980
differences makes me want to try this immediately.

00:07:48.379 --> 00:07:50.540
Yeah. But whenever we talk about open source

00:07:50.540 --> 00:07:52.920
routing, it usually involves spinning up servers

00:07:52.920 --> 00:07:56.319
or Docker containers. Right. Is this actually

00:07:56.319 --> 00:08:00.480
feasible for a normal user to set up? It absolutely

00:08:00.480 --> 00:08:03.459
is. You are not learning a brand new tool here.

00:08:03.459 --> 00:08:05.740
You are just routing the model call to a different

00:08:05.740 --> 00:08:08.379
location. You simply route it to z.ai instead

00:08:08.379 --> 00:08:10.639
of Anthropic's servers. Exactly. Step one is going

00:08:10.639 --> 00:08:14.439
to the z.ai API console online. You can actually

00:08:14.439 --> 00:08:16.620
test the model out right there first. They have

00:08:16.620 --> 00:08:19.600
3D generation tools and small mini games available.

00:08:19.819 --> 00:08:22.019
It gives you a great feel for how the model responds.

00:08:22.180 --> 00:08:24.040
Yeah. Then you choose your preferred billing

00:08:24.040 --> 00:08:26.740
method. You can pay per token or choose a set

00:08:26.740 --> 00:08:29.720
monthly plan. Those monthly plans run roughly

00:08:29.720 --> 00:08:34.860
$16, $64, or $144. There's some really great

00:08:34.860 --> 00:08:37.360
practical advice on this point. Keep your Claude

00:08:37.360 --> 00:08:40.600
plan active and just add z.ai on the side. Yeah,

00:08:40.600 --> 00:08:42.279
you don't have to choose just one platform. You

00:08:42.279 --> 00:08:44.340
use both of them together. Next, you generate

00:08:44.340 --> 00:08:47.580
a secure API key inside the console. This brings

00:08:47.580 --> 00:08:50.740
us to editing a specific local file. It is called

00:08:50.740 --> 00:08:55.559
settings.local.json. You find this file sitting

00:08:55.559 --> 00:08:58.179
inside the .claude folder on your machine. What

00:08:58.179 --> 00:09:00.440
is brilliant here is that you aren't installing

00:09:00.440 --> 00:09:03.759
heavy new software. Claude Code is already looking

00:09:03.759 --> 00:09:06.399
for a brain at a specific web address. Right.

00:09:06.539 --> 00:09:08.980
All you are doing in that settings file is changing

00:09:08.980 --> 00:09:11.179
the address book. You point it away from Anthropic's

00:09:11.179 --> 00:09:14.549
servers. You route the base URL directly to

00:09:14.549 --> 00:09:18.049
z.ai instead. You leave the Anthropic API key

00:09:18.049 --> 00:09:21.409
completely blank, and you put your new z.ai

00:09:21.409 --> 00:09:24.110
key in as the auth token. Finally, you set the

00:09:24.110 --> 00:09:26.950
specific model name you want to use. It is surprisingly

00:09:26.950 --> 00:09:29.190
simple. But there is a genius trick we need to

00:09:29.190 --> 00:09:31.389
highlight here. You create two entirely separate

00:09:31.389 --> 00:09:33.429
folders on your local machine. Yeah, this is

00:09:33.429 --> 00:09:36.070
incredibly clever. One folder is named slash

00:09:36.070 --> 00:09:39.389
glm. The other folder is named slash opus. You

00:09:39.389 --> 00:09:41.909
put the custom routing configuration file only

00:09:41.909 --> 00:09:45.830
inside the slash glm folder. Wait, so just by

00:09:45.830 --> 00:09:48.470
changing directories in your terminal, you instantly

00:09:48.470 --> 00:09:51.269
switch the brain powering your workspace? Zero

00:09:51.269 --> 00:09:54.169
friction. That is brilliant. You open Claude Code

00:09:54.169 --> 00:09:56.330
in the glm folder for your rough drafts. You

00:09:56.330 --> 00:09:58.610
open it in the opus folder for your final code

00:09:58.610 --> 00:10:01.529
reviews. Exactly. But I have to ask a security

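NOTE
A minimal sketch of the setup described above, assuming Claude Code reads an "env" map from .claude/settings.local.json. The base URL and the model id are assumptions for illustration, not values confirmed in this episode; check z.ai's own documentation before relying on them. The function names are hypothetical.
```python
import json
from pathlib import Path
# Assumptions, not confirmed by the episode: verify both against z.ai's docs.
ZAI_BASE_URL = "https://api.z.ai/api/anthropic"  # assumed endpoint
GLM_MODEL = "glm-5.2"  # hypothetical model id
def build_glm_settings(zai_key: str) -> dict:
    """Return settings that route model calls to z.ai instead of Anthropic."""
    return {
        "env": {
            "ANTHROPIC_BASE_URL": ZAI_BASE_URL,
            "ANTHROPIC_API_KEY": "",  # left blank, as described in the episode
            "ANTHROPIC_AUTH_TOKEN": zai_key,  # the z.ai key goes in as the auth token
            "ANTHROPIC_MODEL": GLM_MODEL,
        }
    }
def create_workspaces(root: Path, zai_key: str) -> Path:
    """Create glm/ and opus/ folders; only glm/ receives the routing file."""
    (root / "opus").mkdir(parents=True, exist_ok=True)
    claude_dir = root / "glm" / ".claude"
    claude_dir.mkdir(parents=True, exist_ok=True)
    settings_path = claude_dir / "settings.local.json"
    settings_path.write_text(json.dumps(build_glm_settings(zai_key), indent=2))
    return settings_path
```
Calling create_workspaces(Path.home() / "ai", key) would leave glm/ routed to z.ai and opus/ on its default settings. The generated file holds a secret, so keep it out of version control.
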
00:10:01.529 --> 00:10:04.769
question here. If I put my API key in a local

00:10:04.769 --> 00:10:07.570
JSON file, isn't there a risk I accidentally

00:10:07.570 --> 00:10:09.990
share it? Oh, there is absolutely a major risk

00:10:09.990 --> 00:10:12.149
there. That is why you must be extremely careful

00:10:12.149 --> 00:10:15.149
with this file. Right. Treat that API key exactly

00:10:15.149 --> 00:10:17.929
like a banking password. Keep it completely out

00:10:17.929 --> 00:10:21.289
of public repos or shared team screenshots. Treat

00:10:21.289 --> 00:10:23.389
the key like a password. Never push your local

00:10:23.389 --> 00:10:25.950
settings to the public. Exactly. Keep it secure

00:10:25.950 --> 00:10:28.649
and the entire workflow remains brilliant and

00:10:28.649 --> 00:10:30.750
safe. Now, we really need to look at the bigger

00:10:30.750 --> 00:10:34.710
picture here. Why does GLM 5.2 make open source

00:10:34.710 --> 00:10:37.799
AI models impossible to ignore right now? Well,

00:10:37.980 --> 00:10:40.120
open source is finally practical for the daily

00:10:40.120 --> 00:10:42.820
professional workday. It is no longer just a

00:10:42.820 --> 00:10:45.299
theoretical weekend project for developers. The

00:10:45.299 --> 00:10:47.639
massive scale of this model is truly staggering

00:10:47.639 --> 00:10:51.799
to think about. GLM 5.2 operates with around

00:10:51.799 --> 00:10:57.450
753 to 756 billion parameters. Yeah. Let's pause

00:10:57.450 --> 00:10:59.990
on that number for a second. That is far too

00:10:59.990 --> 00:11:02.230
massive to run locally on a normal computer.

00:11:02.509 --> 00:11:04.850
You would need serious, incredibly expensive

00:11:04.850 --> 00:11:07.470
server hardware in your house. That is exactly

00:11:07.470 --> 00:11:10.370
why API renting through providers is necessary.

00:11:10.529 --> 00:11:12.950
It is the pragmatic middle ground for users right

00:11:12.950 --> 00:11:15.009
now. You get all the massive power without the

00:11:15.009 --> 00:11:17.269
massive infrastructure headache. This brings

00:11:17.269 --> 00:11:20.610
us to the underlying cost of tokens. Tokens are

00:11:20.610 --> 00:11:23.309
just tiny puzzle pieces of text. It is kind of

00:11:23.309 --> 00:11:26.370
like stacking Lego blocks of data. Exactly. The

00:11:26.370 --> 00:11:28.929
AI processes these tiny pieces to understand

00:11:28.929 --> 00:11:31.090
and generate language. When you look closely

00:11:31.090 --> 00:11:33.250
at the token pricing, the math is undeniable.

00:11:33.870 --> 00:11:37.250
Right. Opus costs $5 in and $25 out per 1 million

00:11:37.250 --> 00:11:42.070
tokens. GLM sits at roughly $1.40 in and $4.40

00:11:42.070 --> 00:11:46.789
out. Whoa. I mean, imagine

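NOTE
A rough check of the arithmetic, using only the per-million-token prices quoted above. It suggests Opus is about 3.6 times the price on input and about 5.7 times on output, so the overall multiple depends on the token mix. The function names and any example mix are hypothetical.
```python
# Prices quoted in the episode, in USD per 1 million tokens.
PRICES = {
    "opus": {"in": 5.00, "out": 25.00},
    "glm": {"in": 1.40, "out": 4.40},  # "roughly", per the hosts
}
def job_cost(model: str, tokens_in: int, tokens_out: int) -> float:
    """Cost in USD for one job, given input and output token counts."""
    p = PRICES[model]
    return (tokens_in * p["in"] + tokens_out * p["out"]) / 1_000_000
def price_ratio(tokens_in: int, tokens_out: int) -> float:
    """How many times more Opus costs than GLM for the same token mix."""
    return job_cost("opus", tokens_in, tokens_out) / job_cost("glm", tokens_in, tokens_out)
```
For a job with 300,000 input and 57,000 output tokens (an invented split of the 357,000 total mentioned earlier), price_ratio(300_000, 57_000) lands between those two figures.
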
00:11:46.789 --> 00:11:48.870
scaling to a billion queries without bankrupting

00:11:48.840 --> 00:11:52.480
yourself. That fivefold price difference changes

00:11:52.480 --> 00:11:55.480
everything about how we build software. It is

00:11:55.480 --> 00:11:57.919
especially critical for complex agent workflows.

00:11:58.480 --> 00:12:01.259
These autonomous models constantly read folders,

00:12:01.759 --> 00:12:04.379
revise their own work, and call subagents. They

00:12:04.379 --> 00:12:06.639
burn tokens invisibly in the background while

00:12:06.639 --> 00:12:09.580
you drink your coffee. Exactly. Significantly

00:12:09.580 --> 00:12:12.019
lower prices encourage you to experiment freely

00:12:12.019 --> 00:12:14.759
without watching the meter. There are also performance

00:12:14.759 --> 00:12:18.350
benchmarks to consider. GLM actively beats GPT

00:12:18.350 --> 00:12:22.309
5.5 and Opus 4.8 on some specific software

00:12:22.309 --> 00:12:25.029
benchmarks. Right, but there is a very crucial

00:12:25.029 --> 00:12:27.210
point about those numbers. Benchmarks are just

00:12:27.210 --> 00:12:29.580
a signal. They simply tell you a model is worth

00:12:29.580 --> 00:12:32.820
testing yourself. Daily usefulness on your own

00:12:32.820 --> 00:12:35.759
specific files is the only real test. Yeah, how

00:12:35.759 --> 00:12:37.879
does it handle your unique code and your messy

00:12:37.879 --> 00:12:40.279
personal notes? There is also a major strategic

00:12:40.279 --> 00:12:42.539
advantage we should point out here. It acts as

00:12:42.539 --> 00:12:45.639
a necessary hedge. Closed platforms change their

00:12:45.639 --> 00:12:47.700
rules and their pricing structures all the time.

00:12:47.879 --> 00:12:51.100
A vital feature can suddenly move behind an expensive

00:12:51.100 --> 00:12:53.559
paywall tomorrow. Learning how to route open

00:12:53.559 --> 00:12:55.879
source models protects your daily workflow from

00:12:55.879 --> 00:12:58.519
those sudden shifts. But think about this dependency

00:12:58.519 --> 00:13:01.360
for a moment. If open source models are this

00:13:01.360 --> 00:13:04.820
huge, won't we always be dependent on cloud providers

00:13:04.820 --> 00:13:07.830
to run them anyway? Yes. For now, that is the

00:13:07.830 --> 00:13:10.730
reality. Yes. API rental is the necessary middle

00:13:10.730 --> 00:13:13.529
ground today. Right. But it still fundamentally

00:13:13.529 --> 00:13:16.149
breaks the monopoly. You aren't relying entirely

00:13:16.149 --> 00:13:19.870
on one single closed ecosystem's rules anymore.

00:13:20.059 --> 00:13:22.779
It's about having a backup plan when the closed

00:13:22.779 --> 00:13:25.940
platforms change their rules. Exactly. You maintain

00:13:25.940 --> 00:13:28.200
your professional options and your creative freedom.

00:13:28.460 --> 00:13:30.100
So let's bring all of these different pieces

00:13:30.100 --> 00:13:32.820
together for you. We are looking at a fundamental

00:13:32.820 --> 00:13:34.960
shift in how we approach knowledge work. The

00:13:34.960 --> 00:13:37.820
era of treating open source models as clunky

00:13:37.820 --> 00:13:40.850
budget alternatives is officially over. The true

00:13:40.850 --> 00:13:43.789
modern skill isn't finding the one perfect AI

00:13:43.789 --> 00:13:46.049
to do everything. You are essentially becoming

00:13:46.049 --> 00:13:48.149
an orchestrator of multiple minds. Right. You

00:13:48.149 --> 00:13:50.389
have to know exactly when to deploy a fast, cheap

00:13:50.389 --> 00:13:54.230
worker. You use GLM 5.2 for the heavy, repetitive

00:13:54.230 --> 00:13:56.129
lifting and the rough drafting. And you must

00:13:56.129 --> 00:13:58.769
know exactly when to bring in the expensive perfectionist.

00:13:59.029 --> 00:14:01.870
You call on Opus 4.8 to close the deal and ensure

00:14:01.870 --> 00:14:04.730
total precision. It is a beautiful synergy when

00:14:04.730 --> 00:14:06.590
you set it up correctly. We want you to look

00:14:06.590 --> 00:14:10.100
at your own daily tasks today. Identify the

00:14:10.100 --> 00:14:12.940
80% work you are currently doing manually. Where

00:14:12.940 --> 00:14:15.419
are you overpaying for AI compute right now?

00:14:15.879 --> 00:14:18.899
Where could a fast, highly capable open source

00:14:18.899 --> 00:14:21.740
model handle the drafting and sorting for you?

00:14:22.279 --> 00:14:24.720
Finding that exact balance will completely change

00:14:24.720 --> 00:14:27.740
your daily productivity. Thank you for joining

00:14:27.740 --> 00:14:30.460
us on this deep dive today. We always appreciate

00:14:30.460 --> 00:14:32.840
you spending your valuable time with us. It has

00:14:32.840 --> 00:14:35.629
been a truly fantastic exploration today. We

00:14:35.629 --> 00:14:37.809
want to leave you with one final provocative

00:14:37.809 --> 00:14:40.429
thought to mull over. If an open source model

00:14:40.429 --> 00:14:43.230
is already handling 80% of the workflow today

00:14:43.230 --> 00:14:45.830
at a fraction of the cost, what happens to the

00:14:45.830 --> 00:14:48.309
value of closed models when open source naturally

00:14:48.309 --> 00:14:51.830
creeps up to cover 90 or 95%? Does intelligence

00:14:51.830 --> 00:14:53.230
become essentially free?
