WEBVTT

00:00:00.000 --> 00:00:02.339
Welcome to the deep dive. Yeah, thanks for having

00:00:02.339 --> 00:00:05.860
me. I have to say, the debate isn't about which

00:00:05.860 --> 00:00:08.839
AI coding tool is better anymore. It's really

00:00:08.839 --> 00:00:10.960
about knowing exactly which artificial brain

00:00:10.960 --> 00:00:13.140
to hire for the job right in front of you. I

00:00:13.140 --> 00:00:15.410
completely agree. I've been, um... I've been

00:00:15.410 --> 00:00:17.510
thinking a lot about this recently. We spend

00:00:17.510 --> 00:00:20.449
so much energy treating these models like rival

00:00:20.449 --> 00:00:22.969
sports teams. Oh, totally. Like it's a zero-sum

00:00:22.969 --> 00:00:25.789
game. Exactly. But today, we're taking a much

00:00:25.789 --> 00:00:28.609
more practical approach. We are examining an

00:00:28.609 --> 00:00:32.530
extensive, rigorous side-by-side test comparing...

00:00:35.739 --> 00:00:38.320
Right, and our mission here isn't to crown a

00:00:38.320 --> 00:00:41.159
singular champion. We want to unpack their unique

00:00:41.159 --> 00:00:43.820
underlying philosophies. Yeah, and run their

00:00:43.820 --> 00:00:46.259
real-world numbers on a complex research task

00:00:46.259 --> 00:00:48.719
and figure out how to combine them into one seamless

00:00:48.719 --> 00:00:50.659
workflow. It's gonna be a really fun breakdown.

00:00:51.280 --> 00:00:53.659
But before we get completely lost in the weeds,

00:00:53.740 --> 00:00:55.840
we should probably establish our baseline. For

00:00:55.840 --> 00:00:57.859
anyone listening, what is a coding agent? Let's

00:00:57.859 --> 00:01:00.380
just say it's AI that writes, tests, and edits

00:01:00.380 --> 00:01:02.719
software right on your computer. Yeah, that is

00:01:02.719 --> 00:01:05.299
the perfect distillation. Because we aren't talking

00:01:05.299 --> 00:01:07.439
about simple autocomplete anymore. You know,

00:01:07.640 --> 00:01:09.879
we are talking about a digital co-worker that

00:01:09.879 --> 00:01:13.159
can execute highly complex multi-step instructions

00:01:13.159 --> 00:01:15.939
across your entire file system. It's a fascinating

00:01:15.939 --> 00:01:19.239
shift in how we interact with our machines. But

00:01:19.239 --> 00:01:21.939
to really understand how these two specific tools

00:01:21.939 --> 00:01:24.459
diverge, we first need to look at the ground

00:01:24.459 --> 00:01:26.719
they actually share. Right. Framing them not

00:01:26.719 --> 00:01:29.579
as bitter competitors, but as distinct, highly

00:01:29.579 --> 00:01:33.319
specialized tool sets. Exactly. That is a crucial

00:01:33.319 --> 00:01:36.519
mindset shift for developers right now. So let's

00:01:36.519 --> 00:01:40.219
start by looking at Anthropic's Claude Code. Yeah.

00:01:40.299 --> 00:01:42.180
So Claude is fundamentally designed from the

00:01:42.180 --> 00:01:44.939
ground up to act as a thinking partner. Like,

00:01:45.260 --> 00:01:48.120
it excels at complex architectural planning,

00:01:48.780 --> 00:01:50.959
intricate front-end design, and these heavily

00:01:50.959 --> 00:01:53.359
customized workflows. And what I find incredibly

00:01:53.359 --> 00:01:55.700
interesting about Claude's philosophy is its

00:01:55.700 --> 00:01:58.200
willingness to challenge you. Oh yeah, the pushback

00:01:58.200 --> 00:02:00.500
feature. Right. If you give it instructions that

00:02:00.500 --> 00:02:02.439
seem structurally flawed, it actually pushes

00:02:02.439 --> 00:02:04.879
back. It makes you pause and reconsider your

00:02:04.879 --> 00:02:07.060
approach before it writes a single line of code.

00:02:07.219 --> 00:02:09.439
Which is so valuable. And currently, it's running

00:02:09.439 --> 00:02:12.620
on their Opus, Sonnet, and Haiku models, depending

00:02:12.620 --> 00:02:14.969
on, you know, the complexity of the task you give

00:02:14.969 --> 00:02:17.610
it. Right. Then on the other side of the spectrum,

00:02:17.830 --> 00:02:20.810
we have OpenAI's ChatGPT Codex. And we need

00:02:20.810 --> 00:02:22.409
to clarify this immediately, by the way. This

00:02:22.409 --> 00:02:25.270
is a brand new, highly advanced agent system.

00:02:25.650 --> 00:02:28.370
This is not the original Codex model from way

00:02:28.370 --> 00:02:31.469
back in 2021. Right, right. This modern iteration

00:02:31.469 --> 00:02:34.389
of Codex is engineered for pure, unadulterated

00:02:34.389 --> 00:02:37.469
speed. Yes, absolutely. It doesn't want to debate

00:02:37.469 --> 00:02:40.050
architecture with you. It focuses entirely on

00:02:40.050 --> 00:02:42.789
rapid execution, designed to take well-defined

00:02:42.789 --> 00:02:45.310
tasks straight to the finish line as fast as

00:02:45.310 --> 00:02:48.129
possible. It is all about momentum. It runs on

00:02:48.129 --> 00:02:51.189
the primary GPT Codex model, but it also offers

00:02:51.189 --> 00:02:54.520
a lighter, highly responsive GPT Codex Spark

00:02:54.520 --> 00:02:57.060
version for when you just need blistering fast

00:02:57.060 --> 00:02:59.659
simple execution. Exactly. So if we think about

00:02:59.659 --> 00:03:02.039
it in human terms, Claude is like the thoughtful

00:03:02.039 --> 00:03:04.419
architect asking you a dozen questions about

00:03:04.419 --> 00:03:07.000
the foundation, while Codex is the speedy foreman

00:03:07.000 --> 00:03:08.840
just getting the bricks laid. Oh, that's a

00:03:08.840 --> 00:03:11.120
great way to put it. That's exactly it. But despite

00:03:11.120 --> 00:03:13.759
those vastly different personalities, they actually

00:03:13.759 --> 00:03:16.400
share a massive foundational baseline, right?

00:03:16.400 --> 00:03:18.960
For instance, both of them feature incredibly

00:03:18.960 --> 00:03:21.639
robust local code editing capabilities. Yeah,

00:03:21.680 --> 00:03:25.000
and they both offer dedicated standalone desktop

00:03:25.000 --> 00:03:27.580
apps for Mac and Windows environments. They also

00:03:27.580 --> 00:03:30.280
integrate flawlessly into your existing setup.

00:03:30.780 --> 00:03:34.039
Both plug right into VS Code via dedicated extensions.

00:03:34.960 --> 00:03:37.259
And they both handle standard command line interface

00:03:37.259 --> 00:03:39.560
workflows without breaking a sweat. They both

00:03:39.560 --> 00:03:42.219
allow you to delegate those heavy compute-intensive

00:03:42.219 --> 00:03:44.580
tasks up to the cloud. And they share an open

00:03:44.580 --> 00:03:46.990
architecture for extensibility, meaning they

00:03:46.990 --> 00:03:49.689
both feature diverse marketplace plugins and

00:03:49.689 --> 00:03:52.349
utilize highly reusable Markdown skills. They

00:03:52.349 --> 00:03:55.189
also both utilize MCP support. For those listening,

00:03:55.370 --> 00:03:58.289
MCP is a protocol letting AI tools securely

00:03:58.289 --> 00:04:00.409
connect to your local files. Yeah, it essentially

00:04:00.409 --> 00:04:03.050
acts as a secure bridge between the AI's brain

00:04:03.050 --> 00:04:05.389
and your local hard drive. So with all these

00:04:05.389 --> 00:04:07.389
different versions, do their basic environments

00:04:07.389 --> 00:04:10.189
actually overlap? They really do. The underlying

00:04:10.189 --> 00:04:13.139
architecture is surprisingly compatible. Because

00:04:13.139 --> 00:04:15.300
they both hook directly into your standard desktop

00:04:15.300 --> 00:04:17.720
environment and utilize intelligent subagents

00:04:17.720 --> 00:04:20.579
for task management, moving between them isn't

00:04:20.579 --> 00:04:22.660
nearly as jarring as you might expect. So yes,

00:04:22.740 --> 00:04:26.220
both handle local editing, desktop apps, and

00:04:26.220 --> 00:04:29.459
subagents seamlessly. Exactly. The shared foundation

00:04:29.459 --> 00:04:32.740
is incredibly solid. The true divergence only

00:04:32.740 --> 00:04:35.160
really happens in how they build their specific

00:04:35.160 --> 00:04:38.300
workflows on top of that foundation. Claude's

00:04:38.300 --> 00:04:40.360
obsessive control over your workflow is great

00:04:40.360 --> 00:04:42.740
for planning, so I want to explore exactly how

00:04:42.740 --> 00:04:45.079
it gives you that granular, almost surgical control

00:04:45.079 --> 00:04:47.339
over your daily projects. This is where we have

00:04:47.339 --> 00:04:49.939
to talk about custom workflows. Yeah. Because

00:04:49.939 --> 00:04:52.420
this is the area where Claude Code really separates

00:04:52.420 --> 00:04:54.639
itself from the rest of the pack. Right. To give

00:04:54.639 --> 00:04:56.620
you an idea of the scale, Codex offers about

00:04:56.620 --> 00:04:59.870
six basic hook events. Claude drops in with around

00:04:59.870 --> 00:05:03.009
30 different hooks. Wow, that is a staggering

00:05:03.009 --> 00:05:05.949
difference in operational capability. Hooks, for

00:05:05.949 --> 00:05:07.949
anyone unfamiliar, are essentially automatic

00:05:07.949 --> 00:05:10.329
triggers. Yeah, little scripts that fire off

00:05:10.329 --> 00:05:13.269
invisibly whenever a very specific event happens

00:05:13.269 --> 00:05:15.350
within your coding session. And Claude gives

00:05:15.350 --> 00:05:18.170
you immense power here. It really does. For example,

00:05:18.290 --> 00:05:20.920
they have a pre-tool-use hook. This mechanism

00:05:20.920 --> 00:05:23.379
actually intercepts a command before Claude even

00:05:23.379 --> 00:05:26.000
attempts to execute it. So you can use it to

00:05:26.000 --> 00:05:28.779
validate a destructive command or enforce strict

00:05:28.779 --> 00:05:31.420
security rules before the AI touches your active

00:05:31.420 --> 00:05:33.899
files? Precisely. And then on the flip side,

00:05:34.019 --> 00:05:36.139
you have the post-tool-use hook, which reacts

00:05:36.139 --> 00:05:38.699
immediately after a tool finishes its execution.

00:05:39.129 --> 00:05:41.329
I have to play devil's advocate here though.

00:05:41.910 --> 00:05:45.069
Why would an average developer need 30 different

00:05:45.069 --> 00:05:47.850
hook events? It sounds like you need a PhD in

00:05:47.850 --> 00:05:50.730
prompt engineering just to configure your workspace.

00:05:50.949 --> 00:05:53.339
I totally get that reaction. It sounds like absolute

00:05:53.339 --> 00:05:55.720
overkill at first glance. But think about the

00:05:55.720 --> 00:05:58.600
practical daily application of that power. OK,

00:05:58.699 --> 00:06:00.579
give me an example. Well, you could set a simple

00:06:00.579 --> 00:06:03.220
post-tool-use hook to automatically format your

00:06:03.220 --> 00:06:06.579
code through Prettier or ESLint. That means the

00:06:06.579 --> 00:06:09.060
formatting happens instantly and invisibly after

00:06:09.060 --> 00:06:12.310
every single file edit the AI makes. Oh, wow.

00:06:12.370 --> 00:06:14.269
So you don't even have to think about it. I still

00:06:14.269 --> 00:06:17.149
wrestle with prompt drift myself. So having a

00:06:17.149 --> 00:06:19.829
hook automatically pull the AI back on track

00:06:19.829 --> 00:06:22.149
sounds incredible. It really is. It just keeps

00:06:22.149 --> 00:06:25.689
the agent hyper focused on the actual goal. You

00:06:25.689 --> 00:06:28.850
can literally force Claude to request an entirely

00:06:28.850 --> 00:06:31.959
separate test suite from a subagent, preventing

00:06:31.959 --> 00:06:35.519
it from trying to haphazardly write its own tests

00:06:35.519 --> 00:06:38.199
in the same breath as the feature code. Exactly.

00:06:38.720 --> 00:06:41.199
And speaking of subagents, both of these tools

00:06:41.199 --> 00:06:44.180
support them, but Claude Code ships with three

00:06:44.180 --> 00:06:47.180
distinct built-in types right out of the box.

00:06:47.480 --> 00:06:49.759
Yeah, Anthropic gives you the Explore Agent,

00:06:50.040 --> 00:06:52.540
the Plan Agent, and the General Purpose Agent.

00:06:52.720 --> 00:06:54.939
Right, and the Explore Agent is designed to just

00:06:54.939 --> 00:06:57.680
quietly read and analyze your entire massive

00:06:57.680 --> 00:07:00.250
code base without altering anything. While the

00:07:00.250 --> 00:07:02.629
Plan Agent gathers all that deep context to present

00:07:02.629 --> 00:07:05.310
a highly structured architectural strategy. And

00:07:05.310 --> 00:07:07.170
then the General Purpose Agent steps in to handle

00:07:07.170 --> 00:07:09.790
the actual tasks requiring a mix of both reading

00:07:09.790 --> 00:07:13.129
and writing. And the cool part is Claude intelligently

00:07:13.129 --> 00:07:15.189
automatically delegates between these three.

00:07:15.430 --> 00:07:17.430
So you don't have to manually micromanage the

00:07:17.430 --> 00:07:20.730
handoffs. Not at all. Beyond the subagents, Claude

00:07:20.730 --> 00:07:23.569
Code features some incredibly powerful slash

00:07:23.569 --> 00:07:27.540
commands. The standout is slash ultra plan. Right.

00:07:27.740 --> 00:07:29.740
When you trigger this, it doesn't just process

00:07:29.740 --> 00:07:32.740
locally, it actually sends the entire deep planning

00:07:32.740 --> 00:07:35.439
stage up to a heavy cloud session. It builds

00:07:35.439 --> 00:07:37.860
a massive structural map and presents it to you

00:07:37.860 --> 00:07:40.759
in a browser UI, allowing you to review the architecture

00:07:40.759 --> 00:07:43.439
before committing. Yeah. And there is also the

00:07:43.439 --> 00:07:46.759
slash ultra review command. This runs a deep

00:07:46.759 --> 00:07:49.759
multi -agent code review across your entire project,

00:07:50.160 --> 00:07:52.980
returning incredibly detailed security and structural

00:07:52.980 --> 00:07:55.420
findings. It is worth noting that Pro and Max

00:07:55.420 --> 00:07:58.160
users only get free runs of this before it becomes

00:07:58.160 --> 00:08:00.899
a billed feature, because it is so computationally

00:08:00.899 --> 00:08:03.480
expensive. Very true. We also have to highlight

00:08:03.480 --> 00:08:06.319
the slash loop command. This is fascinating.

00:08:06.360 --> 00:08:08.399
It runs recurring prompts on a set automated

00:08:08.399 --> 00:08:10.699
schedule. It essentially keeps Claude Code in

00:08:10.699 --> 00:08:12.899
a state of continuous maintenance mode. Yeah,

00:08:12.899 --> 00:08:15.139
it just sits in the background, handling lingering

00:08:15.139 --> 00:08:18.699
PR comments, resolving frustrating merge conflicts,

00:08:18.959 --> 00:08:20.879
and cleaning up technical debt while you sleep.

00:08:21.199 --> 00:08:24.439
It's essentially an always-on, tireless developer

00:08:24.439 --> 00:08:27.290
assistant. Claude also recognizes that developers

00:08:27.290 --> 00:08:29.569
aren't always sitting at their keyboards. It

00:08:29.569 --> 00:08:31.810
securely connects to your mobile device via Telegram,

00:08:31.970 --> 00:08:34.899
Discord, or iMessage. Oh, that's wild. Yeah,

00:08:34.899 --> 00:08:37.220
you can literally just text your agent from your

00:08:37.220 --> 00:08:39.539
phone to check on a build's progress while you're

00:08:39.539 --> 00:08:41.299
standing in line for coffee. It also provides

00:08:41.299 --> 00:08:45.200
an official agent SDK for both Python and TypeScript,

00:08:45.559 --> 00:08:47.580
which is a massive advantage for teams looking

00:08:47.580 --> 00:08:50.539
to build entirely custom internal systems. And

00:08:50.539 --> 00:08:53.419
for massive enterprise users, it integrates securely

00:08:53.419 --> 00:08:57.039
with Bedrock, Vertex AI, and Microsoft Foundry.

00:08:57.159 --> 00:08:59.740
Hooks give you surgical control to automate tasks

00:08:59.740 --> 00:09:03.179
like formatting code instantly. Precisely. It

00:09:03.179 --> 00:09:05.720
fundamentally changes how you trust the AI to

00:09:05.720 --> 00:09:08.539
run autonomously. Now, Claude's obsessive control

00:09:08.539 --> 00:09:11.019
over your workflow is great for planning, but

00:09:11.019 --> 00:09:12.779
sometimes you don't want a planner. Sometimes

00:09:12.779 --> 00:09:14.919
you just want to move as fast as humanly possible.

00:09:15.039 --> 00:09:18.320
Which brings us to OpenAI's philosophy with Codex.

00:09:18.480 --> 00:09:21.039
Right. It is built to move fast without breaking

00:09:21.039 --> 00:09:24.240
your existing work. Codex is fundamentally engineered

00:09:24.240 --> 00:09:28.559
for rapid, fearless execution. And the primary

00:09:28.559 --> 00:09:31.559
way to achieve this incredible speed safely is

00:09:31.559 --> 00:09:33.860
through highly isolated development environments.

00:09:34.399 --> 00:09:37.419
This brings us to native Git worktrees. And

00:09:37.419 --> 00:09:40.120
just to clarify the jargon here, Git worktrees

00:09:40.120 --> 00:09:42.500
are separate sandbox copies of your code, so

00:09:42.500 --> 00:09:45.460
changes never collide. Right. Now Claude can

00:09:45.460 --> 00:09:47.799
technically utilize worktrees too, if you configure

00:09:47.799 --> 00:09:50.940
it manually, but Codex makes this feature entirely

00:09:50.940 --> 00:09:53.789
native and frictionless. It is seamlessly integrated

00:09:53.789 --> 00:09:56.309
right into the core building process. And this

00:09:56.309 --> 00:09:58.850
matters so much when you are trying to run multiple

00:09:58.850 --> 00:10:01.409
complex tasks at the exact same time. Right,

00:10:01.409 --> 00:10:04.149
because each separate task runs in its own completely

00:10:04.149 --> 00:10:07.409
isolated copy of the repository. Exactly. Your

00:10:07.409 --> 00:10:09.789
parallel experimental work never interferes with

00:10:09.789 --> 00:10:12.269
your main stable code base. And if an experiment

00:10:12.269 --> 00:10:14.769
fails entirely, you just discard that specific

00:10:14.769 --> 00:10:16.750
individual worktree. Yeah, you haven't touched

00:10:16.750 --> 00:10:19.200
anything else in your active project. It is an

00:10:19.200 --> 00:10:21.480
incredibly safe way to move fast and break things

00:10:21.480 --> 00:10:23.379
without actually breaking anything that matters.

00:10:23.759 --> 00:10:26.399
Codex also features a highly functional built-in

00:10:26.399 --> 00:10:29.360
desktop browser. After the agent builds a

00:10:29.360 --> 00:10:31.659
new web component, you can view the rendered

00:10:31.659 --> 00:10:34.399
result right there inside the agent interface.

00:10:34.659 --> 00:10:37.179
You can leave visual, pinpoint comments without

00:10:37.179 --> 00:10:39.639
ever switching windows or opening Chrome. It

00:10:39.639 --> 00:10:43.000
keeps the entire build and review loop cleanly

00:10:43.000 --> 00:10:46.419
inside one single unified environment. The computer

00:10:46.419 --> 00:10:49.139
use feature is also notably polished here. You

00:10:49.139 --> 00:10:51.740
can literally ask Codex to physically test a

00:10:51.740 --> 00:10:53.860
local application for you. It actually opens

00:10:53.860 --> 00:10:56.019
the application on your machine. Yeah, it actively

00:10:56.019 --> 00:10:58.740
clicks through the UI elements, finds visual

00:10:58.740 --> 00:11:01.480
bugs, and returns highly structured, detailed

00:11:01.480 --> 00:11:03.659
triage reports. Those reports include the bug

00:11:03.659 --> 00:11:06.019
severity, the expected behavior, and the precise

00:11:06.019 --> 00:11:08.539
steps required to reproduce the issue. It acts

00:11:08.539 --> 00:11:11.340
as an incredibly thorough automated QA process.

00:11:11.720 --> 00:11:14.100
Codex also connects directly and cleanly to your

00:11:14.100 --> 00:11:17.019
GitHub repositories. You simply tag @codex

00:11:17.019 --> 00:11:18.759
in a pull request comment. And it immediately

00:11:18.759 --> 00:11:21.379
spins up a cloud sandbox task right from that

00:11:21.379 --> 00:11:23.759
comment thread, making triggering agent work completely

00:11:23.759 --> 00:11:26.620
frictionless for distributed teams. Both tools,

00:11:26.620 --> 00:11:29.000
it should be mentioned, also now support the

00:11:29.000 --> 00:11:31.100
slash goal command. Right, where you define a

00:11:31.100 --> 00:11:34.240
macro goal with a very clear hard stopping condition

00:11:34.240 --> 00:11:36.860
and the agent just keeps iterating until that

00:11:36.860 --> 00:11:39.679
exact condition is definitively met. But another

00:11:39.679 --> 00:11:42.679
massive, unique advantage for Codex's execution

00:11:42.679 --> 00:11:47.070
speed is visual assets. It has direct native

00:11:47.070 --> 00:11:50.570
integration with GPT Image 2 right inside the

00:11:50.570 --> 00:11:52.809
workflow. So it generates images as part of its

00:11:52.809 --> 00:11:54.769
primary execution. Yeah, if you need a quick

00:11:54.769 --> 00:11:57.950
product mock-up for a UI layout or a simple

00:11:57.950 --> 00:12:00.950
placeholder game asset, Codex handles it natively

00:12:00.950 --> 00:12:02.909
without skipping a beat. Whereas Claude currently

00:12:02.909 --> 00:12:05.590
requires an entirely external third-party setup

00:12:05.590 --> 00:12:08.690
to generate any kind of visual assets. So why

00:12:08.690 --> 00:12:10.970
is making worktrees native such a big deal for

00:12:10.970 --> 00:12:13.500
execution? Because it completely removes the

00:12:13.500 --> 00:12:15.840
friction of branching and stashing. It allows

00:12:15.840 --> 00:12:18.279
the agent to build fearlessly and rapidly by

00:12:18.279 --> 00:12:20.379
running multiple development threads safely at

00:12:20.379 --> 00:12:22.659
the exact same time. So they let the agent build

00:12:22.659 --> 00:12:24.480
parallel features without breaking your main

00:12:24.480 --> 00:12:27.360
code. Exactly. It completely neutralizes the

00:12:27.360 --> 00:12:29.940
paralyzing fear of catastrophic code errors.

00:12:30.039 --> 00:12:31.700
I think this is a perfect time to take a quick

00:12:31.700 --> 00:12:34.080
pause. Today's deep dive is supported by our

00:12:34.080 --> 00:12:36.519
friends at TechMinds. If you're building the

00:12:36.519 --> 00:12:38.639
next generation of software, you know keeping

00:12:38.639 --> 00:12:41.809
your team aligned is harder than ever. TechMinds

00:12:41.809 --> 00:12:44.110
provides the infrastructure you need to seamlessly

00:12:44.110 --> 00:12:47.169
integrate AI tools into your daily scrums. Their

00:12:47.169 --> 00:12:49.669
platform gives you full visibility into agent

00:12:49.669 --> 00:12:52.929
workloads, resource allocation, and project timelines.

00:12:53.450 --> 00:12:56.029
You can try it free for 30 days by visiting their

00:12:56.029 --> 00:12:58.970
website. Alright, welcome back. So, theories

00:12:58.970 --> 00:13:00.870
and distinct philosophies are always great to

00:13:00.870 --> 00:13:04.330
discuss. But I love looking at hard data. Me

00:13:04.330 --> 00:13:07.409
too. How do these tools actually perform when

00:13:07.409 --> 00:13:09.909
they are tasked with doing the exact same job

00:13:09.909 --> 00:13:12.259
under pressure? Well, the author of the newsletter

00:13:12.259 --> 00:13:15.039
ran a truly fantastic, highly structured real

00:13:15.039 --> 00:13:18.080
-world test. They tasked both agents with creating

00:13:18.080 --> 00:13:21.899
a highly branded, professional research PDF analyzing

00:13:21.899 --> 00:13:24.080
small-to-medium business automation tools.

00:13:24.340 --> 00:13:26.120
And crucially, the agents had to use active web

00:13:26.120 --> 00:13:28.480
search to gather the raw market data, synthesize

00:13:28.480 --> 00:13:30.799
it, and format it. Right. Let's look at the precise

00:13:30.799 --> 00:13:33.419
numbers for Codex first. It was running on the

00:13:33.419 --> 00:13:37.769
GPT-5.5 model set to high performance. And Codex

00:13:37.769 --> 00:13:39.769
sprinted through the complex task, finishing

00:13:39.769 --> 00:13:42.210
in exactly eight minutes and one second. Along

00:13:42.210 --> 00:13:45.029
the way, it consumed approximately 2.8 million

00:13:45.029 --> 00:13:47.990
tokens to process and generate the data. It successfully

00:13:47.990 --> 00:13:50.649
produced a very clean blue-and-gold structured

00:13:50.649 --> 00:13:54.289
cover page. It even generated a custom SMB mark

00:13:54.289 --> 00:13:56.649
natively, since it couldn't locate the original

00:13:56.649 --> 00:13:59.379
requested logo online. Yeah, and it proved to

00:13:59.379 --> 00:14:02.799
be fantastic at structuring raw, messy data into

00:14:02.799 --> 00:14:05.259
tight, readable tables. Now let's pivot and look

00:14:05.259 --> 00:14:07.740
at the Claude results. It was running on the massive

00:14:07.740 --> 00:14:10.940
Opus 4.7 model, also set to high performance.

00:14:11.340 --> 00:14:13.759
Claude took slightly longer, clocking in at 8 minutes

00:14:13.759 --> 00:14:16.269
and 15 seconds. But the token consumption is

00:14:16.269 --> 00:14:19.289
where things get wild. Yeah. It consumed a massive

00:14:19.289 --> 00:14:22.129
4.7 million tokens for the exact same task.

00:14:22.309 --> 00:14:24.610
However, it actually successfully searched for

00:14:24.610 --> 00:14:27.789
and placed the correct AI Fire logo. It also

00:14:27.789 --> 00:14:31.129
created a beautiful, highly nuanced dark orange

00:14:31.129 --> 00:14:33.929
gradient cover. It displayed much stronger overall

00:14:33.929 --> 00:14:36.350
brand fidelity throughout the document. And instead

00:14:36.350 --> 00:14:39.429
of just raw tables, it wrote the entire report

00:14:39.429 --> 00:14:42.919
in a compelling, flowing narrative style. But

00:14:42.919 --> 00:14:45.480
we have to talk seriously about token efficiency

00:14:45.480 --> 00:14:47.480
and the underlying economics here. Oh, absolutely.

00:14:47.820 --> 00:14:50.000
Because tokens are essentially the agent's battery

00:14:50.000 --> 00:14:52.399
life for the day. If you drain them too fast,

00:14:52.679 --> 00:14:54.659
you're done working. That is the perfect way

00:14:54.659 --> 00:14:58.039
to visualize it. Codex usage is completely included

00:14:58.039 --> 00:15:01.259
in standard ChatGPT plans. That covers the Free,

00:15:01.450 --> 00:15:05.070
Plus, and Pro tiers without any convoluted extra

00:15:05.070 --> 00:15:07.470
setup. Claude, on the other hand, requires a

00:15:07.470 --> 00:15:10.990
Pro subscription at $20, or the Max 5X tier at

00:15:10.990 --> 00:15:15.549
$100, or the Max 20X tier at $200 a month. And

00:15:15.549 --> 00:15:17.649
when you are burning tokens that fast, those

00:15:17.649 --> 00:15:19.830
subscription costs add up incredibly quickly

00:15:19.830 --> 00:15:22.519
for a team. To put the raw output into perspective,

00:15:23.000 --> 00:15:26.320
Codex generally outputs roughly 16,000 to 20,000

00:15:26.320 --> 00:15:28.980
tokens per standard task. Yeah, while Claude

00:15:28.980 --> 00:15:31.600
is outputting roughly 80,000 to 84,000 tokens

00:15:31.600 --> 00:15:34.139
per task. Claude hits those strict platform usage

00:15:34.139 --> 00:15:36.659
caps so much faster because of that massive context

00:15:36.659 --> 00:15:40.480
window. Whoa, imagine chewing through 4.7 million

00:15:40.480 --> 00:15:43.259
tokens in just eight minutes. It's crazy. The

00:15:43.259 --> 00:15:45.639
scale is staggering. It really is mind-bending.

00:15:45.860 --> 00:15:48.340
It shows exactly why Opus is such an expensive

00:15:48.340 --> 00:15:51.639
model to run continuously. It is constantly rereading

00:15:51.639 --> 00:15:55.000
its own massive system prompt to maintain that

00:15:55.000 --> 00:15:57.419
architectural perfection. We also need to note

00:15:57.419 --> 00:15:59.860
a crucial platform policy difference regarding

00:15:59.860 --> 00:16:03.679
token usage. OpenAI officially supports and allows

00:16:03.679 --> 00:16:07.059
third-party API wrappers like OpenClaw or Hermes

00:16:07.059 --> 00:16:10.299
to manage these workflows. Right, whereas Anthropic

00:16:10.299 --> 00:16:14.820
requires explicit, strict approval for any third-party

00:16:14.820 --> 00:16:17.340
developers trying to tap into their agent

00:16:17.340 --> 00:16:20.000
workflows. Beyond the subscription price, why

00:16:20.000 --> 00:16:21.840
does the token difference actually matter to

00:16:21.840 --> 00:16:24.940
the user? It is entirely about your daily session

00:16:24.940 --> 00:16:27.559
longevity. Higher token output means you hit

00:16:27.559 --> 00:16:30.799
your strict daily rate limits rapidly. It completely

00:16:30.799 --> 00:16:33.100
stalls your workflow when you suddenly run out

00:16:33.100 --> 00:16:35.480
of computational battery life right before a

00:16:35.480 --> 00:16:37.659
deadline. Higher token usage drains your daily

00:16:37.659 --> 00:16:40.240
session limits much faster. Exactly. It physically

00:16:40.240 --> 00:16:42.580
forces you to stop working entirely, which is

00:16:42.580 --> 00:16:44.940
a massive friction point for professional developers

00:16:44.940 --> 00:16:47.200
trying to stay in a flow state. So we've seen

00:16:47.200 --> 00:16:49.840
the careful, obsessive architectural planner

00:16:49.840 --> 00:16:52.759
in Claude. We've seen the incredibly speedy,

00:16:52.980 --> 00:16:56.120
fearless executor in Codex. So much of the discussion

00:16:56.120 --> 00:16:58.399
online is about declaring a winner. But how do

00:16:58.399 --> 00:17:00.720
we actually apply this knowledge to our work

00:17:00.720 --> 00:17:03.980
tomorrow morning? The big idea here is incredibly

00:17:03.980 --> 00:17:06.980
liberating, honestly. Don't pledge blind loyalty

00:17:06.980 --> 00:17:10.579
to just one app or ecosystem. Code is ultimately

00:17:10.579 --> 00:17:13.440
highly portable text. Your projects are not locked

00:17:13.440 --> 00:17:16.500
into proprietary formats. Your projects are entirely

00:17:16.500 --> 00:17:19.599
completely portable. The real long-term value

00:17:19.599 --> 00:17:22.680
in this new era isn't the specific AI tool you

00:17:22.680 --> 00:17:25.160
subscribe to. No, not at all. The true value

00:17:25.160 --> 00:17:27.859
is the underlying systems and the flexible pipeline

00:17:27.859 --> 00:17:29.880
skills you build along the way. Yeah, you aren't

00:17:29.880 --> 00:17:32.480
permanently locked into one single agent just

00:17:32.480 --> 00:17:34.660
because you spent six months learning its quirks.

00:17:34.799 --> 00:17:37.180
You can and should move between them fluidly.

00:17:37.309 --> 00:17:39.829
This leads us directly to the optimal strategy

00:17:39.829 --> 00:17:42.450
for modern development. It's the dynamic hybrid

00:17:42.450 --> 00:17:45.529
workflow. Yes. You open up Claude Code for the

00:17:45.529 --> 00:17:47.789
heavy architectural lifting up front. You use

00:17:47.789 --> 00:17:50.490
its deep planning capabilities, its custom hooks,

00:17:50.569 --> 00:17:53.430
and its interactive front-end UI design to

00:17:53.430 --> 00:17:55.869
map out exactly what you want to build. Then,

00:17:55.950 --> 00:17:58.910
once the blueprint is rock solid, you seamlessly

00:17:58.910 --> 00:18:01.369
switch your project directory over to Codex.

00:18:01.430 --> 00:18:03.950
Right. You unleash Codex for pure execution,

00:18:04.410 --> 00:18:07.910
deep web research, generating structured PDFs,

00:18:07.910 --> 00:18:10.509
and rapid GitHub shipping. Moving fluidly between

00:18:10.509 --> 00:18:13.430
these powerful agents based entirely on the specific

00:18:13.430 --> 00:18:15.769
task in front of you is the optimal strategy.

00:18:16.069 --> 00:18:19.069
You literally get the absolute best of both unique

00:18:19.069 --> 00:18:22.049
worlds without compromising on speed or quality.

00:18:22.240 --> 00:18:24.460
So, the ultimate takeaway isn't picking a winner,

00:18:24.660 --> 00:18:27.319
but building a pipeline. Exactly. You don't need

00:18:27.319 --> 00:18:30.079
a singular software champion. You need a functioning,

00:18:30.500 --> 00:18:32.980
highly efficient, and flexible development pipeline

00:18:32.980 --> 00:18:35.680
that leverages the strengths of every tool available.

00:18:35.859 --> 00:18:38.339
Right. Use Claude to plan the building and Codex

00:18:38.339 --> 00:18:40.720
to swing the hammers. That's the exact mental

00:18:40.720 --> 00:18:43.180
model developers need to adopt right now. Keep

00:18:43.180 --> 00:18:46.019
your projects easily portable, use Markdown extensively,

00:18:46.380 --> 00:18:48.759
and keep a remarkably open mind as these tools

00:18:48.759 --> 00:18:52.180
continue to evolve at breakneck speed. Incredible

00:18:52.180 --> 00:18:54.799
breakdown of the current coding agent landscape.

00:18:55.220 --> 00:18:58.299
It really highlights exactly how fundamentally

00:18:58.299 --> 00:19:00.480
different these underlying philosophies are,

00:19:00.660 --> 00:19:02.460
even when they are trying to solve the exact

00:19:02.460 --> 00:19:05.180
same problems. It's a wildly exciting time to

00:19:05.180 --> 00:19:07.480
be building software. The capabilities are expanding

00:19:07.480 --> 00:19:09.920
every single week. I want to leave you with one

00:19:09.920 --> 00:19:13.069
final thought to mull over. OK. If we are already

00:19:13.069 --> 00:19:16.190
building hybrid workflows using Claude to meticulously

00:19:16.190 --> 00:19:19.150
plan the architecture and Codex to rapidly execute

00:19:19.150 --> 00:19:21.950
the code, what happens when we inevitably get

00:19:21.950 --> 00:19:25.349
an overarching AI agent whose only job is to

00:19:25.349 --> 00:19:27.369
automatically manage the workflow between these

00:19:27.369 --> 00:19:30.509
other agents? Oh, man, that is exactly when things

00:19:30.509 --> 00:19:32.750
are going to get truly wild. Try spinning up

00:19:32.750 --> 00:19:34.950
a simple local project today. Keep your files

00:19:34.950 --> 00:19:38.069
portable. Test out a custom hook or spin up a

00:19:38.069 --> 00:19:40.289
native worktree just to see how it feels. Yeah,

00:19:40.289 --> 00:19:42.690
just see how it actively changes your daily workflow

00:19:42.690 --> 00:19:44.650
and your relationship with the code base. Thank

00:19:44.650 --> 00:19:47.329
you for joining us for this deep dive. We appreciate

00:19:47.329 --> 00:19:49.450
you spending your valuable time with us today.

00:19:49.710 --> 00:19:50.210
Keep building.
