WEBVTT

00:00:00.000 --> 00:00:04.299
Klarna says its AI assistant now does the work of

00:00:04.299 --> 00:00:07.980
700 customer service agents. Yeah, it's a staggering shift, honestly.

00:00:08.119 --> 00:00:11.240
I mean, Duolingo just cut 10% of its translation

00:00:11.240 --> 00:00:13.880
team. For the exact same reason. Right, exactly.

00:00:14.320 --> 00:00:16.399
And those are just the big names, you know. Goldman

00:00:16.399 --> 00:00:19.199
Sachs estimates AI could replace 300 million

00:00:19.199 --> 00:00:21.719
jobs globally. It's not a scare tactic. It's

00:00:21.719 --> 00:00:24.170
simply the current data. Absolutely. The people

00:00:24.170 --> 00:00:26.250
who survive this shift aren't necessarily the

00:00:26.250 --> 00:00:27.949
smartest ones. They're the ones who learn to

00:00:27.949 --> 00:00:30.129
work with AI. They figure it out before the rest

00:00:30.129 --> 00:00:32.609
of the industry catches up. Welcome to the Deep

00:00:32.609 --> 00:00:35.890
Dive. Today, our mission is intensely practical.

00:00:36.189 --> 00:00:39.229
We're unpacking a specific AI engineer roadmap.

00:00:39.750 --> 00:00:42.250
It's designed for you, even if you are starting

00:00:42.250 --> 00:00:44.850
from absolute zero. We are covering five distinct

00:00:44.850 --> 00:00:46.990
levels to future-proof your career. And the

00:00:46.990 --> 00:00:49.549
best part, you do not need a computer science

00:00:49.549 --> 00:00:51.969
degree for this. You don't need complex math

00:00:51.969 --> 00:00:54.200
theory either. You simply need to follow the

00:00:54.200 --> 00:00:57.520
path. Before we climb those five levels, we must

00:00:57.520 --> 00:00:59.579
define the game. What are we actually playing

00:00:59.579 --> 00:01:02.579
here? There is massive confusion out there. People

00:01:02.579 --> 00:01:05.120
constantly mix up data science and AI engineering.

00:01:05.519 --> 00:01:08.319
Oh, it happens all the time. That confusion sends

00:01:08.319 --> 00:01:11.120
a lot of people in the completely wrong direction.

00:01:12.099 --> 00:01:13.859
Here's a simple way to think about the difference.

00:01:14.340 --> 00:01:16.939
Let's use a house analogy. I like analogies.

00:01:17.120 --> 00:01:20.140
A software engineer builds the actual house.

00:01:20.459 --> 00:01:23.019
They pour the foundation. They build the walls,

00:01:23.200 --> 00:01:25.299
the roof, and the doors. They create the structural

00:01:25.299 --> 00:01:28.260
integrity. An AI engineer, though, makes that

00:01:28.260 --> 00:01:31.040
house smart. You add the sensors that turn on

00:01:31.040 --> 00:01:33.319
the lights. You add the voice assistant for the

00:01:33.319 --> 00:01:36.540
music. That is your specific layer. So you aren't

00:01:36.540 --> 00:01:39.450
building these algorithms from scratch. Instead,

00:01:39.489 --> 00:01:42.390
you take existing intelligence. You elegantly

00:01:42.390 --> 00:01:44.849
plug it into real-world products. You solve

00:01:44.849 --> 00:01:47.689
problems with practical tools. Exactly. You don't

00:01:47.689 --> 00:01:50.489
need to be a math genius inventing new algorithms.

00:01:50.670 --> 00:01:52.790
You just need to know how to pick the right tools.

00:01:53.049 --> 00:01:55.310
You make them work together seamlessly. It's

00:01:55.310 --> 00:01:57.510
like we are moving from building the engines

00:01:57.510 --> 00:01:59.849
to simply driving the cars. We are operating

00:01:59.849 --> 00:02:02.170
the machines now. We just need to know how to

00:02:02.170 --> 00:02:04.209
steer. That is the perfect way to look at it.

00:02:04.209 --> 00:02:06.849
You are the driver. The Frontier models, well,

00:02:06.870 --> 00:02:10.069
they're your engine. I still

00:02:10.069 --> 00:02:13.129
wrestle with prompt drift myself. The unpredictability

00:02:13.129 --> 00:02:15.550
of it all. It can be quite humbling. So we aren't

00:02:15.550 --> 00:02:17.729
building the brain, just the nervous system.

00:02:18.090 --> 00:02:20.870
Precisely. You're wiring the connections. You

00:02:20.870 --> 00:02:23.490
let the massive pre-trained brains do the heavy

00:02:23.490 --> 00:02:26.129
lifting. You just direct their energy exactly

00:02:26.129 --> 00:02:29.090
where it needs to go. Exactly. We connect existing

00:02:29.090 --> 00:02:32.370
intelligence to real-world applications. Now

00:02:32.370 --> 00:02:35.219
we understand the job itself. But you cannot

00:02:35.219 --> 00:02:37.620
build a skyscraper on sand. No, you definitely

00:02:37.620 --> 00:02:40.240
cannot. We must lay the groundwork first. We

00:02:40.240 --> 00:02:43.500
have to do this before touching shiny AI tools.

00:02:43.719 --> 00:02:46.180
We use the SALT framework. Let's break down the

00:02:46.180 --> 00:02:49.060
SALT foundation. The S stands for software fluency.

00:02:49.280 --> 00:02:51.639
Specifically, you need to learn Python. Python

00:02:51.639 --> 00:02:54.000
is the language of AI. Yeah. Almost everything

00:02:54.000 --> 00:02:56.319
you do in this field runs on Python. It's actually

00:02:56.319 --> 00:02:58.419
one of the easiest languages to pick up. But

00:02:58.419 --> 00:03:00.439
beginners often fall into a nasty trap. They

00:03:00.439 --> 00:03:03.080
fall into tutorial hell. Yes. You watch a

00:03:03.080 --> 00:03:05.099
three-hour coding video. You feel like you completely

00:03:05.099 --> 00:03:07.719
get it. Then you open a blank file. Your mind

00:03:07.719 --> 00:03:10.139
goes completely empty. The fix is incredibly

00:03:10.139 --> 00:03:13.060
simple, but people resist it. You must learn

00:03:13.060 --> 00:03:16.740
by doing. You can't just watch passively. A tool

00:03:16.740 --> 00:03:19.900
like CAUTI.Tech forces you to write code immediately.

00:03:20.199 --> 00:03:21.860
It makes the concepts actually stick in your

00:03:21.860 --> 00:03:24.180
brain. You need to focus on four core things.

00:03:24.759 --> 00:03:27.319
Variables are first. Think of variables as digital

00:03:27.319 --> 00:03:30.620
buckets. When that massive AI brain sends back

00:03:30.620 --> 00:03:33.120
an answer, your program needs a bucket to catch

00:03:33.120 --> 00:03:36.659
it. Next, you need lists and dictionaries. That's

00:03:36.659 --> 00:03:39.699
how you organize lots of data. If the AI returns

00:03:39.699 --> 00:03:41.900
50 different customer names, you need a structured

00:03:41.900 --> 00:03:43.860
way to hold them. Then you move to functions.

00:03:44.199 --> 00:03:46.599
This is how you write reusable code. You package

00:03:46.599 --> 00:03:48.939
a set of instructions together. You avoid repeating

00:03:48.939 --> 00:03:51.599
yourself. Finally, you have loops. Loops make

00:03:51.599 --> 00:03:54.259
the computer repeat a task automatically. If

00:03:54.259 --> 00:03:56.960
you have 100 emails, a loop processes them one

00:03:56.960 --> 00:03:59.439
by one. You write the code once. That covers

00:03:59.439 --> 00:04:02.620
the software fluency part. The A in SALT is API

00:04:02.620 --> 00:04:05.060
architecture. Once you know Python, you need

00:04:05.060 --> 00:04:07.520
a way to talk to the AI. APIs are how models

00:04:07.520 --> 00:04:10.340
receive your requests. The easiest way to picture

00:04:10.340 --> 00:04:12.860
it is a waiter at a restaurant. You are the customer

00:04:12.860 --> 00:04:15.000
sitting at the table. And the kitchen in the

00:04:15.000 --> 00:04:17.339
back is the AI model. You don't walk into the

00:04:17.339 --> 00:04:19.439
kitchen yourself. You tell the waiter what you

00:04:19.439 --> 00:04:21.910
want. They relay your order to the kitchen, then

00:04:21.910 --> 00:04:24.170
they bring your cooked food back to your table.
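
The waiter analogy maps onto a short Python sketch. This is a hedged illustration only: the endpoint URL, payload shape, and model name are invented for the example, and the actual network call is tucked inside a function so nothing fires on import. It also exercises the level-one fundamentals: variables, dictionaries, functions, and a loop.

```python
# Sketch only: the URL and payload shape below are invented for illustration.
API_URL = "https://example.com/v1/chat"  # hypothetical endpoint

def build_order(question):
    """Package the customer's request, like telling the waiter your order."""
    return {"model": "some-model", "messages": [{"role": "user", "content": question}]}

def ask_model(question):
    """The 'waiter': carries the order to the kitchen (the model) and back."""
    import requests  # the requests library plays the waiter
    response = requests.post(API_URL, json=build_order(question))
    return response.json()  # the cooked food, back at your table

# A list of questions and a loop: the same code serves every customer.
questions = ["What is your return policy?", "Do you ship overseas?"]
orders = [build_order(q) for q in questions]
```

Swap in a real provider's URL and authentication, and the same shape carries over.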

00:04:24.670 --> 00:04:26.889
APIs work exactly the same way. In practice,

00:04:26.889 --> 00:04:29.430
you use Python's requests library. It acts as

00:04:29.430 --> 00:04:31.930
your waiter. It sends those digital messages

00:04:31.930 --> 00:04:34.810
for you behind the scenes. Now, for the L, that

00:04:34.810 --> 00:04:38.110
stands for life cycle and version control. Specifically,

00:04:38.230 --> 00:04:40.810
we're talking about a tool called Git. Every

00:04:40.810 --> 00:04:43.850
professional project involves mistakes. Code

00:04:43.850 --> 00:04:46.430
breaks all the time. Features get deleted by

00:04:46.430 --> 00:04:49.639
accident. A tiny typo can crash an entire server.

00:04:49.819 --> 00:04:52.360
Panic sets in. Git is the safety net. It lets

00:04:52.360 --> 00:04:54.980
you roll back to any previous version. It tracks

00:04:54.980 --> 00:04:57.660
every single change you make over time. It is

00:04:57.660 --> 00:05:00.139
basically an undo button for your entire code

00:05:00.139 --> 00:05:02.339
base. GitHub is the website where those saves

00:05:02.339 --> 00:05:05.160
live online. Every single tech company uses it

00:05:05.160 --> 00:05:08.139
today. Finally, the T is the tech stack. This

00:05:08.139 --> 00:05:10.560
is building end-to-end. You need a database,

00:05:10.800 --> 00:05:13.519
like MongoDB, to store user information permanently.

00:05:13.800 --> 00:05:16.980
You need a backend, like Flask or FastAPI. That

00:05:16.980 --> 00:05:19.879
handles the unseen logic. And you need a frontend,

00:05:20.120 --> 00:05:22.839
like React, for the visual user interface. Can't

00:05:22.839 --> 00:05:25.019
we just skip to the AI stuff and learn Python

00:05:25.019 --> 00:05:27.389
later? You really can't. Without Python, you

00:05:27.389 --> 00:05:29.250
have no way to communicate with the models. You

00:05:29.250 --> 00:05:31.629
would just be staring at APIs with no way to

00:05:31.629 --> 00:05:34.389
connect them to anything. No. Python is the absolute

00:05:34.389 --> 00:05:37.629
necessary foundation for everything else. Now

00:05:37.629 --> 00:05:39.990
that the foundation is poured, we move to level

00:05:39.990 --> 00:05:43.750
two. It is time to invite the AI inside. This

00:05:43.750 --> 00:05:46.649
level is controlled intelligence. You are finally

00:05:46.649 --> 00:05:49.350
using the best models available. Let's start

00:05:49.350 --> 00:05:52.389
with the OpenAI API. You have probably used the

00:05:52.389 --> 00:05:55.500
ChatGPT website. But as an engineer, you interact

00:05:55.500 --> 00:05:57.959
with it completely differently. You write code

00:05:57.959 --> 00:06:01.230
to automate it. The difference is massive. Instead

00:06:01.230 --> 00:06:04.029
of manually typing into a chat box, your script

00:06:04.029 --> 00:06:07.370
runs things automatically. It operates at superhuman

00:06:07.370 --> 00:06:10.870
speeds. Think about a script processing 100 customer

00:06:10.870 --> 00:06:14.509
emails. It flags complaints instantly. It extracts

00:06:14.509 --> 00:06:16.810
the emotional sentiment in seconds. It sorts

00:06:16.810 --> 00:06:19.649
them into positive, neutral, or negative categories.

00:06:20.009 --> 00:06:22.329
You do not touch a single thing manually. Your

00:06:22.329 --> 00:06:24.870
script handles the entire inbox. The prompt you

00:06:24.870 --> 00:06:27.029
write in the code is so simple. You tell the

00:06:27.029 --> 00:06:29.110
model to act as a helpful assistant. You ask

00:06:29.110 --> 00:06:31.180
it for the sentiment. You ask it for the main

00:06:31.180 --> 00:06:34.319
problem in one sentence. You wrap that in a Python

00:06:34.319 --> 00:06:37.620
loop. It processes an entire inbox instantly.
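
That triage loop can be sketched as follows. Treat it as a hedged sketch: the model name is an assumption, the prompt wording is paraphrased from the episode, and the API call is isolated in one function so the sorting logic stands on its own.

```python
# Hedged sketch of the email-triage loop. The model name and prompt wording
# are assumptions; the OpenAI call is isolated so the sorting logic stands alone.

PROMPT = ("You are a helpful assistant. Reply with one word, the sentiment "
          "of this email: positive, neutral, or negative.\n\n{email}")

def classify(email, client):
    """Ask the model for a one-word sentiment label (needs an OpenAI client)."""
    reply = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any chat-capable model works here
        messages=[{"role": "user", "content": PROMPT.format(email=email)}],
    )
    return reply.choices[0].message.content.strip().lower()

def sort_into_buckets(labels):
    """Group one-word labels into the three categories from the episode."""
    buckets = {"positive": [], "neutral": [], "negative": []}
    for i, label in enumerate(labels):
        buckets.get(label, buckets["neutral"]).append(i)  # unknowns go to neutral
    return buckets

# In production you would loop: labels = [classify(e, client) for e in inbox]
```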

00:06:38.000 --> 00:06:40.639
Once you understand how to automate OpenAI, you

00:06:40.639 --> 00:06:44.220
explore Hugging Face. Hugging Face is an incredible

00:06:44.220 --> 00:06:46.980
resource. Think of it as the app store for AI

00:06:46.980 --> 00:06:49.540
models. The best part is that most of it is completely

00:06:49.540 --> 00:06:52.639
free. You don't have to rely entirely on expensive,

00:06:52.819 --> 00:06:55.379
closed models. It handles so many different tasks.

00:06:55.680 --> 00:06:57.699
It handles language translation. It gives you

00:06:57.699 --> 00:06:59.759
image descriptions. It handles sound transcription.

00:07:00.060 --> 00:07:02.360
You just pick pre-trained models off the shelf.
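
Picking a model off the shelf looks roughly like this with the transformers library. A sketch under assumptions: the default sentiment model is whatever Hugging Face ships that day, so the download lines are left as comments and only the small result-handling helper is concrete.

```python
# Hedged sketch: only the helper below is concrete; the pipeline lines are
# commented out because they download a pre-trained model on first use.

def top_label(results):
    """Pick the highest-scoring label from a pipeline's list of dicts."""
    best = max(results, key=lambda r: r["score"])
    return best["label"]

# To run it for real (pip install transformers; model downloads on first call):
#   from transformers import pipeline
#   classifier = pipeline("sentiment-analysis")  # an off-the-shelf model
#   print(top_label(classifier("This roadmap is genuinely useful.")))
```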

00:07:02.540 --> 00:07:04.740
You run them locally. If you want a structured

00:07:04.740 --> 00:07:06.899
way to practice all this, look at DataCamp. They

00:07:06.899 --> 00:07:09.519
have an associate AI engineer track. It covers

00:07:09.519 --> 00:07:12.199
exactly what you need to master these tools.

00:07:12.839 --> 00:07:15.639
Is Hugging Face just a repository for text generation

00:07:15.639 --> 00:07:18.420
models? Not at all. Text is just the beginning.

00:07:18.579 --> 00:07:21.000
It has thousands of models for analyzing images,

00:07:21.420 --> 00:07:23.680
processing audio files, and translating spoken

00:07:23.680 --> 00:07:26.060
languages in real time.

00:07:28.300 --> 00:07:31.459
Generic models are great. But in 2026,

00:07:31.860 --> 00:07:34.620
companies don't want a generic AI. They want

00:07:34.620 --> 00:07:37.899
an AI that deeply knows their specific data.

00:07:38.199 --> 00:07:41.319
That brings us to level three. Intelligent systems

00:07:41.319 --> 00:07:44.920
and RAG. This is the heart of the roadmap. This

00:07:44.920 --> 00:07:46.800
is what companies are actually paying big money

00:07:46.800 --> 00:07:50.779
for. Let's define this jargon. What is RAG? Giving

00:07:50.779 --> 00:07:53.800
an AI your custom data to read before it answers.

00:07:53.879 --> 00:07:56.759
That is the core idea. Imagine you have a 500

00:07:56.759 --> 00:07:59.899
page policy manual for a company. A standard

00:07:59.899 --> 00:08:03.199
model, well, it does not know what is inside

00:08:03.199 --> 00:08:05.360
that specific manual. If a customer asks about

00:08:05.360 --> 00:08:08.120
a niche return policy, the model might just guess.

00:08:08.459 --> 00:08:11.740
It hallucinates. RAG fixes this problem entirely.

00:08:11.959 --> 00:08:13.879
There are four clear steps here to make it work.

00:08:13.959 --> 00:08:16.519
Step one, you break that massive manual into

00:08:16.519 --> 00:08:18.819
small, manageable chunks. Step two, you turn

00:08:18.819 --> 00:08:20.899
those pieces into embeddings. Embeddings are

00:08:20.899 --> 00:08:22.980
just coordinates. It mathematically turns words

00:08:22.980 --> 00:08:25.459
into numbers. Exactly. The AI understands those

00:08:25.459 --> 00:08:28.279
numbers deeply. If two concepts mean similar

00:08:28.279 --> 00:08:30.259
things, their numerical coordinates sit close

00:08:30.259 --> 00:08:32.399
together. Step three, you store those numbers

00:08:32.399 --> 00:08:34.840
in a vector database. Tools like Pinecone or

00:08:34.840 --> 00:08:36.860
Weaviate are built specifically to hold these

00:08:36.860 --> 00:08:40.000
coordinates. Step four is the magic. A user asks

00:08:40.000 --> 00:08:42.860
a question. Your system rapidly searches the

00:08:42.860 --> 00:08:45.379
database. It finds the most relevant pieces of

00:08:45.379 --> 00:08:48.039
information. It feeds those specific pieces to

00:08:48.039 --> 00:08:50.779
the AI. The AI reads them. Then it answers the

00:08:50.779 --> 00:08:53.000
user. It sounds like an absolute expert on your

00:08:53.000 --> 00:08:54.879
company. Here is a prompt example you would write

00:08:54.879 --> 00:08:58.240
in the code. You tell the AI, use this context,

00:08:58.299 --> 00:09:00.580
don't make things up. You pass it the chunk containing

00:09:00.580 --> 00:09:03.820
a 30 -day return policy. The user asks, can I

00:09:03.820 --> 00:09:06.980
return this laptop after 40 days? The AI confidently

00:09:06.980 --> 00:09:09.960
says no. It bases that answer purely on the reference

00:09:09.960 --> 00:09:12.299
notes you just handed it. It doesn't guess. It

00:09:12.299 --> 00:09:15.159
is brilliant. We also have LangGraph and MCP

00:09:15.159 --> 00:09:17.840
in this level. LangGraph allows the AI to think

00:09:17.840 --> 00:09:21.139
in logical steps. It is fascinating. The AI can

00:09:21.139 --> 00:09:23.750
decide to search the web, check a database, and

00:09:23.750 --> 00:09:25.549
then write a summary. It follows a workflow.
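
The four RAG steps described a moment ago can be sketched end to end in plain Python. This toy stands in for the real thing: word overlap plays the role of embeddings, a plain list plays the vector database, and the policy text is invented for the example.

```python
# Toy version of the four steps. Word overlap stands in for embeddings and a
# plain list stands in for a vector database like Pinecone or Weaviate.
import string

def chunk(manual, size=12):
    """Step one: break the big manual into small chunks of words."""
    words = manual.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def normalize(text):
    """Lowercase and strip punctuation so 'days?' matches 'days'."""
    return set(text.lower().translate(str.maketrans("", "", string.punctuation)).split())

def similarity(a, b):
    """Steps two and three, crudely: score how much two texts overlap."""
    wa, wb = normalize(a), normalize(b)
    return len(wa & wb) / max(len(wa | wb), 1)

def retrieve(question, chunks):
    """Step four: fetch the most relevant chunk for the question."""
    return max(chunks, key=lambda c: similarity(question, c))

def grounded_prompt(question, context):
    """Slide the cheat sheet across the desk before the model speaks."""
    return f"Use this context, don't make things up.\nContext: {context}\nQuestion: {question}"

manual = ("Returns are accepted within 30 days of purchase. "
          "Shipping is free on orders over 50 dollars. "
          "Warranty claims require the original receipt.")
best = retrieve("Can I return this laptop after 40 days?", chunk(manual, size=8))
```

Real systems swap the overlap score for learned embeddings, but the shape of the pipeline is the same.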

00:09:25.889 --> 00:09:29.049
MCP stands for Model Context Protocol. It provides

00:09:29.049 --> 00:09:31.809
strict rules for external tools. It tells the

00:09:31.809 --> 00:09:34.190
AI exactly how to use a calculator safely. It

00:09:34.190 --> 00:09:35.610
creates boundaries. Yeah. You don't want the

00:09:35.610 --> 00:09:39.330
AI doing whatever it wants. MCP acts as the supervisor

00:09:39.330 --> 00:09:41.789
for the tools it accesses.

00:09:42.269 --> 00:09:45.429
Whoa. Imagine stacking Lego blocks of data, scaling

00:09:45.429 --> 00:09:47.970
that up to a billion queries across an entire

00:09:47.970 --> 00:09:50.490
company's history. It's truly staggering. Does

00:09:50.490 --> 00:09:53.889
RAG physically alter the AI model's core training?

00:09:54.230 --> 00:09:56.309
It does not touch the core training at all. The

00:09:56.309 --> 00:09:58.950
underlying model remains exactly the same. You

00:09:58.950 --> 00:10:01.110
are simply sliding a cheat sheet across the desk

00:10:01.110 --> 00:10:03.769
before it speaks. We now have a brilliant

00:10:06.870 --> 00:10:10.129
customized AI running perfectly on our laptop.

00:10:10.529 --> 00:10:13.629
We arrive at level four. Putting apps online.

00:10:13.970 --> 00:10:16.950
How do you let 10,000 people use your app without

00:10:16.950 --> 00:10:19.889
it instantly crashing? That is a completely different

00:10:19.889 --> 00:10:22.370
engineering problem. We start with Docker. Every

00:10:22.370 --> 00:10:24.889
developer knows the pain of sharing code. You

00:10:24.889 --> 00:10:27.389
say, it works perfectly on my machine. But you

00:10:27.389 --> 00:10:29.679
send it away and it breaks immediately. The client

00:10:29.679 --> 00:10:32.039
has an older version of Python. The libraries

00:10:32.039 --> 00:10:35.139
clash. Docker fixes this massive headache. It

00:10:35.139 --> 00:10:37.580
wraps the app in a protective container. It packs

00:10:37.580 --> 00:10:39.940
everything the app needs to run inside that bubble.

00:10:40.000 --> 00:10:42.480
It is entirely self-sufficient. It runs identically

00:10:42.480 --> 00:10:44.559
anywhere in the world. It is completely consistent.
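
For a small Python app, the container recipe can be this short. Hedged sketch: the file names (app.py, requirements.txt) are assumptions about your project layout, not a standard.

```dockerfile
# Hypothetical layout: a Flask/FastAPI app in app.py with a requirements.txt.
FROM python:3.12-slim            # start from a small Python base image
WORKDIR /app                     # everything lives in /app inside the container
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .                         # pack the app itself into the bubble
CMD ["python", "app.py"]         # the one command the container runs
```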

00:10:44.759 --> 00:10:47.100
You eliminate the environmental variables that

00:10:47.100 --> 00:10:49.279
cause random crashes. Then we have cloud hosting.

00:10:49.620 --> 00:10:52.500
You cannot run a popular app from your bedroom

00:10:52.500 --> 00:10:55.639
laptop. You use AWS or Google Cloud. You put

00:10:55.639 --> 00:10:58.240
your containerized app on their massive, powerful

00:10:58.240 --> 00:11:01.179
servers. This is how you dynamically scale up

00:11:01.179 --> 00:11:03.879
during sudden traffic spikes. If a thousand people

00:11:03.879 --> 00:11:06.539
log on at once, the cloud handles the load gracefully.

00:11:06.759 --> 00:11:09.480
It spins up more resources automatically. But

00:11:09.480 --> 00:11:13.100
there is a catch. Every API call to an AI model

00:11:13.100 --> 00:11:16.360
costs actual money. That brings us to Redis Caching.

00:11:16.730 --> 00:11:19.710
This is the secret weapon of professional engineers.

00:11:19.970 --> 00:11:22.730
It is pure economics. Every time you ask the

00:11:22.730 --> 00:11:25.570
AI a question, a computer somewhere burns energy.

00:11:25.769 --> 00:11:28.610
You pay for that energy. Imagine 1000 people

00:11:28.610 --> 00:11:31.590
asking for the exact same return policy. Without

00:11:31.590 --> 00:11:34.750
caching, you pay for 1000 separate AI generations.
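
The economics are easy to demonstrate. In this sketch a plain Python dict stands in for Redis (the real thing adds expiry times and works across servers), and the model call is faked so a counter shows exactly what gets billed.

```python
# A dict stands in for Redis to show the economics; the model call is faked.
paid_calls = 0

def call_model(question):
    """Stand-in for a real (billed) AI request."""
    global paid_calls
    paid_calls += 1
    return f"Answer to: {question}"

cache = {}  # Redis would live here: shared, persistent, with expiry times

def cached_answer(question):
    """Pay for the first generation, serve the saved answer ever after."""
    if question not in cache:
        cache[question] = call_model(question)  # the only billed call
    return cache[question]

for _ in range(1000):
    cached_answer("What is the return policy?")
# paid_calls is now 1, not 1000
```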

00:11:34.929 --> 00:11:37.909
The bill adds up incredibly fast. Redis solves

00:11:37.909 --> 00:11:40.549
this beautifully. It stores the answer the very

00:11:40.549 --> 00:11:42.929
first time it is generated. It returns that saved

00:11:42.929 --> 00:11:46.769
answer instantly the next 999 times. Cost for

00:11:46.769 --> 00:11:49.129
those repeat answers is absolutely zero. It is

00:11:49.129 --> 00:11:51.090
the difference between a hobbyist leaking money

00:11:51.090 --> 00:11:53.909
and a viable tech business. Is Docker literally

00:11:53.909 --> 00:11:56.649
just a heavy virtual machine? It is very different

00:11:56.649 --> 00:11:59.570
from a virtual machine. A virtual machine loads

00:11:59.570 --> 00:12:02.769
an entire heavy operating system. Docker just

00:12:02.769 --> 00:12:04.889
isolates the app and its immediate dependencies,

00:12:05.350 --> 00:12:07.669
making it incredibly fast. It's much lighter,

00:12:07.870 --> 00:12:11.149
packing only exactly what the app needs. We finally

00:12:11.149 --> 00:12:13.769
arrive at level five, managing a professional

00:12:13.769 --> 00:12:17.970
operation, also known as LLMOps. The app is live.

00:12:18.590 --> 00:12:20.350
Millions are potentially using it. It is cheap

00:12:20.350 --> 00:12:23.870
to run. But is it actually giving good answers?

00:12:24.200 --> 00:12:26.700
Or is it embarrassing the company? You must shift

00:12:26.700 --> 00:12:28.500
your mindset completely here. You go from being

00:12:28.500 --> 00:12:30.320
a builder to being a business owner. You have

00:12:30.320 --> 00:12:32.960
to monitor the outcomes constantly. Models hallucinate.

00:12:33.080 --> 00:12:35.639
They get things wildly wrong. You cannot manually

00:12:35.639 --> 00:12:38.200
read 10,000 automated customer service chats.

00:12:38.740 --> 00:12:41.360
We use DeepEval for this. It runs automated quality

00:12:41.360 --> 00:12:44.340
checks continuously. It flags inaccurate answers.

00:12:44.500 --> 00:12:47.299
It catches unhelpful or off-tone responses before

00:12:47.299 --> 00:12:49.799
PR disasters happen. It is your automated quality

00:12:49.799 --> 00:12:52.379
assurance team. Then we have analytics. Tools

00:12:52.379 --> 00:12:55.139
like PostHog track actual user behavior inside

00:12:55.139 --> 00:12:56.980
the app. You see where users get frustrated.

00:12:57.279 --> 00:12:59.120
You see exactly where they drop off entirely.

00:12:59.500 --> 00:13:01.379
The data tells you exactly what parts of your

00:13:01.379 --> 00:13:04.039
app need fixing. Finally, we have model routing.

00:13:04.500 --> 00:13:07.019
This is incredibly important for long-term efficiency.

00:13:07.279 --> 00:13:09.700
You don't need a supercomputer for every single

00:13:09.700 --> 00:13:13.200
question. You send simple FAQs to a fast, cheap

00:13:13.200 --> 00:13:17.419
model, like Claude Haiku. You only send complex,

00:13:17.539 --> 00:13:20.870
nuanced tasks to a powerful model. Like Claude

00:13:20.870 --> 00:13:22.950
Opus. Model routing is like running a village.

00:13:23.049 --> 00:13:25.549
You don't ask the village elder to chop wood.

00:13:25.909 --> 00:13:27.970
You don't ask the lumberjack for cosmic wisdom.

00:13:28.190 --> 00:13:30.970
You match the task appropriately. That is a very

00:13:30.970 --> 00:13:32.710
philosophical way to view it. It makes perfect

00:13:32.710 --> 00:13:35.090
sense. Isn't routing between different models

00:13:35.090 --> 00:13:37.409
overcomplicating the architecture? It adds a

00:13:37.409 --> 00:13:40.429
routing step, sure, but the payoff is undeniable.

00:13:40.830 --> 00:13:43.350
The cheaper models respond faster and they cost

00:13:43.350 --> 00:13:45.889
a fraction of the frontier models.
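
A router can start as one small function. Hedged sketch: the model names echo the episode, and the "is this a simple FAQ?" check is a toy keyword heuristic, not how production routers decide.

```python
# Naive router sketch: cheap model for simple questions, big model for the rest.
# The "is it simple?" test here is a toy keyword/length heuristic, not a real one.

CHEAP_MODEL = "claude-haiku"      # fast, inexpensive (name from the episode)
POWERFUL_MODEL = "claude-opus"    # slower, costlier, for nuanced work

FAQ_KEYWORDS = {"hours", "price", "shipping", "return", "refund"}

def pick_model(question):
    """Match the task to the model: lumberjack for wood, elder for wisdom."""
    words = set(question.lower().replace("?", "").split())
    if words & FAQ_KEYWORDS and len(words) < 12:
        return CHEAP_MODEL
    return POWERFUL_MODEL
```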

00:13:48.250 --> 00:13:50.769
This brings us to the end of our roadmap. Let's

00:13:50.769 --> 00:13:53.470
do a big idea recap. We covered a lot of crucial

00:13:53.470 --> 00:13:56.549
ground today. Becoming an AI engineer in 2026

00:13:56.549 --> 00:13:59.309
isn't about being a math genius. It is about

00:13:59.309 --> 00:14:01.870
consistency. It's about laying a solid SALT

00:14:01.870 --> 00:14:04.610
foundation. You connect APIs cleanly. You ground

00:14:04.610 --> 00:14:07.169
your data strictly with RAG. You deploy safely

00:14:07.169 --> 00:14:10.009
in containers. You monitor your operations constantly.

00:14:10.669 --> 00:14:13.110
A smart doorbell is totally useless if the walls

00:14:13.110 --> 00:14:16.350
fall down. Start with level one today. Learn

00:14:16.350 --> 00:14:19.269
Python. Build a simple website first. The high

00:14:19.269 --> 00:14:21.389
salaries exist because it is hard, persistent

00:14:21.389 --> 00:14:24.389
work, but it is deeply achievable. You just have

00:14:24.389 --> 00:14:26.769
to follow the roadmap step by step. Don't skip

00:14:26.769 --> 00:14:29.350
the basics. Put the time in. You will be miles

00:14:29.350 --> 00:14:31.169
ahead of the curve before the rest of your industry

00:14:31.169 --> 00:14:33.960
even wakes up. Neat. We've spent this time talking

00:14:33.960 --> 00:14:36.720
about how AI engineers connect tools to automate

00:14:36.720 --> 00:14:39.840
tasks. But as models get smarter at coding, how

00:14:39.840 --> 00:14:42.220
long until the AI engineer relies on an AI to

00:14:42.220 --> 00:14:44.879
automate the AI engineering roadmap itself? Something

00:14:44.879 --> 00:14:46.220
to ponder. Keep exploring.
