WEBVTT

00:00:00.000 --> 00:00:03.700
Welcome back to the Deep Dive. Today we are strapping

00:00:03.700 --> 00:00:06.879
ourselves in really getting into the engine room

00:00:06.879 --> 00:00:09.880
of the AWS cloud. Yeah, absolutely. Forget databases,

00:00:10.220 --> 00:00:12.140
storage, networking for a minute. We're focusing

00:00:12.140 --> 00:00:15.800
just on compute. The core stuff. Exactly. These

00:00:15.800 --> 00:00:17.780
services, they're the foundation, right? The

00:00:17.780 --> 00:00:20.260
virtual servers, containers, the places your

00:00:20.260 --> 00:00:23.440
code actually runs. If compute's the engine,

00:00:23.640 --> 00:00:26.089
then well... understanding it is like having

00:00:26.089 --> 00:00:28.190
the driver's manual for a cloud architecture.

00:00:28.609 --> 00:00:32.649
You just can't build anything scalable or resilient

00:00:32.649 --> 00:00:35.390
or, you know, cost-effective without getting

00:00:35.390 --> 00:00:38.450
this layer right. So our mission today is pretty

00:00:38.450 --> 00:00:40.530
specific. Yeah, we want to give you that critical

00:00:40.530 --> 00:00:43.170
shortcut, sort of distill the must-know info

00:00:43.170 --> 00:00:46.310
about EC2, Lambda, ELB, auto scaling. And how

00:00:46.310 --> 00:00:48.689
they all work together. Exactly. For scalability,

00:00:49.009 --> 00:00:50.890
elasticity, and crucially, keeping costs down.

00:00:51.259 --> 00:00:53.880
And for you, listening, this is probably the

00:00:53.880 --> 00:00:56.520
area to focus on for quick expertise gains. It's

00:00:56.520 --> 00:00:58.740
not just knowing the names. No, not at all. It's

00:00:58.740 --> 00:01:01.060
about understanding the trade-offs. The why

00:01:01.060 --> 00:01:03.179
behind choosing one over another. If someone

00:01:03.179 --> 00:01:05.640
asks where we run this code, well, the answer

00:01:05.640 --> 00:01:08.079
is always complex. Always. But for traditional

00:01:08.079 --> 00:01:10.579
apps, the kind that are always on, the starting

00:01:10.579 --> 00:01:12.540
point is almost always the same thing, isn't

00:01:12.540 --> 00:01:16.060
it? It is. Amazon EC2. Elastic Compute Cloud.

00:01:16.319 --> 00:01:18.620
Right. When you think virtual machine or server

00:01:18.620 --> 00:01:21.060
in the cloud, nine times out of 10, you're thinking

00:01:21.060 --> 00:01:25.060
EC2. Honestly, it's probably the single most

00:01:25.060 --> 00:01:27.659
important service to really get your head around

00:01:27.659 --> 00:01:30.250
first. Because it's infrastructure as a service.

00:01:30.430 --> 00:01:32.310
Exactly. It gives you that secure, resizable

00:01:32.310 --> 00:01:35.170
compute capacity. Those virtual servers are called

00:01:35.170 --> 00:01:38.150
instances. It's the closest thing you get to,

00:01:38.150 --> 00:01:40.609
like, bare metal in the cloud. OK. Let's unpack

00:01:40.609 --> 00:01:44.090
that, then. EC2, the virtual server. Launching

00:01:44.090 --> 00:01:46.530
one isn't just a button click, though. You have

00:01:46.530 --> 00:01:48.769
to define things. The recipe, the power, the

00:01:48.769 --> 00:01:50.730
access. Right. You need to configure it. We can

00:01:50.730 --> 00:01:53.209
break it down into three core components you

00:01:53.209 --> 00:01:55.750
must define. OK. First up, the Amazon Machine

00:01:55.750 --> 00:02:00.239
Image, or AMI. AMI. Think of the AMI as the template.

00:02:00.519 --> 00:02:02.680
It's the recipe for the server's brain, basically.

00:02:02.980 --> 00:02:05.319
OK. It contains all the software config, the

00:02:05.319 --> 00:02:08.300
OS, maybe an application server like Apache or

00:02:08.300 --> 00:02:10.900
applications. It defines what the server actually

00:02:10.900 --> 00:02:14.159
is. So the AMI sets the software blueprint. Got

00:02:14.159 --> 00:02:16.800
it. But servers need hardware, too. How do we

00:02:16.800 --> 00:02:19.219
define the horsepower for this virtual machine?

00:02:19.639 --> 00:02:21.680
That's the instance type. And this is where,

00:02:21.870 --> 00:02:24.689
frankly, architects spend a lot of time optimizing.

00:02:24.830 --> 00:02:27.069
Makes sense. The instance type determines the

00:02:27.069 --> 00:02:29.490
whole hardware profile of the computer it's running

00:02:29.490 --> 00:02:33.889
on. We're talking virtual CPUs, vCPUs, RAM, storage

00:02:33.889 --> 00:02:35.949
options, network performance. It's the whole

00:02:35.949 --> 00:02:38.189
package. And it's not just one size fits all,

00:02:38.430 --> 00:02:40.969
right? AWS has loads of these. Different families.

00:02:41.110 --> 00:02:43.409
Oh, absolutely. Tons. And knowing the families

00:02:43.409 --> 00:02:46.069
is key. It often gets glossed over. You really

00:02:46.069 --> 00:02:48.330
need to pick the right tool for the job. So what

00:02:48.330 --> 00:02:51.030
are the main ones we should know? Well, the most

00:02:51.030 --> 00:02:53.490
common you'll bump into are probably the M, C,

00:02:53.729 --> 00:02:57.490
and R families. M, C, R, OK. M family, think M for

00:02:57.490 --> 00:03:00.449
main, or maybe multipurpose. It's general purpose.

00:03:00.889 --> 00:03:04.150
Good balance of CPU, memory, network. Great starting

00:03:04.150 --> 00:03:06.009
point for web servers, small databases, that

00:03:06.009 --> 00:03:08.090
kind of thing. OK, general stuff. Then the C

00:03:08.090 --> 00:03:11.780
family. C for compute optimized. These are CPU

00:03:11.780 --> 00:03:14.879
powerhouses: high-performance computing, batch

00:03:14.879 --> 00:03:18.599
jobs, video encoding, anything that really hammers

00:03:18.599 --> 00:03:21.960
the processor. Got it. CPU heavy. And conversely,

00:03:22.099 --> 00:03:25.900
the R family. R for RAM. Memory optimized. These

00:03:25.900 --> 00:03:28.419
are for things that need loads of memory. Big

00:03:28.419 --> 00:03:31.060
databases, memory caches, data analytics that

00:03:31.060 --> 00:03:33.479
chew through RAM. So M for balanced, C for CPU,

00:03:33.599 --> 00:03:35.319
R for RAM. That makes sense. So if I'm running

00:03:35.319 --> 00:03:37.080
just a standard website, probably start with

00:03:37.080 --> 00:03:39.919
an M. But if I'm doing heavy data crunching...

00:03:39.919 --> 00:03:42.259
You'd look at an R, definitely. Or maybe a C

00:03:42.259 --> 00:03:45.319
if the crunching is CPU bound. Making that choice

00:03:45.319 --> 00:03:47.780
right saves money. You're not paying for RAM

00:03:47.780 --> 00:03:50.319
you don't need, or a CPU you're not using. That

00:03:50.319 --> 00:03:51.979
makes a lot of sense. Are there other important

00:03:51.979 --> 00:03:53.979
types? Yeah, a couple of others worth mentioning.
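
NOTE
A minimal sketch of the M, C, R rule of thumb from this segment. The helper name suggest_family and the workload labels are hypothetical, invented for illustration. It is a memory aid, not an AWS API.
```python
# Hypothetical helper: maps a workload's dominant resource to an EC2 family letter.
# The labels "balanced", "cpu", and "memory" are assumptions made for this sketch.
FAMILY_BY_BOTTLENECK = {
    "balanced": "M",  # general purpose: web servers, small databases
    "cpu": "C",       # compute optimized: batch jobs, video encoding
    "memory": "R",    # memory optimized: big databases, caches, analytics
}
#
def suggest_family(bottleneck: str) -> str:
    """Return the family letter for a bottleneck, defaulting to general purpose."""
    return FAMILY_BY_BOTTLENECK.get(bottleneck, "M")
#
print(suggest_family("cpu"))     # C
print(suggest_family("memory"))  # R
```
Defaulting to M mirrors the advice to start general purpose and specialize only once the bottleneck is known.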

00:03:54.060 --> 00:03:56.860
The T family, like T2, T3, T4g, those are the

00:03:56.860 --> 00:03:59.039
burst instances. Burst. Yeah, they have a baseline

00:03:59.039 --> 00:04:01.740
CPU level but can burst higher for short periods.

00:04:02.060 --> 00:04:03.680
Great for dev/test environments, things that

00:04:03.680 --> 00:04:06.319
are mostly idle but need occasional power. Very

00:04:06.319 --> 00:04:08.300
cost effective for that. OK. And then you get

00:04:08.300 --> 00:04:11.240
into the really specialized stuff. P and G families

00:04:11.240 --> 00:04:14.539
for machine learning with GPUs, X families for

00:04:14.539 --> 00:04:18.000
massive memory needs. But M, C, R, and T cover

00:04:18.000 --> 00:04:21.379
a huge range of use cases. Choosing the instance

00:04:21.379 --> 00:04:24.620
type is fundamentally about performance versus

00:04:24.620 --> 00:04:28.680
cost. Right. OK, so we have the recipe, the AMI, and

00:04:28.680 --> 00:04:31.800
the power, the instance type. We need a way in,

00:04:31.800 --> 00:04:33.939
securely. That's the third piece. That's the

00:04:33.939 --> 00:04:36.790
key pair. Absolutely essential for secure admin

00:04:36.790 --> 00:04:40.670
access, like SSHing into a Linux box or RDPing

00:04:40.670 --> 00:04:42.370
into Windows. How does that work? It's a pair

00:04:42.370 --> 00:04:44.790
of cryptographic keys. There's a public key,

00:04:45.009 --> 00:04:47.089
which AWS stores and puts on the instance when

00:04:47.089 --> 00:04:48.810
it launches. And then there's a private key.

00:04:48.990 --> 00:04:50.449
And that's the one I keep. That's the one you

00:04:50.449 --> 00:04:52.569
keep. And you guard it carefully on your own

00:04:52.569 --> 00:04:55.990
machine. AWS doesn't keep the private key. Lose

00:04:55.990 --> 00:04:58.550
it, and you can't log in. Oh, OK. And this is

00:04:58.550 --> 00:05:00.670
a perfect, really clear example of the shared

00:05:00.670 --> 00:05:03.350
responsibility model. Right. AWS secures the

00:05:03.350 --> 00:05:05.430
cloud itself. The hardware, the network, the

00:05:05.430 --> 00:05:07.290
facilities. But I'm responsible for security

00:05:07.290 --> 00:05:09.310
in the cloud. Exactly. Managing that private

00:05:09.310 --> 00:05:11.689
key, patching your OS, configuring firewalls.

00:05:12.170 --> 00:05:15.129
It's on you. That distinction is so important.

00:05:15.389 --> 00:05:18.449
Okay, let's pivot now. We know what an EC2 instance

00:05:18.449 --> 00:05:21.750
is, how do we pay for it? Because this is where

00:05:21.750 --> 00:05:25.230
decisions have huge financial impact. Four pricing

00:05:25.230 --> 00:05:27.589
models. Yep, four main ways. And getting this

00:05:27.589 --> 00:05:30.610
right is, like, foundational for cost optimization,

00:05:31.129 --> 00:05:33.449
which is a pillar of the Well-Architected Framework.

00:05:33.870 --> 00:05:35.730
Let's start with the default, the most flexible

00:05:35.730 --> 00:05:39.050
one, on-demand instances. Simplest model. Pay

00:05:39.050 --> 00:05:41.649
by the hour, or even by the second now, for Linux

00:05:41.649 --> 00:05:45.269
instances. Zero upfront commitment. Need capacity?

00:05:45.329 --> 00:05:47.509
You get it immediately. So when would I use that?

00:05:47.670 --> 00:05:50.029
Perfect for short term stuff. Irregular workloads.

00:05:50.230 --> 00:05:52.490
Things you can't predict. Development. Testing.

00:05:53.310 --> 00:05:55.509
Anything where you need capacity now, you don't

00:05:55.509 --> 00:05:57.370
want it interrupted, but you don't know how long

00:05:57.370 --> 00:05:59.149
you'll need it for. But you pay the highest price

00:05:59.149 --> 00:06:01.009
for that flexibility. You do. It's the baseline

00:06:01.009 --> 00:06:03.470
rate. OK. So if on-demand is about flexibility,

00:06:03.550 --> 00:06:05.490
the next one is about commitment for savings.

00:06:05.689 --> 00:06:08.629
Reserved instances and the newer savings plans.

00:06:08.769 --> 00:06:11.230
Exactly. This is the core strategy if you know

00:06:11.230 --> 00:06:13.829
you need servers running long term. Think steady

00:06:13.829 --> 00:06:17.089
state, predictable workloads. Like my main production

00:06:17.089 --> 00:06:19.970
web servers. Precisely. If you commit to using

00:06:19.970 --> 00:06:22.329
compute for a one-year or three-year term,

00:06:23.149 --> 00:06:26.329
AWS gives you a significant discount, up to 72%

00:06:26.329 --> 00:06:29.329
off the on-demand price. 72%. That's massive.

00:06:29.430 --> 00:06:32.189
It really is. And the newer savings plans are

00:06:32.189 --> 00:06:34.490
generally better, more flexible than the older

00:06:34.490 --> 00:06:37.089
RIs. How so? Well, traditional RIs locked you

00:06:37.089 --> 00:06:39.990
into a specific instance family, size, region,

00:06:40.029 --> 00:06:43.529
OS. Quite rigid. Savings plans are more flexible.

00:06:43.639 --> 00:06:46.060
You commit to a certain dollar amount of spend

00:06:46.060 --> 00:06:49.060
per hour, say $10 an hour, for one or three years.

00:06:49.279 --> 00:06:51.579
Okay. And that discount applies much more broadly.

00:06:52.019 --> 00:06:54.199
If you change instance families within that commitment,

00:06:54.240 --> 00:06:57.100
say from an M5 to an M6, or even move region

00:06:57.100 --> 00:06:59.899
sometimes, the savings plan discount often still

00:06:59.899 --> 00:07:02.399
applies. Much easier to manage. So the bottom

00:07:02.399 --> 00:07:05.680
line: if it runs 24/7, commit to a savings plan.
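
NOTE
A rough sketch of the arithmetic behind committing for an always-on server. The hourly rate of $0.10 is a hypothetical placeholder, and 72% is the "up to" ceiling quoted in the conversation, so the second figure is a best case rather than a typical one.
```python
HOURS_PER_YEAR = 24 * 365  # an always-on server, ignoring leap years
#
def yearly_cost(hourly_rate: float, discount: float = 0.0) -> float:
    """Cost of running one instance 24/7 for a year at a given discount (0.0 to 1.0)."""
    return hourly_rate * (1.0 - discount) * HOURS_PER_YEAR
#
on_demand = yearly_cost(0.10)                 # hypothetical $0.10 per hour
best_case = yearly_cost(0.10, discount=0.72)  # the "up to 72%" ceiling
print(f"On-demand: ${on_demand:.2f}, committed (best case): ${best_case:.2f}")
```
The same function works for any discount level, so a more conservative figure can be substituted for the ceiling.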

00:07:05.939 --> 00:07:08.000
Absolutely. That's a primary way to save big

00:07:08.000 --> 00:07:10.569
on predictable usage. Okay, now the extreme end

00:07:10.569 --> 00:07:12.389
of savings, the one that sounds almost too good

00:07:12.389 --> 00:07:15.850
to be true: Spot Instances. Up to 90% off. Yep,

00:07:16.029 --> 00:07:18.649
Spot can offer truly massive discounts. You're

00:07:18.649 --> 00:07:21.850
basically bidding on spare, unused EC2 capacity

00:07:21.850 --> 00:07:24.230
that AWS has lying around. So what's the catch?

00:07:24.310 --> 00:07:26.829
There's always a catch with 90% off. The catch

00:07:26.829 --> 00:07:29.899
is interruption. Because it's spare capacity,

00:07:30.420 --> 00:07:32.439
AWS can reclaim it whenever they need it back

00:07:32.439 --> 00:07:34.680
for on-demand or reserved customers. Reclaim

00:07:34.680 --> 00:07:36.899
it, like, just turn it off. Pretty much. They

00:07:36.899 --> 00:07:38.360
give you a two-minute warning and then boom,

00:07:38.480 --> 00:07:41.100
the instance is terminated. Two minutes? That's

00:07:41.100 --> 00:07:43.639
not a lot of time. It's not. Which means Spot

00:07:43.639 --> 00:07:46.019
is only suitable for certain types of workloads.

00:07:46.779 --> 00:07:49.579
Like, what? How do you use that warning? Your

00:07:49.579 --> 00:07:51.680
application has to be genuinely fault tolerant

00:07:51.680 --> 00:07:55.939
and, ideally, stateless. It needs to be able

00:07:55.939 --> 00:07:58.779
to handle that interruption gracefully, meaning

00:07:58.779 --> 00:08:01.319
it shouldn't store critical data only on that

00:08:01.319 --> 00:08:04.079
instance. It needs to be able to, say, checkpoint

00:08:04.079 --> 00:08:07.019
its work, save its state somewhere durable, like

00:08:07.019 --> 00:08:09.899
S3 or a database, and shut down cleanly within

00:08:09.899 --> 00:08:12.240
those two minutes. Then another instance, maybe

00:08:12.240 --> 00:08:14.519
another Spot instance or an on-demand one, can pick

00:08:14.519 --> 00:08:16.839
up where it left off. OK, so things like batch

00:08:16.839 --> 00:08:21.459
processing, big data jobs, rendering. Exactly.
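
NOTE
A sketch of the checkpoint-and-resume idea described above, under stated assumptions: a plain dict stands in for durable storage such as S3 or a database, and the interrupted callable stands in for whatever mechanism notices the two-minute warning. Both names are hypothetical.
```python
from typing import Callable, Dict, List
#
def process_batch(items: List[int], store: Dict[str, int], interrupted: Callable[[], bool]) -> bool:
    """Process items from the last checkpoint; return True when the batch is finished."""
    start = store.get("next_index", 0)   # resume where a previous instance stopped
    for index in range(start, len(items)):
        if interrupted():                # e.g. the two-minute Spot warning was seen
            return False                 # progress so far is already in the store
        store["total"] = store.get("total", 0) + items[index]
        store["next_index"] = index + 1  # checkpoint after every item
    return True
#
store: Dict[str, int] = {}               # stand-in for durable storage
signals = iter([False, False, True])     # simulate an interruption before the third item
first_run = process_batch([1, 2, 3, 4], store, lambda: next(signals))
second_run = process_batch([1, 2, 3, 4], store, lambda: False)  # a replacement instance resumes
print(first_run, second_run, store["total"])  # False True 10
```
Because every item is checkpointed before the next one starts, the replacement run neither repeats nor skips work.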

00:08:21.720 --> 00:08:24.040
Tasks that can be stopped and restarted without

00:08:24.040 --> 00:08:26.399
losing everything. Things that are maybe distributed

00:08:26.399 --> 00:08:29.339
across many instances anyway. If it's your main

00:08:29.339 --> 00:08:32.039
customer-facing database or a critical web server

00:08:32.039 --> 00:08:34.460
that absolutely cannot go down unexpectedly.

00:08:34.559 --> 00:08:37.480
Don't use Spot. Never use Spot for that. It's a huge

00:08:37.480 --> 00:08:40.799
architectural trade-off. Stability versus massive

00:08:40.799 --> 00:08:43.179
cost savings. Got it. Okay, one more pricing

00:08:43.179 --> 00:08:46.690
model. The niche one. Dedicated hosts. Right.

00:08:47.090 --> 00:08:49.289
A dedicated host is just what it sounds like.

00:08:49.610 --> 00:08:52.669
A physical server in an AWS data center that's

00:08:52.669 --> 00:08:56.149
entirely dedicated to your use. No other AWS

00:08:56.149 --> 00:08:58.470
customer shares that underlying hardware. Why

00:08:58.470 --> 00:09:00.090
would someone need that? Isn't the whole point

00:09:00.090 --> 00:09:02.470
of cloud shared infrastructure? Usually it's

00:09:02.470 --> 00:09:04.950
not about cost savings. It's almost always about

00:09:04.950 --> 00:09:07.710
compliance or licensing. Ah, licensing. Like

00:09:07.710 --> 00:09:10.350
software that's licensed per physical core? Exactly.

00:09:10.629 --> 00:09:12.899
Some legacy software, maybe bring your own

00:09:12.899 --> 00:09:16.120
license, BYOL, scenarios, have licensing terms

00:09:16.120 --> 00:09:19.360
tied to physical hardware sockets or cores. Dedicated

00:09:19.360 --> 00:09:22.000
hosts let you meet those terms. Or, sometimes,

00:09:22.340 --> 00:09:24.840
very strict regulatory requirements might mandate

00:09:24.840 --> 00:09:26.960
physical isolation. So it's a compliance tool,

00:09:27.120 --> 00:09:29.600
not really a cost-saving tool. Primarily, yes.

00:09:29.720 --> 00:09:32.120
It's more expensive than other options. OK, that

00:09:32.120 --> 00:09:35.000
covers the single server EC2 really well. But

00:09:35.000 --> 00:09:36.980
like we said, one server isn't resilient. It

00:09:36.980 --> 00:09:38.940
can't handle big traffic swings. So we need to

00:09:38.940 --> 00:09:42.120
move up a level, domain two, scaling and high

00:09:42.120 --> 00:09:44.659
availability. Right. Moving beyond the single

00:09:44.659 --> 00:09:47.039
instance to the services that make your application

00:09:47.039 --> 00:09:50.500
resilient, available, and able to handle fluctuating

00:09:50.500 --> 00:09:54.159
demand. This naturally brings us to elastic load

00:09:54.159 --> 00:09:57.080
balancing. The load balancer. So this sits in

00:09:57.080 --> 00:09:59.600
front of my EC2 instances. Correct. It acts as

00:09:59.600 --> 00:10:01.779
the single front door, the single point of contact

00:10:01.779 --> 00:10:04.919
for all your users or clients. Its main job is

00:10:04.919 --> 00:10:07.220
to take that incoming traffic and distribute

00:10:07.220 --> 00:10:10.480
it intelligently across your fleet of EC2 instances

00:10:10.480 --> 00:10:12.789
running the application. Distributing the load

00:10:12.789 --> 00:10:14.970
makes sense. But it does more than that, right,

00:10:14.990 --> 00:10:17.330
for availability? Absolutely. Its second key

00:10:17.330 --> 00:10:19.690
function is ensuring high availability. The best

00:10:19.690 --> 00:10:22.490
practice is to run your EC2 instances across

00:10:22.490 --> 00:10:25.529
multiple availability zones, or AZs. And AZs

00:10:25.529 --> 00:10:28.210
are separate data centers within a region. Physically

00:10:28.210 --> 00:10:31.649
separate, yes. Independent power, cooling, networking.

00:10:32.450 --> 00:10:35.509
So the ELB spreads the traffic across instances

00:10:35.509 --> 00:10:39.960
in, say, AZ1 and AZ2. If something catastrophic

00:10:39.960 --> 00:10:43.200
happens and the entire AZ1 goes offline... The

00:10:43.200 --> 00:10:45.299
load balancer just sends traffic to the instances

00:10:45.299 --> 00:10:48.320
still healthy in AZ2. Precisely. Your application

00:10:48.320 --> 00:10:50.679
stays up. That's high availability in practice,

00:10:51.220 --> 00:10:53.460
enabled by the ELB. You said it's a single point

00:10:53.460 --> 00:10:55.840
of contact, but there isn't just one type of

00:10:55.840 --> 00:10:58.120
load balancer anymore, is there? This trips people

00:10:58.120 --> 00:11:00.539
up. That's a really important point. There are

00:11:00.539 --> 00:11:02.580
a few types, but the main two you need to understand

00:11:02.580 --> 00:11:05.820
are the application load balancer, ALB, and the

00:11:05.820 --> 00:11:09.080
network load balancer, NLB. ALB and NLB. The difference?

00:11:09.279 --> 00:11:10.899
They operate at different layers of the network

00:11:10.899 --> 00:11:13.399
stack. The ALB works at layer 7, the application

00:11:13.399 --> 00:11:16.799
layer. Meaning it understands HTTP, HTTPS. Exactly.

00:11:16.960 --> 00:11:18.879
It's smart. It can look inside the request at

00:11:18.879 --> 00:11:22.159
the URL path, the host name, HTTP headers, and

00:11:22.159 --> 00:11:24.080
make sophisticated routing decisions based on

00:11:24.080 --> 00:11:26.379
that content. Want to send /images traffic to

00:11:26.379 --> 00:11:28.820
one set of servers and /api to another? ALB can

00:11:28.820 --> 00:11:31.559
do that. Perfect for modern microservices or

00:11:31.559 --> 00:11:34.240
complex web apps. OK. ALB is the smart one for

00:11:34.240 --> 00:11:37.220
web traffic. What about NLB? NLB operates at layer

00:11:37.220 --> 00:11:40.639
4, the transport layer, TCP, UDP. So it doesn't

00:11:40.639 --> 00:11:42.639
look inside the application request? Nope. It

00:11:42.639 --> 00:11:44.659
just looks at IP address and port information.

00:11:45.100 --> 00:11:48.039
It's incredibly fast and designed for ultra high

00:11:48.039 --> 00:11:50.519
performance, millions of requests per second

00:11:50.519 --> 00:11:53.200
with very low latency. Because it's simpler,

00:11:53.440 --> 00:11:57.440
it's faster. Great for TCP traffic, UDP, or situations

00:11:57.440 --> 00:11:59.600
where you need extreme throughput or a static

00:11:59.600 --> 00:12:02.299
IP address for your load balancer endpoint. So

00:12:02.299 --> 00:12:05.639
ALB for smart application routing, NLB for raw

00:12:05.639 --> 00:12:07.899
speed and throughput. Got it. And the load balancer

00:12:07.899 --> 00:12:10.200
also checks if the instances behind it are actually

00:12:10.200 --> 00:12:12.179
working. Health checks. Critically important,

00:12:12.299 --> 00:12:15.000
yes. The ELB constantly pings its registered

00:12:15.000 --> 00:12:17.679
targets, your EC2 instances, using configured

00:12:17.679 --> 00:12:20.100
health checks. Could be checking a specific web

00:12:20.100 --> 00:12:22.679
page, a port. And if an instance fails that check,

00:12:22.980 --> 00:12:25.600
maybe the app crashed. The ELB immediately marks

00:12:25.600 --> 00:12:27.840
it as unhealthy and stops sending any new traffic

00:12:27.840 --> 00:12:30.340
to it automatically, instantly. It routes traffic

00:12:30.340 --> 00:12:32.820
only to the healthy instances. This is key for

00:12:32.820 --> 00:12:35.139
user experience. Nobody wants to hit a dead server.
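
NOTE
A toy illustration of routing only to healthy targets. It assumes a one-time snapshot of health-check results and simple round-robin distribution. A real load balancer re-evaluates health continuously and may use other algorithms. The instance names are made up.
```python
from itertools import cycle
from typing import Dict, Iterator
#
def healthy_round_robin(targets: Dict[str, bool]) -> Iterator[str]:
    """Cycle through only the targets whose last health check passed."""
    healthy = [name for name, passed in targets.items() if passed]
    if not healthy:
        raise RuntimeError("no healthy targets available")
    return cycle(healthy)
#
# Hypothetical instance names; False means the most recent health check failed.
fleet = {"web-az1-a": True, "web-az1-b": False, "web-az2-a": True}
router = healthy_round_robin(fleet)
routed = [next(router) for _ in range(4)]
print(routed)  # ['web-az1-a', 'web-az2-a', 'web-az1-a', 'web-az2-a']
```
The failed instance never appears in the rotation, which is the behavior the hosts describe.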

00:12:35.370 --> 00:12:37.429
OK, that covers spreading the load and staying

00:12:37.429 --> 00:12:39.990
highly available. But what about handling those

00:12:39.990 --> 00:12:42.649
traffic spikes? Scaling up for a sale, scaling

00:12:42.649 --> 00:12:46.029
down overnight, that elasticity part. Ah, that's

00:12:46.029 --> 00:12:48.529
where the partner service comes in. Amazon EC2

00:12:48.529 --> 00:12:52.570
auto scaling, ASG. Auto scaling. This is the

00:12:52.570 --> 00:12:56.389
magic behind true elasticity. And honestly, probably

00:12:56.389 --> 00:12:59.190
the single biggest cost optimization tool for

00:12:59.190 --> 00:13:02.879
workloads with variable traffic. ASG automatically

00:13:02.879 --> 00:13:06.240
adds more EC2 instances, that's scale out, when demand

00:13:06.240 --> 00:13:09.279
goes up and removes instances, scale in, when demand

00:13:09.279 --> 00:13:11.360
drops. So you're only paying for the capacity

00:13:11.360 --> 00:13:13.820
you actually need right now. Exactly. No more

00:13:13.820 --> 00:13:15.779
over-provisioning massive fleets just to handle

00:13:15.779 --> 00:13:17.820
peak load, which was the old data center nightmare.

00:13:18.279 --> 00:13:21.779
ASG matches capacity to demand dynamically. How

00:13:21.779 --> 00:13:23.899
do you set it up? You must tell it how to scale.

00:13:23.960 --> 00:13:27.059
You do. You define an Auto Scaling group, and

00:13:27.059 --> 00:13:29.259
there are three key capacity settings you

00:13:29.259 --> 00:13:33.019
configure. The minimum capacity, the absolute smallest

00:13:33.019 --> 00:13:34.759
number of instances you always want running,

00:13:35.039 --> 00:13:37.759
maybe for base availability. The maximum capacity,

00:13:38.720 --> 00:13:41.080
the upper limit, your cost ceiling, stops it running

00:13:41.080 --> 00:13:45.500
away. And the desired capacity, what the ASG

00:13:45.500 --> 00:13:47.460
should aim for under normal conditions or what

00:13:47.460 --> 00:13:50.899
it starts with. Min, max, desired, the boundaries.

00:13:51.419 --> 00:13:53.809
And what actually triggers the scaling? Adding

00:13:53.809 --> 00:13:55.590
or removing instances? Those are the scaling

00:13:55.590 --> 00:13:57.409
policies. You define the conditions. It could be

00:13:57.409 --> 00:14:00.370
a simple policy, like: if average CPU across the

00:14:00.370 --> 00:14:03.129
fleet goes above 70% for five minutes, add two

00:14:03.129 --> 00:14:06.409
instances. Okay. But the more modern and usually

00:14:06.409 --> 00:14:09.309
recommended approach is target tracking scaling.

00:14:09.309 --> 00:14:11.309
Target tracking? Yeah, it's much simpler to manage.

00:14:11.309 --> 00:14:13.730
You just tell the ASG, I want to maintain an

00:14:13.730 --> 00:14:17.389
average CPU utilization of, say, 60% across all

00:14:17.389 --> 00:14:19.830
instances. And the ASG does the math, automatically

00:14:19.830 --> 00:14:22.529
adding or removing instances as needed to keep

00:14:22.529 --> 00:14:25.049
that target metric stable. You can target CPU,

00:14:25.230 --> 00:14:27.549
network traffic, request count per instance from

00:14:27.549 --> 00:14:30.370
the ELB. It's quite powerful. Oh, that sounds

00:14:30.370 --> 00:14:32.409
much easier to set the target. It generally is.
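
NOTE
A simplified model of "the ASG does the math", assuming capacity scales in proportion to the ratio of the observed metric to the target and is then clamped to the group's minimum and maximum. This is an illustrative approximation, not the service's actual algorithm, and the function name is hypothetical.
```python
import math
#
def desired_capacity(current: int, metric: float, target: float, minimum: int, maximum: int) -> int:
    """Scale the fleet in proportion to metric/target, then clamp to the group's bounds."""
    proportional = math.ceil(current * metric / target)
    return max(minimum, min(maximum, proportional))
#
print(desired_capacity(current=4, metric=90.0, target=60.0, minimum=2, maximum=10))  # 6
print(desired_capacity(current=4, metric=15.0, target=60.0, minimum=2, maximum=10))  # 2
```
In the second call the proportional answer would be 1, but the minimum capacity of 2 wins, which is the boundary role the hosts describe for min and max.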

00:14:32.590 --> 00:14:35.629
So we see the synergy here. The ELB takes the

00:14:35.629 --> 00:14:38.250
incoming traffic, spreads it across AZs for availability.

00:14:38.600 --> 00:14:42.460
The ASG watches the instances behind the ELB

00:14:42.460 --> 00:14:45.580
and automatically adjusts the fleet size to match

00:14:45.580 --> 00:14:48.200
the demand the ELB is seeing based on metrics

00:14:48.200 --> 00:14:51.039
like CPU. That's a classic bread-and-butter

00:14:51.039 --> 00:14:54.179
architecture for a scalable, resilient web application

00:14:54.179 --> 00:14:58.639
on AWS. EC2 instances managed by an ASG

00:14:58.639 --> 00:15:00.799
sitting behind an ELB, usually an ALB. You'll

00:15:00.799 --> 00:15:03.519
see that pattern everywhere. Okay, but... What

00:15:03.519 --> 00:15:05.779
if I don't want to manage any of that? No EC2

00:15:05.779 --> 00:15:09.500
instances, no AMIs, no OS patching, no configuring

00:15:09.500 --> 00:15:12.080
ASGs. Ah, now you're talking about a fundamental

00:15:12.080 --> 00:15:15.779
shift. That brings us neatly into Domain 3. Serverless

00:15:15.779 --> 00:15:18.399
compute, and the poster child for that is AWS

00:15:18.399 --> 00:15:21.000
Lambda. Serverless, which as you always have

00:15:21.000 --> 00:15:22.879
to point out, doesn't mean no servers. Right.

00:15:23.000 --> 00:15:24.860
It's a bit of a misnomer. The servers definitely

00:15:24.860 --> 00:15:26.940
exist. The crucial difference is you don't manage

00:15:26.940 --> 00:15:29.500
them. AWS handles all the underlying infrastructure.

00:15:29.779 --> 00:15:31.759
And Lambda is the core service here, function

00:15:31.759 --> 00:15:34.480
as a service, or FaaS. Exactly. Lambda lets you

00:15:34.480 --> 00:15:36.279
run your code without thinking about provisioning

00:15:36.279 --> 00:15:38.480
or managing servers at all. What's the key principle?

00:15:38.600 --> 00:15:40.879
What burden does it lift? Basically everything

00:15:40.879 --> 00:15:43.940
related to infrastructure management. AWS handles

00:15:43.940 --> 00:15:46.200
provisioning the compute capacity, the automatic

00:15:46.200 --> 00:15:48.759
scaling, patching the operating systems, all

00:15:48.759 --> 00:15:51.879
the underlying administration. You, the developer,

00:15:52.100 --> 00:15:53.940
just focus on writing the code for your function.

00:15:54.620 --> 00:15:56.840
That's it. That sounds like a massive reduction

00:15:56.840 --> 00:15:59.860
in operational work. It is. Huge. And unlike

00:15:59.860 --> 00:16:02.460
an EC2 web server that's running 24/7 waiting

00:16:02.460 --> 00:16:05.500
for requests, Lambda works differently. Fundamentally

00:16:05.500 --> 00:16:07.860
different. Lambda is event-driven. Event-driven.

00:16:08.440 --> 00:16:11.080
Meaning... Your Lambda function sits there, dormant,

00:16:11.539 --> 00:16:13.870
until something triggers it. An event happens

00:16:13.870 --> 00:16:16.909
in your AWS environment, and that event invokes

00:16:16.909 --> 00:16:19.090
the function. What kind of events? All sorts.

00:16:19.549 --> 00:16:21.549
An image file gets uploaded to an S3 bucket,

00:16:21.909 --> 00:16:24.789
a message arrives on an SQS queue, a record gets

00:16:24.789 --> 00:16:28.110
updated in a DynamoDB table, a user logs in via

00:16:28.110 --> 00:16:31.549
Cognito, or very commonly an HTTP request comes

00:16:31.549 --> 00:16:33.990
in through API Gateway. The function only runs

00:16:33.990 --> 00:16:35.690
in response to these triggers. It's reactive.
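
NOTE
A minimal sketch of an event-driven function for the S3 upload case mentioned above. The handler(event, context) shape and the Records/s3/bucket/object fields follow the S3 notification event layout. The sample event is trimmed and made up, and returning a list is a choice made for this illustration.
```python
from typing import Any, Dict, List
#
def handler(event: Dict[str, Any], context: Any) -> List[str]:
    """Return the bucket/key of every object named in an S3 notification event."""
    uploaded = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        uploaded.append(f"{bucket}/{key}")
    return uploaded
#
# Local invocation with a trimmed, made-up event; real events carry more fields.
sample_event = {"Records": [{"s3": {"bucket": {"name": "example-bucket"}, "object": {"key": "cat.png"}}}]}
print(handler(sample_event, None))  # ['example-bucket/cat.png']
```
Nothing in the function polls or waits. It runs only when an event is handed to it.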

00:16:35.990 --> 00:16:38.049
Totally reactive. It sleeps until woken up by

00:16:38.049 --> 00:16:40.070
an event. And that has a huge impact on cost,

00:16:40.190 --> 00:16:42.179
right? Yeah. The pay-per-use billing. This

00:16:42.179 --> 00:16:44.960
is the killer feature for cost optimization on

00:16:44.960 --> 00:16:47.559
intermittent workloads. If your Lambda function

00:16:47.559 --> 00:16:50.840
isn't running, if it's idle, you pay absolutely

00:16:50.840 --> 00:16:53.820
nothing. Zero. Wow. You only get charged for

00:16:53.820 --> 00:16:56.299
two things. The number of times your function

00:16:56.299 --> 00:16:59.360
is invoked, the number of requests, and the total

00:16:59.360 --> 00:17:01.980
time your code spends executing, measured in

00:17:01.980 --> 00:17:04.420
milliseconds. Milliseconds. Down to the millisecond.

00:17:04.740 --> 00:17:07.500
If your function runs for, say, 500 milliseconds,

00:17:08.359 --> 00:17:11.319
you pay for exactly 500 milliseconds of compute

00:17:11.319 --> 00:17:14.240
time. Compare that to an EC2 instance that you

00:17:14.240 --> 00:17:17.940
pay for 24/7, even if it's idle 90% of the time.

00:17:18.539 --> 00:17:21.019
The efficiency is incredible for the right workload.
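
NOTE
A back-of-the-envelope sketch of the two billing components named above: request count and execution time in milliseconds. Both rates are placeholders, not published AWS prices. As an added caveat, the real duration rate also depends on the memory configured for the function.
```python
def monthly_lambda_cost(invocations: int, avg_duration_ms: float, price_per_request: float, price_per_ms: float) -> float:
    """Request charge plus duration charge; an idle function (0 invocations) costs 0."""
    request_charge = invocations * price_per_request
    duration_charge = invocations * avg_duration_ms * price_per_ms
    return request_charge + duration_charge
#
# Both rates below are placeholders for illustration, not published AWS prices.
idle_month = monthly_lambda_cost(0, 500, price_per_request=0.0000002, price_per_ms=0.00000001)
busy_month = monthly_lambda_cost(1_000_000, 500, price_per_request=0.0000002, price_per_ms=0.00000001)
print(idle_month, round(busy_month, 2))
```
Every term is multiplied by the invocation count, which is why an idle function costs nothing.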

00:17:21.079 --> 00:17:24.079
And scaling? Say I suddenly get 1,000 S3 uploads

00:17:24.079 --> 00:17:26.269
at once. The scaling is also totally different

00:17:26.269 --> 00:17:28.710
from EC2. It's automatic and near instantaneous.

00:17:29.490 --> 00:17:31.930
Lambda has massive built-in concurrency. If

00:17:31.930 --> 00:17:34.309
a thousand events hit simultaneously, Lambda

00:17:34.309 --> 00:17:36.549
will try to spin up a thousand parallel execution

00:17:36.549 --> 00:17:38.650
environments to run your function concurrently

00:17:38.650 --> 00:17:40.849
to handle that burst. You don't configure min/max

00:17:40.849 --> 00:17:42.990
instances or scaling policies like with

00:17:42.990 --> 00:17:45.829
ASG. It just happens. It just handles it. Up

00:17:45.829 --> 00:17:47.390
to your account's concurrency limits, of course,

00:17:47.390 --> 00:17:49.529
but those are typically very high. That seamless

00:17:49.529 --> 00:17:52.750
scaling sounds amazing, but there must be a

00:17:52.750 --> 00:17:55.700
trade-off. What about performance? Especially for

00:17:55.700 --> 00:17:58.079
that first request, I've heard about cold starts.

00:17:58.440 --> 00:18:02.519
Ah, yes, the cold start. That is the main operational

00:18:02.519 --> 00:18:04.660
trade-off you need to be aware of, especially

00:18:04.660 --> 00:18:07.079
for latency-sensitive applications. What is

00:18:07.079 --> 00:18:10.240
it exactly? Well, if your Lambda function hasn't

00:18:10.240 --> 00:18:13.099
been invoked for a while, AWS might tear down

00:18:13.099 --> 00:18:15.380
the underlying container or execution environment

00:18:15.380 --> 00:18:18.380
to save resources. When the next request comes

00:18:18.380 --> 00:18:21.599
in, Lambda has to do some setup work first. Find

00:18:21.599 --> 00:18:23.799
capacity, provision the environment, download

00:18:23.799 --> 00:18:26.299
your code, initialize the runtime. Before your

00:18:26.299 --> 00:18:29.259
code even starts running. Exactly. That initialization

00:18:29.259 --> 00:18:32.160
phase is the cold start. It can add latency, sometimes

00:18:32.160 --> 00:18:35.160
just tens or hundreds of milliseconds, but sometimes

00:18:35.160 --> 00:18:37.519
for complex functions or certain runtimes, it

00:18:37.519 --> 00:18:39.559
could be a second or two. And that extra latency

00:18:39.559 --> 00:18:41.859
happens only on the first request after a period

00:18:41.859 --> 00:18:44.500
of inactivity. Generally, yes, or when scaling

00:18:44.500 --> 00:18:47.039
up rapidly requires new environments. Subsequent

00:18:47.039 --> 00:18:49.019
requests to a warm environment are usually much

00:18:49.019 --> 00:18:53.029
faster. So that cold start latency could be unacceptable

00:18:53.029 --> 00:18:56.569
for, say, a real-time API endpoint where users

00:18:56.569 --> 00:18:59.210
expect an instant response. It absolutely could.

00:18:59.470 --> 00:19:01.809
You're essentially trading potentially higher,

00:19:02.029 --> 00:19:05.009
unpredictable latency on some invocations for

00:19:05.009 --> 00:19:07.670
the benefit of lower cost and zero server management.

00:19:07.789 --> 00:19:10.289
So it's a conscious choice based on the workload's

00:19:10.289 --> 00:19:12.809
priorities. Always. Is consistent low latency

00:19:12.809 --> 00:19:16.069
paramount? Maybe EC2 is better. Is cost efficiency

00:19:16.069 --> 00:19:18.819
and minimal ops overhead the driver? Lambda looks

00:19:18.819 --> 00:19:21.160
very attractive. This leads us right into the

00:19:21.160 --> 00:19:24.819
core comparison, EC2 versus Lambda. Let's really

00:19:24.819 --> 00:19:26.619
nail down the key differences, because this comes

00:19:26.619 --> 00:19:28.460
up all the time. Definitely. Let's break it down.

00:19:28.640 --> 00:19:31.559
OK, first, the basic model. EC2 is Infrastructure

00:19:31.559 --> 00:19:34.980
as a Service, IaaS. You rent the virtual server

00:19:34.980 --> 00:19:38.380
hardware. Lambda is Function as a Service, FaaS,

00:19:38.599 --> 00:19:41.079
or Serverless. You rent code execution time.

00:19:41.279 --> 00:19:43.680
Server management. With EC2, it's all you. OS

00:19:43.680 --> 00:19:45.960
patching, installing software, managing scaling,

00:19:46.200 --> 00:19:48.779
security hardening. With Lambda, AWS manages everything

00:19:48.779 --> 00:19:50.339
under the hood. You just manage your function

00:19:50.339 --> 00:19:53.000
code. EC2 needs you to configure Auto Scaling

00:19:53.000 --> 00:19:56.059
groups: define policies, set min/max limits.

00:19:56.559 --> 00:19:59.339
Lambda scaling is built-in, automatic, instantaneous.

00:19:59.440 --> 00:20:02.759
Billing. EC2. Pay while it's running, even if

00:20:02.759 --> 00:20:06.000
idle. You pay for that standby capacity. Lambda.

00:20:06.599 --> 00:20:09.619
Pay only for requests and execution time in milliseconds.

00:20:10.180 --> 00:20:12.200
Zero cost when idle. And the big constraint.

00:20:12.299 --> 00:20:15.250
How long can they run? EC2 instances can run

00:20:15.250 --> 00:20:18.589
indefinitely, 24/7 for years if you want. Lambda

00:20:18.589 --> 00:20:20.930
functions have a maximum execution timeout. Currently,

00:20:21.210 --> 00:20:23.750
it's 15 minutes, 900 seconds. Only 15 minutes?

00:20:24.130 --> 00:20:26.390
Yep. If your task takes 16 minutes to complete,

00:20:26.730 --> 00:20:28.789
Lambda is simply not an option. You'd need EC2

00:20:28.789 --> 00:20:31.349
or maybe containers. OK. So that clearly defines

00:20:31.349 --> 00:20:34.319
the sweet spot for each. EC2 for long-running

00:20:34.319 --> 00:20:37.099
applications, web servers, APIs needing consistent

00:20:37.099 --> 00:20:39.599
low latency, databases, anything that needs to

00:20:39.599 --> 00:20:41.759
run continuously or for longer than 15 minutes,

00:20:42.099 --> 00:21:04.359
persistent services. That distinction is critical.

00:21:04.400 --> 00:21:06.619
Get that wrong, and you're either overspending

00:21:06.619 --> 00:21:08.680
or using the wrong tool. Absolutely fundamental.

00:21:09.079 --> 00:21:12.019
Okay, moving on. We've done VMs, EC2, functions,

00:21:12.200 --> 00:21:14.920
Lambda. What about containers? Docker is huge

00:21:14.920 --> 00:21:17.240
now. Packaging apps and dependencies together.

00:21:17.460 --> 00:21:20.339
That brings us to domain four. Container services.

00:21:20.539 --> 00:21:22.339
Right. Containers solve that classic "well, it

00:21:22.339 --> 00:21:25.079
works on my machine" problem by packaging everything

00:21:25.079 --> 00:21:27.660
needed into a single unit. Ensures consistency.

00:21:28.319 --> 00:21:31.099
But running containers at scale, that needs management.

00:21:31.390 --> 00:21:34.509
An orchestrator. And AWS gives us choices for

00:21:34.509 --> 00:21:36.769
orchestration. Two main ones. First, there's

00:21:36.769 --> 00:21:39.829
Amazon Elastic Container Service, ECS. ECS. That's

00:21:39.829 --> 00:21:43.269
AWS's own one. Yep. Their native homegrown container

00:21:43.269 --> 00:21:45.690
orchestration service. It's highly scalable,

00:21:46.150 --> 00:21:48.309
fast, and designed for really tight integration

00:21:48.309 --> 00:21:52.630
with other AWS services like IAM, VPC, load balancers.

00:21:52.930 --> 00:21:54.990
Often preferred if you're already deep in the

00:21:54.990 --> 00:21:58.170
AWS ecosystem and want something simpler to manage

00:21:58.460 --> 00:22:00.619
than the alternative. And the alternative is?

00:22:00.799 --> 00:22:04.440
The industry standard, Kubernetes. AWS offers

00:22:04.440 --> 00:22:08.240
Amazon Elastic Kubernetes Service, EKS. EKS for

00:22:08.240 --> 00:22:10.880
Kubernetes. Right. EKS is a managed service that

00:22:10.880 --> 00:22:13.299
lets you run standard open source Kubernetes

00:22:13.299 --> 00:22:15.900
on AWS without having to deal with the nightmare

00:22:15.900 --> 00:22:17.940
of installing and managing the Kubernetes control

00:22:17.940 --> 00:22:20.400
plane yourself, which is notoriously complex.

00:22:21.039 --> 00:22:23.039
Teams already using Kubernetes or wanting that

00:22:23.039 --> 00:22:26.069
open source portability often choose EKS. So

00:22:26.069 --> 00:22:28.970
ECS is the AWS native way. EKS is the managed

00:22:28.970 --> 00:22:31.210
open source Kubernetes way. But with both of

00:22:31.210 --> 00:22:32.789
these, traditionally, you still had to run your

00:22:32.789 --> 00:22:34.809
containers on something, right? Like on EC2 instances?

00:22:35.289 --> 00:22:37.869
Exactly. You'd set up ECS or EKS, but then you'd

00:22:37.869 --> 00:22:40.990
also need to provision and manage a cluster of

00:22:40.990 --> 00:22:44.410
EC2 instances to actually host your container

00:22:44.410 --> 00:22:47.059
tasks or pods. You still have to worry about

00:22:47.059 --> 00:22:49.500
patching those EC2 instances, scaling them with

00:22:49.500 --> 00:22:52.720
ASGs, optimizing the instance types. Which sounds

00:22:52.720 --> 00:22:55.279
like we're back to managing servers again, even

00:22:55.279 --> 00:22:57.660
with containers. It does, yeah. And that's where

00:22:57.660 --> 00:22:59.700
the real game changer comes in, combining containers

00:22:59.700 --> 00:23:02.880
with that serverless idea we talked about, AWS

00:23:02.880 --> 00:23:05.940
Fargate. Fargate. Okay, what is Fargate? Is it

00:23:05.940 --> 00:23:08.160
another orchestrator? No, that's the key thing.

00:23:08.400 --> 00:23:10.599
Fargate is not an orchestrator. It's a serverless

00:23:10.599 --> 00:23:13.539
compute engine that works with both ECS and EKS.

00:23:13.700 --> 00:23:16.119
A serverless engine for containers? Precisely.

00:23:16.579 --> 00:23:18.740
When you use Fargate, you completely eliminate

00:23:18.740 --> 00:23:21.519
the need to provision, manage, patch, or scale

00:23:21.519 --> 00:23:24.059
that underlying cluster of EC2 virtual machines.

00:23:24.200 --> 00:23:26.180
How does that work? You just define your container

00:23:26.180 --> 00:23:29.980
task for ECS or pod for EKS as usual. You specify

00:23:29.980 --> 00:23:32.480
the CPU and memory resources your container needs.

00:23:32.960 --> 00:23:35.900
Then you tell ECS or EKS to launch it using the

00:23:35.900 --> 00:23:37.799
Fargate launch type. And Fargate just runs it

00:23:37.799 --> 00:23:40.420
without me picking an EC2 instance. Exactly.

00:23:40.799 --> 00:23:43.160
Fargate finds the capacity and runs your container

00:23:43.160 --> 00:23:45.940
on fully managed infrastructure. You don't see

00:23:45.940 --> 00:23:48.880
or manage any EC2 instances at all. It's like

00:23:48.880 --> 00:23:51.900
Lambda, but for containers. Wow. So the takeaway

00:23:51.900 --> 00:23:54.880
is Fargate is the serverless way to run containers

00:23:54.880 --> 00:23:58.680
on AWS using either ECS or EKS as the orchestrator.

00:23:58.740 --> 00:24:01.059
That's the perfect summary. You choose your orchestrator,

00:24:01.240 --> 00:24:03.839
ECS or EKS, and then you choose your launch type,

00:24:04.259 --> 00:24:07.079
either EC2 if you want control over the VMs or

00:24:07.079 --> 00:24:09.339
Fargate if you want serverless. That feels like

00:24:09.339 --> 00:24:12.400
a really powerful option for teams who love containers

00:24:12.400 --> 00:24:15.160
but hate managing VMs. It really is. It simplifies

00:24:15.160 --> 00:24:17.650
operations massively. Focus on your containers,

00:24:17.809 --> 00:24:20.349
not the infrastructure they run on. Okay, we've

00:24:20.349 --> 00:24:23.190
covered the big three execution models. IaaS with

00:24:23.190 --> 00:24:25.930
EC2, FaaS with Lambda, and serverless containers

00:24:25.930 --> 00:24:28.849
with Fargate. Let's quickly touch on Domain 5,

00:24:29.130 --> 00:24:31.910
other key compute-related services. These seem

00:24:31.910 --> 00:24:34.349
like tools for specific situations. Yeah, these

00:24:34.349 --> 00:24:36.509
are more specialized, often providing higher

00:24:36.509 --> 00:24:38.609
levels of abstraction or solving niche problems

00:24:38.609 --> 00:24:42.210
like location. Let's start with AWS Elastic Beanstalk.

00:24:42.430 --> 00:24:44.869
I hear this described as PaaS, platform as a

00:24:44.869 --> 00:24:47.990
service. That's exactly what it is. Elastic Beanstalk

00:24:47.990 --> 00:24:52.210
sits between raw IaaS like EC2 and pure FaaS like

00:24:52.210 --> 00:24:55.049
Lambda. What's its job? Its core function is

00:24:55.049 --> 00:24:57.750
to make it super easy to deploy and scale standard

00:24:57.750 --> 00:25:00.509
web applications and services. Think apps written

00:25:00.509 --> 00:25:05.730
in Java, .NET, PHP, Node.js, Python, Ruby, Go,

00:25:05.809 --> 00:25:08.059
Docker. Okay, so how does it make it easy? The

00:25:08.059 --> 00:25:10.599
magic of Beanstalk is abstraction. You basically

00:25:10.599 --> 00:25:12.759
just upload your application code. My WAR file

00:25:12.759 --> 00:25:15.480
or my Python code. Yep. And Elastic Beanstalk

00:25:15.480 --> 00:25:17.900
automatically handles provisioning and managing

00:25:17.900 --> 00:25:20.700
the entire underlying environment stack needed

00:25:20.700 --> 00:25:22.759
to run it. The whole stack? What does that include?

00:25:22.980 --> 00:25:25.819
It provisions the EC2 instances, sets up the

00:25:25.819 --> 00:25:28.009
Auto Scaling group, configures the Elastic Load

00:25:28.009 --> 00:25:31.150
Balancer, creates security groups, manages application

00:25:31.150 --> 00:25:33.769
versions, handles deployments like rolling updates

00:25:33.769 --> 00:25:36.470
or immutable deployments. All of that infrastructure

00:25:36.470 --> 00:25:39.349
complexity is managed for you based on your application

00:25:39.349 --> 00:25:43.650
type. So it's using EC2, ASG, ELB under the covers,

00:25:43.930 --> 00:25:46.210
but hiding the configuration details. Precisely.

00:25:46.329 --> 00:25:48.930
It abstracts away that lower-level infrastructure

00:25:48.930 --> 00:25:51.410
configuration, letting developers focus just

00:25:51.410 --> 00:25:55.089
on writing code, not on wiring together AWS services.

00:25:55.230 --> 00:25:57.349
So, if you hear about a developer wanting to

00:25:57.349 --> 00:25:59.910
deploy a web app quickly, needing scaling and

00:25:59.910 --> 00:26:02.509
load balancing, but explicitly not wanting to

00:26:02.509 --> 00:26:05.970
manage the underlying AWS resources, Elastic

00:26:05.970 --> 00:26:08.809
Beanstalk sounds like the answer. It's very often

00:26:08.809 --> 00:26:10.730
the answer in that scenario. The perfect PaaS

00:26:10.730 --> 00:26:13.329
middle ground. Okay. Next up, something

00:26:13.329 --> 00:26:17.559
quite different. AWS Outposts. This sounds like

00:26:17.559 --> 00:26:21.240
bringing AWS home. It kind of is. Outposts is

00:26:21.240 --> 00:26:24.400
a unique service that literally extends AWS infrastructure,

00:26:25.200 --> 00:26:27.859
the same hardware services, APIs, and tools you

00:26:27.859 --> 00:26:30.799
use in an AWS region, directly into your own

00:26:30.799 --> 00:26:33.299
on-premises data center or colocation facility.

00:26:33.519 --> 00:26:35.119
Why would you need to do that? Isn't the point

00:26:35.119 --> 00:26:37.420
of cloud not running stuff in your own data center?

00:26:37.579 --> 00:26:40.500
Usually for two main reasons: latency or data

00:26:40.500 --> 00:26:42.940
residency. Latency? Yeah. Imagine you have an

00:26:42.940 --> 00:26:45.119
application that requires extremely low latency,

00:26:45.559 --> 00:26:47.599
like single-digit millisecond response times,

00:26:47.599 --> 00:26:50.119
maybe for factory automation or real-time financial

00:26:50.119 --> 00:26:52.460
trading. The physical distance to the nearest

00:26:52.460 --> 00:26:55.220
AWS region might introduce too much network latency.

00:26:55.559 --> 00:26:58.299
Outposts lets you run those AWS services locally

00:26:58.299 --> 00:27:00.440
right next to your end users or equipment. And

00:27:00.440 --> 00:27:03.039
data residency? Some industries or countries

00:27:03.039 --> 00:27:05.960
have very strict regulations that require certain

00:27:05.960 --> 00:27:08.880
data to physically remain within a specific location,

00:27:09.400 --> 00:27:11.119
like the customer's own building or country.

00:27:11.920 --> 00:27:14.660
Outposts allows you to process that data using

00:27:14.660 --> 00:27:17.579
familiar AWS services while ensuring it never

00:27:17.579 --> 00:27:20.039
leaves your premises. So Outposts is the answer

00:27:20.039 --> 00:27:23.299
when you need AWS compute locally. Exactly. Cloud

00:27:23.299 --> 00:27:25.720
infrastructure on your turf. Okay, one more

00:27:25.720 --> 00:27:28.480
in this category. Amazon Lightsail. This sounds

00:27:28.480 --> 00:27:31.119
simpler. Much simpler. Lightsail is designed

00:27:31.119 --> 00:27:33.460
to be the absolute easiest way to get started

00:27:33.460 --> 00:27:37.700
with a virtual private server, VPS, on AWS. Easier

00:27:37.700 --> 00:27:39.720
than EC2? Way easier. Think of people coming

00:27:39.720 --> 00:27:42.019
from traditional VPS providers like DigitalOcean

00:27:42.019 --> 00:27:45.630
or Linode. EC2, with all its options, AMIs,

00:27:45.869 --> 00:27:48.130
instance types, security groups, EBS volumes,

00:27:48.349 --> 00:27:50.910
VPCs, can be overwhelming for beginners. So how

00:27:50.910 --> 00:27:53.210
does Lightsail simplify it? It bundles everything

00:27:53.210 --> 00:27:55.349
you need into a single, simple package with a

00:27:55.349 --> 00:27:57.750
predictable low monthly price. You choose a plan

00:27:57.750 --> 00:28:00.250
like $5 per month, and it includes a virtual

00:28:00.250 --> 00:28:03.089
machine with a set amount of CPU, RAM, SSD storage,

00:28:03.190 --> 00:28:05.269
and data transfer allowance, plus networking

00:28:05.269 --> 00:28:08.470
and a static IP address. Ah, fixed monthly cost.

00:28:08.849 --> 00:28:11.930
Very predictable. Very predictable. It's perfect

00:28:11.930 --> 00:28:14.750
for beginners, students, developers wanting to

00:28:14.750 --> 00:28:17.349
run a simple website, blog, dev environment,

00:28:17.509 --> 00:28:20.529
or small application without needing the granular

00:28:20.529 --> 00:28:23.289
control and potential cost variability of full

00:28:23.289 --> 00:28:28.490
EC2. Simple fixed price VPS. Got it. Easy mode

00:28:28.490 --> 00:28:31.319
VPS. Okay, that was a whirlwind tour of the compute

00:28:31.319 --> 00:28:33.079
landscape. Let's try to bring it all together

00:28:33.079 --> 00:28:36.079
in Domain 7, Application and Core Cloud Concept

00:28:36.079 --> 00:28:38.660
Review. How do these services map back to those

00:28:38.660 --> 00:28:40.700
key cloud pillars we started with? Right, this

00:28:40.700 --> 00:28:42.720
is where we connect the dots. The pillars guide

00:28:42.720 --> 00:28:45.319
our choices. Let's start with elasticity, scaling

00:28:45.319 --> 00:28:47.319
up and down automatically. The key service for

00:28:47.319 --> 00:28:51.339
EC2 is Amazon EC2 Auto Scaling, ASG. That's what

00:28:51.339 --> 00:28:53.960
provides the automatic adjustment. For serverless,

00:28:54.099 --> 00:28:56.500
like Lambda and Fargate, elasticity is built

00:28:56.500 --> 00:28:59.059
in, handled by the platform. Okay. What about

00:28:59.059 --> 00:29:02.119
scalability, handling growth over time? For EC2

00:29:02.119 --> 00:29:04.339
-based applications, scalability is really achieved

00:29:04.339 --> 00:29:07.700
by the ELB and ASG working together. The ELB

00:29:07.700 --> 00:29:10.039
distributes increasing load, and the ASG adds

00:29:10.039 --> 00:29:12.440
instances to handle that load. High availability,

00:29:12.740 --> 00:29:15.240
running across multiple AZs, surviving failures?

00:29:15.579 --> 00:29:18.059
The linchpin there is the Elastic Load Balancer,

00:29:18.420 --> 00:29:21.680
ELB. It's what distributes traffic across those

00:29:21.680 --> 00:29:24.450
multiple AZs and routes around failures. And

00:29:24.450 --> 00:29:27.549
the big one, cost optimization, getting the best

00:29:27.549 --> 00:29:30.269
bang for your buck. Here, the choices are critical,

00:29:30.730 --> 00:29:32.950
using Spot Instances for fault-tolerant work

00:29:32.950 --> 00:29:35.750
for massive savings, committing via Savings Plans

00:29:35.750 --> 00:29:38.609
or RIs for predictable workloads, or choosing

00:29:38.609 --> 00:29:41.069
Lambda for intermittent tasks to eliminate idle

00:29:41.069 --> 00:29:43.529
costs entirely. It's about matching the pricing

00:29:43.529 --> 00:29:45.619
model to the workload pattern. And that final

00:29:45.619 --> 00:29:49.180
concept, serverless, where AWS manages the infrastructure.

00:29:49.460 --> 00:29:52.380
That's embodied by AWS Lambda for functions and

00:29:52.380 --> 00:29:55.019
AWS Fargate for containers, shifting operational

00:29:55.019 --> 00:29:57.519
responsibility to AWS. Okay, let's make this

00:29:57.519 --> 00:30:00.460
really concrete. Scenario time. Let's force ourselves

00:30:00.460 --> 00:30:02.640
to pick the right service. Ready? Let's do it.

00:30:02.839 --> 00:30:05.380
Scenario one, an e-commerce site. Global traffic,

00:30:05.519 --> 00:30:07.799
huge spikes during sales, needs to be always

00:30:07.799 --> 00:30:10.400
up, needs to scale automatically. Okay, variable

00:30:10.400 --> 00:30:12.819
traffic, high availability needed, automatic

00:30:12.819 --> 00:30:15.700
scaling. That classic pattern, you need the trifecta.

00:30:15.980 --> 00:30:18.420
EC2 instances for the compute, managed by an

00:30:18.420 --> 00:30:20.319
Auto Scaling group for the elasticity and cost

00:30:20.319 --> 00:30:22.779
savings, sitting behind an Application Load Balancer,

00:30:23.079 --> 00:30:25.380
ALB, for the intelligent traffic distribution

00:30:25.380 --> 00:30:28.559
and HA across AZs. Makes perfect sense. Scenario

00:30:28.559 --> 00:30:31.880
2, a simple script. Resizes images whenever they

00:30:31.880 --> 00:30:34.720
land in an S3 bucket. Runs for maybe 10 seconds,

00:30:34.720 --> 00:30:37.539
needs to be super cheap. Keywords: event-driven

00:30:37.539 --> 00:30:40.940
(S3 upload), short execution time (10 seconds), cost

00:30:40.940 --> 00:30:45.440
is paramount. That screams AWS Lambda. EC2 would

00:30:45.440 --> 00:30:47.420
be massive overkill and way more expensive because

00:30:47.420 --> 00:30:49.700
you'd pay for idle time. Lambda only charges

00:30:49.700 --> 00:30:52.720
for those 10 seconds. Perfect. Scenario 3. A

00:30:52.720 --> 00:30:55.180
big bank runs an analytics app using Docker containers.

00:30:55.740 --> 00:30:57.880
The ops team loves containers but hates managing

00:30:57.880 --> 00:30:59.759
VMs. They want the simplest possible way to run

00:30:59.759 --> 00:31:02.500
these containers. Containers, yes. No VM management.

00:31:02.720 --> 00:31:05.259
That points directly to AWS Fargate. They'd

00:31:05.259 --> 00:31:07.279
use Fargate as the launch type with either ECS

00:31:07.279 --> 00:31:09.980
or EKS as their orchestrator. Serverless containers.

00:31:10.400 --> 00:31:14.339
Got it. Takes the EC2 burden away. Scenario 4.

00:31:14.779 --> 00:31:17.019
A developer has a new Ruby on Rails web app.

00:31:17.230 --> 00:31:20.210
Needs HA, load balancing, scaling, the works.

00:31:20.829 --> 00:31:23.130
But they don't want to mess with CloudFormation

00:31:23.130 --> 00:31:26.569
or manually configure ASGs and ELBs. They just

00:31:26.569 --> 00:31:28.789
want to deploy the code. They're asking for that

00:31:28.789 --> 00:31:31.269
platform as a service abstraction. The answer

00:31:31.269 --> 00:31:35.349
is AWS Elastic Beanstalk. Upload the Ruby code.

00:31:35.750 --> 00:31:38.289
Beanstalk builds and manages the whole scalable

00:31:38.289 --> 00:31:40.789
load-balanced environment automatically. Excellent.

00:31:40.930 --> 00:31:42.990
Those scenarios really help crystallize the choices.

00:31:44.009 --> 00:31:45.710
Well, that brings us towards the end of this

00:31:45.710 --> 00:31:47.930
deep dive. We've really traced the evolution,

00:31:47.970 --> 00:31:50.289
haven't we? From managing the virtual machine

00:31:50.289 --> 00:31:52.990
completely yourself with EC2. Right. Maximum

00:31:52.990 --> 00:31:55.230
control, maximum responsibility. To outsourcing

00:31:55.230 --> 00:31:57.490
a lot of that configuration complexity with Elastic

00:31:57.490 --> 00:32:00.130
Beanstalk. Giving up some control for operational

00:32:00.130 --> 00:32:03.089
ease. All the way to letting AWS handle basically

00:32:03.089 --> 00:32:05.289
the entire operational environment for your code

00:32:05.289 --> 00:32:08.470
with Lambda, FaaS, or Fargate. Minimum operational

00:32:08.470 --> 00:32:11.289
overhead, maximum focus on code, but with certain

00:32:11.289 --> 00:32:13.630
constraints like execution time or cold starts.

00:32:13.869 --> 00:32:16.609
Understanding those three models, IaaS, PaaS,

00:32:16.690 --> 00:32:18.829
FaaS/serverless, is really the core takeaway,

00:32:18.990 --> 00:32:21.210
isn't it? It absolutely is. Because your choice

00:32:21.210 --> 00:32:23.970
isn't just technical. It's a strategic decision

00:32:23.970 --> 00:32:26.589
about where your team invests its time and effort,

00:32:27.089 --> 00:32:29.029
managing infrastructure or building features.

00:32:29.230 --> 00:32:31.190
And that really sets up our final provocative

00:32:31.190 --> 00:32:33.329
thought for you, the listener, to mull over.

00:32:34.029 --> 00:32:36.450
The deepest insight here isn't just knowing what

00:32:36.450 --> 00:32:39.670
EC2 or Lambda is. It's understanding that every

00:32:39.670 --> 00:32:42.890
single architectural choice you make, using Spot,

00:32:43.259 --> 00:32:46.440
accepting a Lambda cold start, paying for a Dedicated

00:32:46.440 --> 00:32:49.980
Host. It's all fundamentally a trade-off. A

00:32:49.980 --> 00:32:51.920
trade-off between operational control on one

00:32:51.920 --> 00:32:55.039
side and cost optimization on the other. So the

00:32:55.039 --> 00:32:57.119
question to leave you with is, which of those

00:32:57.119 --> 00:32:59.700
two sides, control or cost, matters most to you

00:32:59.700 --> 00:33:01.640
for your next project or your biggest bottleneck

00:33:01.640 --> 00:33:03.740
right now? Where are you willing specifically

00:33:03.740 --> 00:33:06.099
to maybe give up some fine-grained control to

00:33:06.099 --> 00:33:08.819
slash operational overhead and cost? And conversely,

00:33:09.000 --> 00:33:11.200
where is that absolute control, maybe over latency

00:33:11.200 --> 00:33:14.160
or specific hardware, so critical that it justifies

00:33:14.160 --> 00:33:16.259
the higher cost and management effort? We hope

00:33:16.259 --> 00:33:18.359
this dive has given you a clearer framework for

00:33:18.359 --> 00:33:19.539
making those crucial decisions.
