WEBVTT

00:00:00.000 --> 00:00:02.459
Picture the camera mounted on the highway gantry

00:00:02.459 --> 00:00:04.320
you drove under this morning. Right. Or maybe

00:00:04.320 --> 00:00:06.540
it was one of those sleek little gray boxes perched

00:00:06.540 --> 00:00:08.939
on a metal pole by the interstate. Yeah,

00:00:08.939 --> 00:00:11.039
exactly. And you probably tapped your brakes,

00:00:11.119 --> 00:00:13.179
assuming it was actively shooting a radar beam

00:00:13.179 --> 00:00:15.619
at your car to clock your speed. Which is what

00:00:15.619 --> 00:00:18.219
we all do, right? Oh, for sure. Yeah. But what

00:00:18.219 --> 00:00:20.199
if it wasn't emitting anything at all? What if

00:00:20.199 --> 00:00:23.280
it was just watching you using pure geometry

00:00:23.280 --> 00:00:25.559
to calculate your velocity? That's a pretty wild

00:00:25.559 --> 00:00:29.420
thought. It is. Welcome to today's Deep Dive.

00:00:29.579 --> 00:00:33.280
We have an incredibly revealing excerpt from

00:00:33.280 --> 00:00:36.500
a Wikipedia article detailing a technology called

00:00:36.500 --> 00:00:41.829
video detection and ranging, or VIDAR. And our

00:00:41.829 --> 00:00:44.649
mission today is to decode how modern speed enforcement

00:00:44.649 --> 00:00:47.670
has quietly shifted from bouncing active signals

00:00:47.670 --> 00:00:50.369
off your bumper to passively calculating your

00:00:50.369 --> 00:00:53.829
speed using advanced stereoscopic math. Okay,

00:00:53.969 --> 00:00:56.030
let's unpack this. Well, to really appreciate

00:00:56.030 --> 00:00:58.350
the sheer leap in engineering that VIDAR represents,

00:00:58.530 --> 00:01:00.750
we kind of have to look at what the source material

00:01:00.750 --> 00:01:03.130
identifies as the old guard. Right, the traditional

00:01:03.130 --> 00:01:07.019
methods it's replacing. Exactly. The text specifically

00:01:07.019 --> 00:01:09.140
points to two foundational technologies. You

00:01:09.140 --> 00:01:12.900
have radar, which relies on Doppler shifts, and

00:01:12.900 --> 00:01:15.819
LIDAR, which calculates speed using the time of

00:01:15.819 --> 00:01:18.150
flight principle. OK, we all know the basics

00:01:18.150 --> 00:01:20.790
of those old school systems, but let's translate

00:01:20.790 --> 00:01:22.510
that for the listener really quick. Yeah, good

00:01:22.510 --> 00:01:26.269
idea. So radar is reading the Doppler shift, which

00:01:26.269 --> 00:01:28.909
is basically measuring how compressed a radio

00:01:28.909 --> 00:01:31.629
wave gets when it bounces off the grill of a

00:01:31.629 --> 00:01:34.269
car moving toward the sensor. Right, the frequency

00:01:34.269 --> 00:01:36.689
change. Yeah, it's like driving a boat into oncoming

00:01:36.689 --> 00:01:39.450
waves. The faster you drive, the faster the waves

00:01:39.450 --> 00:01:41.810
hit the bow. So the radar gun measures that wave

00:01:41.810 --> 00:01:44.469
compression to find your speed. Spot on. And

00:01:44.469 --> 00:01:46.989
then there is LIDAR, which abandons the Doppler

00:01:46.989 --> 00:01:49.109
effect entirely. Because it uses light instead

00:01:49.109 --> 00:01:52.209
of radio waves, right? Exactly. As the text outlines,

00:01:52.689 --> 00:01:55.030
LIDAR relies on time of flight. So think of time

00:01:55.030 --> 00:01:56.909
of flight like bouncing a tennis ball against

00:01:56.909 --> 00:01:59.129
a brick wall in the dark. Oh, I like that analogy.

00:01:59.250 --> 00:02:01.709
You throw it, and you count the exact milliseconds

00:02:01.709 --> 00:02:03.650
it takes to return to your hand to figure out

00:02:03.650 --> 00:02:06.709
how far away the wall is. Just way faster. Right.
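Both active principles described in this exchange reduce to one-line formulas. A toy sketch in Python; the carrier frequency, shift, and pulse timing below are illustrative assumptions, not figures from the source:

```python
# Toy sketch of the two "active" measurement principles discussed above.
C = 299_792_458.0  # speed of light, m/s

def doppler_speed(f_carrier_hz: float, f_shift_hz: float) -> float:
    """Radar: for a reflection off an approaching car, the round-trip
    Doppler shift is roughly 2 * v * f_carrier / c, so invert for v."""
    return f_shift_hz * C / (2.0 * f_carrier_hz)

def tof_distance(round_trip_s: float) -> float:
    """LIDAR: the pulse travels out and back, so halve the path."""
    return C * round_trip_s / 2.0

# An assumed 34.7 GHz (Ka-band) radar reading a 7.72 kHz shift:
print(round(doppler_speed(34.7e9, 7.72e3), 1))  # ~33.3 m/s, about 75 mph

# A laser pulse returning after 200 nanoseconds:
print(round(tof_distance(200e-9), 1))           # ~30.0 m away
```

Note how cheap each calculation is: one division per reading, which is part of why these active methods dominated for so long.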

00:02:06.909 --> 00:02:10.210
An incredibly fast game of Marco Polo played

00:02:10.210 --> 00:02:13.620
with light instead of sound. The police laser

00:02:13.620 --> 00:02:16.560
yells, Marco, by firing a laser pulse thousands

00:02:16.560 --> 00:02:20.419
of times a second, and it tracks the exact microsecond

00:02:20.419 --> 00:02:22.840
your bumper yells, Polo, back by reflecting that

00:02:22.840 --> 00:02:24.979
light. And since we know the speed of light is

00:02:24.979 --> 00:02:27.860
a constant, measuring the round-trip time of

00:02:27.860 --> 00:02:30.840
that laser pulse gives you the exact distance

00:02:30.840 --> 00:02:33.259
to the car. And you just do that thousands of

00:02:33.259 --> 00:02:35.379
times a second, and boom, you have the car's

00:02:35.379 --> 00:02:38.240
speed. Exactly. But what's fascinating here is

00:02:38.240 --> 00:02:40.879
the underlying operational philosophy of both

00:02:40.879 --> 00:02:43.969
radar and lidar. What do you mean by philosophy?

00:02:44.469 --> 00:02:47.169
Well, despite using totally different physics,

00:02:47.389 --> 00:02:49.669
you know, compression of radio frequencies versus

00:02:49.669 --> 00:02:53.050
the speed of light, they share one defining characteristic,

00:02:53.349 --> 00:02:55.250
which is they're both active systems. Meaning

00:02:55.250 --> 00:02:57.610
they have to like throw energy into the environment

00:02:57.610 --> 00:03:00.490
to get a result. Yes. They are completely reliant

00:03:00.490 --> 00:03:03.599
on emission. To measure the physical world, they

00:03:03.599 --> 00:03:05.400
have to physically interact with it. Oh, I see.

00:03:05.599 --> 00:03:08.400
Like, a LIDAR system requires a laser diode to

00:03:08.400 --> 00:03:11.460
generate intense pulses of light, optics to focus

00:03:11.460 --> 00:03:14.759
that beam, and often some sort of mechanical

00:03:14.759 --> 00:03:17.680
or solid-state scanning mechanism to sweep that

00:03:17.680 --> 00:03:20.139
beam across the highway. That sounds complicated.

00:03:20.360 --> 00:03:23.599
It is. And radar requires an oscillator to generate

00:03:23.599 --> 00:03:26.419
microwaves, an antenna to broadcast them, and

00:03:26.419 --> 00:03:28.819
a receiver sensitive enough to catch the scattered

00:03:28.819 --> 00:03:31.340
reflection. Which means they require significant

00:03:31.340 --> 00:03:34.120
power. Yes, a lot of power. And they have moving

00:03:34.120 --> 00:03:37.080
parts or complex broadcasting hardware that can

00:03:37.080 --> 00:03:39.520
degrade over time. Plus, they're constantly shouting

00:03:39.520 --> 00:03:42.180
into the electromagnetic spectrum. Exactly. And

00:03:42.180 --> 00:03:44.599
that active emission is exactly the paradigm

00:03:44.599 --> 00:03:47.400
our source material says is ending. Wow, really?

00:03:47.759 --> 00:03:50.800
Yeah. We are moving from a regime of active interrogation,

00:03:51.240 --> 00:03:53.479
like pinging vehicles with energy, to a regime

00:03:53.479 --> 00:03:55.939
of entirely passive observation. Which brings

00:03:55.939 --> 00:03:59.099
us to VIDAR. Right. We are entering the era of

00:03:59.099 --> 00:04:01.900
VIDAR. So the text defines video detection and ranging

00:04:01.900 --> 00:04:04.259
as a technique that measures the speed of a distant

00:04:04.259 --> 00:04:06.580
vehicle simply by tracking the object through

00:04:06.580 --> 00:04:09.099
vision cameras. Just watching it. No lasers,

00:04:09.479 --> 00:04:12.259
no radio waves, just capturing the ambient light

00:04:12.259 --> 00:04:14.560
that is already bouncing off your car. It is

00:04:14.560 --> 00:04:17.079
an incredibly elegant concept. You strip away

00:04:17.079 --> 00:04:19.040
the emitters, the diodes, the antennas, and you

00:04:19.040 --> 00:04:21.220
just replace them with a lens and a sensor. I

00:04:21.220 --> 00:04:23.139
mean, I'm hung up on the mechanics here, though.

00:04:23.259 --> 00:04:26.319
How so? Well, a camera captures flat two-dimensional

00:04:26.319 --> 00:04:28.459
images, right? Right. And a car is moving toward

00:04:28.459 --> 00:04:31.660
the lens in three-dimensional space at like

00:04:31.660 --> 00:04:35.120
75 miles an hour. Yeah, fast. So how does a flat

00:04:35.120 --> 00:04:38.720
grid of pixels calculate velocity without a radar

00:04:38.720 --> 00:04:41.040
wave telling it the physical distance? I mean,

00:04:41.100 --> 00:04:43.339
a single video is just a rapid sequence of flat

00:04:43.339 --> 00:04:46.500
photographs. That exact physical limitation is

00:04:46.500 --> 00:04:49.519
what kept passive video speed detection in the

00:04:49.519 --> 00:04:52.240
realm of theory for decades. Oh, so it was a

00:04:52.240 --> 00:04:54.660
known problem. Absolutely. A single camera lens,

00:04:55.019 --> 00:04:57.779
no matter how high the resolution, is inherently

00:04:57.779 --> 00:05:00.500
cyclopean. It has zero native depth perception.

00:05:00.699 --> 00:05:04.000
Right, because a large car far away looks geometrically

00:05:04.000 --> 00:05:07.000
identical to a small car close up. Exactly. But

00:05:07.000 --> 00:05:09.500
if you look closely at the source text, it specifies

00:05:09.500 --> 00:05:11.639
that this high precision measurement isn't done

00:05:11.639 --> 00:05:14.300
with a standard camera. OK, what is it done with?

00:05:14.579 --> 00:05:17.079
It is achieved through advanced stereoscopic

00:05:17.079 --> 00:05:19.800
imaging techniques. Stereoscopic imaging. So

00:05:19.800 --> 00:05:21.740
we aren't talking about one lens. No. We are

00:05:21.740 --> 00:05:24.839
talking about two. We are talking about replicating

00:05:24.839 --> 00:05:28.240
human biology. It is the exact same biological

00:05:28.240 --> 00:05:30.660
mechanism that allows you to catch a baseball

00:05:30.660 --> 00:05:34.000
or navigate a crowded sidewalk. Let me sketch

00:05:34.000 --> 00:05:36.620
this out mentally for the listener because the

00:05:36.620 --> 00:05:39.540
biological parallel makes the math so much easier

00:05:39.540 --> 00:05:42.279
to grasp. Yeah, go for it. So hold your thumb

00:05:42.279 --> 00:05:45.199
out at arm's length. Okay, do it. Close your

00:05:45.199 --> 00:05:47.120
left eye and align your thumb with something

00:05:47.120 --> 00:05:49.800
on the wall behind it. Say a picture frame. Got

00:05:49.800 --> 00:05:52.259
it. Now without moving your hand, open your left

00:05:52.259 --> 00:05:54.420
eye and close your right eye. The thumb appears

00:05:54.420 --> 00:05:56.819
to jump horizontally across the background. Exactly.

00:05:57.399 --> 00:06:00.660
That visual jump is called parallax. Your left

00:06:00.660 --> 00:06:02.560
eye and your right eye are viewing the world

00:06:02.560 --> 00:06:04.860
from two slightly different angles, separated

00:06:04.860 --> 00:06:07.459
by a couple of inches. Right. Your brain takes

00:06:07.459 --> 00:06:11.279
those two flat 2D images, measures the disparity

00:06:11.279 --> 00:06:13.639
between where objects appear in each eye, and

00:06:13.639 --> 00:06:15.860
instantly stitches them together to create a

00:06:15.860 --> 00:06:18.420
3D map of the world. It's pretty amazing. That's

00:06:18.420 --> 00:06:20.439
how you know how far away your thumb is from the

00:06:20.439 --> 00:06:23.300
wall. And VIDAR applies that exact phenomenon

00:06:23.300 --> 00:06:26.319
to traffic enforcement. But instead of neurons,

00:06:26.560 --> 00:06:31.480
it uses pure geometry and silicon. So dual visual

00:06:31.480 --> 00:06:34.680
inputs. Yes. The system utilizes two camera

00:06:34.680 --> 00:06:37.759
lenses mounted on that gantry or pole, staring

00:06:37.759 --> 00:06:40.139
at the exact same stretch of asphalt. But wait,

00:06:40.339 --> 00:06:42.500
for the computer to calculate depth from that

00:06:42.500 --> 00:06:44.699
parallax, it has to have a baseline, right? I

00:06:44.699 --> 00:06:47.079
mean, the human brain knows how far apart our

00:06:47.079 --> 00:06:49.560
eyes are. Exactly. How does the VIDAR system

00:06:49.560 --> 00:06:52.660
establish that metric? The baseline is actually

00:06:52.660 --> 00:06:54.879
the most critical variable in the entire system.

00:06:55.060 --> 00:06:58.120
OK. The two lenses on a VIDAR unit are rigidly

00:06:58.120 --> 00:07:00.560
mounted at an exact mathematically known distance

00:07:00.560 --> 00:07:02.959
from each other. So they don't move? Never. Let's

00:07:02.959 --> 00:07:05.579
say it's exactly 24 inches. The system software

00:07:05.579 --> 00:07:07.860
is hard-coded with that 24-inch baseline. Gotcha.

00:07:08.180 --> 00:07:10.980
When a car drives into the frame, camera A captures

00:07:10.980 --> 00:07:13.980
an image of the license plate. In that flat image,

00:07:14.279 --> 00:07:16.699
the license plate is located at specific pixel

00:07:16.699 --> 00:07:19.759
coordinates. Let's say pixel 1000 on the x-axis.

00:07:20.079 --> 00:07:22.339
And at that exact same millisecond, camera B

00:07:22.339 --> 00:07:25.680
captures its own image. Yes. But because camera

00:07:25.680 --> 00:07:29.120
B is 24 inches to the right, it sees the license

00:07:29.120 --> 00:07:31.120
plate from a slightly different angle. Precisely.

00:07:31.180 --> 00:07:34.300
So in camera B's flat image, the license plate

00:07:34.300 --> 00:07:38.500
isn't at pixel 1000. It's at like... Pixel 1050.

00:07:38.680 --> 00:07:40.860
And that 50 pixel difference is the disparity.
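Under the standard rectified-stereo model, that disparity converts directly to depth as Z = focal length × baseline / disparity. A minimal sketch using the 24-inch baseline and 50-pixel disparity from the example; the focal length in pixels is an assumed value:

```python
def depth_from_disparity(baseline_m: float, focal_px: float,
                         disparity_px: float) -> float:
    """Rectified stereo: depth Z = focal_length * baseline / disparity.
    The bigger the pixel 'jump' between the two images, the closer the
    object; disparity shrinks toward zero as distance grows."""
    return focal_px * baseline_m / disparity_px

baseline = 24 * 0.0254   # the 24-inch baseline, in meters
focal = 2500.0           # assumed lens focal length, in pixels
d = 1050 - 1000          # pixel 1050 in camera B vs pixel 1000 in camera A

print(round(depth_from_disparity(baseline, focal, d), 2))  # 30.48 m, ~100 ft
```

With these assumed numbers the plate comes out about 100 feet away, and a halved disparity would mean double the distance.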

00:07:41.060 --> 00:07:44.240
Ah, the visual jump. Exactly. Because the software

00:07:44.240 --> 00:07:46.560
knows the exact distance between the two lenses,

00:07:46.839 --> 00:07:49.540
the exact focal length of the optics, and the

00:07:49.540 --> 00:07:52.040
exact pixel disparity between the two flat images,

00:07:52.500 --> 00:07:55.439
it can use basic trigonometry to triangulate

00:07:55.439 --> 00:07:58.319
the exact distance to that license plate in

00:07:58.319 --> 00:08:01.199
three-dimensional space. Wow. It has mapped the

00:08:01.199 --> 00:08:03.920
z-axis purely through observation. Okay, so that

00:08:03.920 --> 00:08:06.060
establishes the depth for one single moment in

00:08:06.060 --> 00:08:09.060
time. Right. Speed is distance over time. So

00:08:09.060 --> 00:08:11.879
the VIDAR system has to run that complex stereoscopic

00:08:11.879 --> 00:08:14.500
triangulation over and over again. Frame by frame

00:08:14.500 --> 00:08:16.899
by frame. That's a lot of math. It is. This is

00:08:16.899 --> 00:08:18.959
where the video aspect of video detection and

00:08:18.959 --> 00:08:20.939
ranging comes into play, because video is really

00:08:20.939 --> 00:08:23.040
just sequential photography. Sure. Let's assume

00:08:23.040 --> 00:08:25.439
the VIDAR cameras are shooting at 60 frames per

00:08:25.439 --> 00:08:28.000
second. OK. If it's shooting 60 frames a second,

00:08:28.060 --> 00:08:30.740
that means there is a gap of about, what,

00:08:30.740 --> 00:08:33.100
16.6 milliseconds between every single photograph?

00:08:33.559 --> 00:08:36.940
Exactly. So the car enters the frame. The system

00:08:36.940 --> 00:08:39.980
uses stereoscopic disparity to calculate that

00:08:39.980 --> 00:08:42.940
the bumper is exactly 100 feet away. Right.

00:08:42.940 --> 00:08:45.580
16.6 milliseconds later, the cameras capture the

00:08:45.580 --> 00:08:48.460
next frame. The system recalculates the disparity.

00:08:49.019 --> 00:08:52.740
The bumper is now, say, 98 feet away. So it knows

00:08:52.740 --> 00:08:55.419
the exact distance the car moved through 3D space,

00:08:55.440 --> 00:08:57.759
and it knows the exact fraction of a second it

00:08:57.759 --> 00:09:00.590
took to cover that distance. And distance divided

00:09:00.590 --> 00:09:03.889
by time equals velocity. That is wild. The system

00:09:03.889 --> 00:09:06.210
calculates your speed without ever emitting a

00:09:06.210 --> 00:09:08.909
single pulse of energy. It just watched the light

00:09:08.909 --> 00:09:10.950
naturally bouncing off your car from two different

00:09:10.950 --> 00:09:13.110
angles and let the software do the heavy lifting.
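Chained frame to frame, those depth fixes become velocity, exactly as walked through above: 100 feet, then 98 feet one frame later at 60 frames per second. A toy sketch of that last step:

```python
FPS = 60
FRAME_DT = 1.0 / FPS  # ~16.7 ms between successive stereo depth fixes

def speed_mph(range_a_ft: float, range_b_ft: float,
              dt_s: float = FRAME_DT) -> float:
    """Velocity from two consecutive range estimates, in mph."""
    ft_per_s = (range_a_ft - range_b_ft) / dt_s
    return ft_per_s * 3600.0 / 5280.0  # ft/s -> miles per hour

# The bumper was 100 ft away, then 98 ft away one frame later:
print(round(speed_mph(100.0, 98.0), 1))  # ~81.8 mph
```

A real system would average over many frames to smooth out pixel-level noise, but the core arithmetic is just this distance-over-time division.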

00:09:13.370 --> 00:09:16.009
Which brings up a massive historical question

00:09:16.009 --> 00:09:18.929
for me. The geometry behind stereoscopic triangulation

00:09:18.929 --> 00:09:21.370
isn't new. I mean, humans have understood parallax

00:09:21.370 --> 00:09:23.830
and trigonometry for centuries. Oh, absolutely.

00:09:24.070 --> 00:09:26.610
So if this passive method is so elegant, if it

00:09:26.610 --> 00:09:28.950
saves us from building complex radar emitters

00:09:28.950 --> 00:09:32.570
and expensive spinning LIDAR lasers, why wasn't

00:09:32.570 --> 00:09:34.809
this the standard from the very beginning of

00:09:34.809 --> 00:09:36.909
traffic enforcement? That is a great question.

00:09:37.169 --> 00:09:39.809
And the answer is the limitation was never the

00:09:39.809 --> 00:09:42.570
math. What was it? The limitation was the silicon.

00:09:42.759 --> 00:09:45.379
The source material actually points us directly

00:09:45.379 --> 00:09:48.080
to the moment this barrier was broken. Oh right,

00:09:48.220 --> 00:09:50.899
the patent. Yes. It cites a specific architectural

00:09:50.899 --> 00:09:56.000
origin for this technology. U.S. patent 8,184,863 B2.

00:09:56.279 --> 00:09:59.240
Titled the Video Speed Detection System. Issued

00:09:59.240 --> 00:10:02.139
to an inventor named Ji Gang Wang on May 22,

00:10:02.320 --> 00:10:05.480
2012. 2012? I mean, that is incredibly recent

00:10:05.480 --> 00:10:07.879
when you think about the history of highway infrastructure.

00:10:08.100 --> 00:10:10.379
Very recent. We tend to view traffic cameras

00:10:10.379 --> 00:10:13.259
as these archaic, bureaucratic monoliths that

00:10:13.259 --> 00:10:16.179
have been around forever. But the specific patented

00:10:16.179 --> 00:10:18.960
capability to enforce speed limits using passive

00:10:18.960 --> 00:10:21.899
stereoscopic video is barely over a decade old.

00:10:21.940 --> 00:10:23.600
Yeah, and when you look at what the system actually

00:10:23.600 --> 00:10:26.039
has to do to replace a radar gun, that 2012 date

00:10:26.039 --> 00:10:28.179
makes perfect sense. Because of the processing

00:10:28.179 --> 00:10:31.100
power. Exactly. Think about the computational

00:10:31.100 --> 00:10:33.639
load we just described. Yeah. The system is pulling

00:10:33.639 --> 00:10:36.600
in two high-definition video feeds simultaneously.

00:10:36.720 --> 00:10:39.620
Right. It has to scan millions of pixels in real

00:10:39.620 --> 00:10:42.059
time, identify the vehicle moving at highway

00:10:42.059 --> 00:10:44.960
speeds, isolate a specific feature like a license

00:10:44.960 --> 00:10:47.799
plate or a headlight, and find that exact same

00:10:47.799 --> 00:10:51.600
feature in the second video feed. Yes. Calculate

00:10:51.600 --> 00:10:54.720
the pixel disparity, triangulate the 3D depth,

00:10:55.080 --> 00:10:57.700
and then do it all again. 16 milliseconds later.

00:10:57.840 --> 00:11:00.179
And it has to do all of that math before the

00:11:00.179 --> 00:11:02.320
car physically drives out of the camera's field

00:11:02.320 --> 00:11:05.559
of view. Right. So in the 1990s or early 2000s,

00:11:05.720 --> 00:11:09.059
trying to process dual HD video feeds frame by

00:11:09.059 --> 00:11:12.460
frame for real-time 3D reconstruction would

00:11:12.460 --> 00:11:14.659
have just melted a conventional roadside processor.

00:11:14.720 --> 00:11:16.480
You would have needed a server rack the size

00:11:16.480 --> 00:11:18.340
of a refrigerator sitting next to the highway.

00:11:18.620 --> 00:11:20.580
Pretty much. The hardware simply couldn't keep

00:11:20.580 --> 00:11:23.070
up with the geometry. Radar and LIDAR were the

00:11:23.070 --> 00:11:25.889
standard because computationally speaking, they

00:11:25.889 --> 00:11:27.929
are relatively simple. Because measuring the

00:11:27.929 --> 00:11:30.190
frequency shift of a bounced radio wave takes

00:11:30.190 --> 00:11:32.429
very little processing power. Exactly. You get

00:11:32.429 --> 00:11:35.289
a number instantly. But by 2012, Moore's law

00:11:35.289 --> 00:11:38.629
had done its work. High-definition digital image

00:11:38.629 --> 00:11:41.710
sensors became incredibly cheap and highly accurate.

00:11:41.950 --> 00:11:43.929
Yeah, they were everywhere. And more importantly,

00:11:44.309 --> 00:11:46.789
microprocessors became fast enough to handle

00:11:46.789 --> 00:11:50.450
the sheer volume of data required for real-time

00:11:50.450 --> 00:11:54.129
disparity mapping. Spot on. Ji Gang Wang's patent

00:11:54.129 --> 00:11:56.909
represents the exact moment when the processing

00:11:56.909 --> 00:11:59.629
power caught up to the math. Wow. It marks the

00:11:59.629 --> 00:12:02.110
transition where software finally overtook hardware

00:12:02.110 --> 00:12:04.850
in the realm of speed detection. And as the text

00:12:04.850 --> 00:12:08.090
outlines, that 2012 invention didn't languish

00:12:08.090 --> 00:12:10.190
in a patent office somewhere. No, not at all.

00:12:10.309 --> 00:12:13.169
The source explicitly lists its primary applications

00:12:13.169 --> 00:12:15.870
as remote sensing and traffic speed enforcement.

00:12:16.070 --> 00:12:18.389
So this passive stereoscopic observation has

00:12:18.389 --> 00:12:20.649
actively reshaped our physical environment. It

00:12:20.649 --> 00:12:22.929
really has. The text mentions these systems are

00:12:22.929 --> 00:12:25.830
deployed on gantries or on roadside poles. That's

00:12:25.830 --> 00:12:27.970
the physical footprint of this paradigm shift.

00:12:28.129 --> 00:12:30.250
It completely changes the architecture of enforcement.

00:12:30.370 --> 00:12:33.490
How so? Well, because you no longer need heavy

00:12:33.490 --> 00:12:36.669
high-voltage equipment to emit radar waves or

00:12:36.559 --> 00:12:39.480
mechanical spinners for LIDAR, the footprint

00:12:39.480 --> 00:12:42.440
of a speed detection system shrinks drastically.

00:12:42.659 --> 00:12:44.700
Right, it doesn't need to be this massive rig.

00:12:44.860 --> 00:12:47.039
Exactly. It becomes just a pair of lenses in

00:12:47.039 --> 00:12:49.899
a small housing, quietly processing the ambient

00:12:49.899 --> 00:12:52.120
light. So the next time you are driving down

00:12:52.120 --> 00:12:54.259
the interstate, and you pass under one of those

00:12:54.259 --> 00:12:56.879
massive metal gantries stretching over the lanes,

00:12:57.259 --> 00:12:59.500
or you catch a glimpse of a camera cluster on

00:12:59.500 --> 00:13:01.519
a roadside pole out of the corner of your eye,

00:13:01.779 --> 00:13:03.639
I want you to look at it through the lens of

00:13:03.639 --> 00:13:05.779
this deep dive. Yeah, it changes your perspective

00:13:05.769 --> 00:13:08.269
entirely. Don't just think of it as a dumb security

00:13:08.269 --> 00:13:11.049
camera waiting to take a flat flash photograph

00:13:11.049 --> 00:13:13.690
of your license plate. Because it's not. That

00:13:13.690 --> 00:13:17.389
gray box is doing high-level, real-time calculus

00:13:17.389 --> 00:13:20.070
on the physical space around it. You are driving

00:13:20.070 --> 00:13:23.309
through the field of view of an incredibly sophisticated

00:13:23.309 --> 00:13:25.929
stereoscopic set of eyes. Literally watching

00:13:25.929 --> 00:13:29.080
you. It is silently taking in dual video feeds,

00:13:29.440 --> 00:13:31.840
mapping the pixel disparity of your vehicle against

00:13:31.840 --> 00:13:34.519
the background, reconstructing the three-dimensional

00:13:34.519 --> 00:13:37.220
reality of the highway, and calculating your

00:13:37.220 --> 00:13:40.240
exact trajectory and velocity. It is translating

00:13:40.240 --> 00:13:43.039
the physical world into pure geometry. And doing

00:13:43.039 --> 00:13:45.980
it all passively. It forces a complete reimagining

00:13:45.980 --> 00:13:49.139
of how we measure our environment. So what does

00:13:49.139 --> 00:13:52.309
this all mean? We started with a source detailing

00:13:52.309 --> 00:13:55.409
a rather dry-sounding Wikipedia entry about video

00:13:55.409 --> 00:13:58.350
detection and ranging. But what we actually uncovered

00:13:58.350 --> 00:14:00.669
is a fundamental evolution in how technology

00:14:00.669 --> 00:14:03.009
interacts with reality. It really is a massive

00:14:03.009 --> 00:14:05.269
shift. We are witnessing the end of the brute

00:14:05.269 --> 00:14:08.529
force era, an era where we had to actively hurl

00:14:08.529 --> 00:14:10.769
radar waves and laser pulses out into the world

00:14:10.769 --> 00:14:13.230
and wait for them to bounce back just to understand

00:14:13.230 --> 00:14:15.169
how fast something was moving. Yeah, shouting

00:14:15.169 --> 00:14:17.450
into the dark. Right. And we have transitioned

00:14:17.450 --> 00:14:20.639
into an era of elegant passive visual tracking,

00:14:20.980 --> 00:14:22.899
where simply gathering ambient light through

00:14:22.899 --> 00:14:25.679
two lenses is enough to decode the complex physics

00:14:25.679 --> 00:14:28.360
of a 4,000-pound object moving at 80 miles

00:14:28.360 --> 00:14:30.899
an hour. If we connect this to the bigger picture,

00:14:31.720 --> 00:14:34.320
knowing the mechanics behind VIDAR completely

00:14:34.320 --> 00:14:36.519
changes how we should interact with our built

00:14:36.519 --> 00:14:38.899
environment. How so? Well, the mundane highway

00:14:38.899 --> 00:14:40.980
commute isn't just a stretch of asphalt anymore.

00:14:41.179 --> 00:14:45.100
It is a live showcase of advanced physics, stereoscopic

00:14:45.100 --> 00:14:47.539
biology, and cutting -edge engineering. That's

00:14:47.539 --> 00:14:49.419
a great way to put it. When you understand the

00:14:49.419 --> 00:14:51.720
baseline math and the frame rate calculations

00:14:51.720 --> 00:14:54.879
happening inside those roadside poles, you realize

00:14:54.879 --> 00:14:57.580
that our infrastructure is becoming increasingly

00:14:57.580 --> 00:15:00.360
perceptive. Perceptive? I like that word. It

00:15:00.360 --> 00:15:02.500
doesn't need to ping us with energy to know we

00:15:02.500 --> 00:15:07.049
are there. It is passively, silently understanding

00:15:07.049 --> 00:15:09.669
the geometry of everything moving through it.

00:15:10.169 --> 00:15:11.870
Increasingly perceptive. I think that is the

00:15:11.870 --> 00:15:14.409
perfect concept to land on. It fits, right? It

00:15:14.409 --> 00:15:16.950
really does. We've spent this entire deep dive

00:15:16.950 --> 00:15:19.610
discussing how stereoscopic technology applies

00:15:19.610 --> 00:15:22.070
to cars on a highway because, well, that is exactly

00:15:22.070 --> 00:15:24.090
where the 2012 patent and the source material

00:15:24.090 --> 00:15:25.929
focused. Right. But I want to leave you with

00:15:25.929 --> 00:15:28.590
a final thought to mull over long after you finish

00:15:28.590 --> 00:15:31.649
listening. Oh, this is a good one. If stereoscopic

00:15:31.649 --> 00:15:34.590
cameras mounted on a simple roadside pole can

00:15:34.590 --> 00:15:38.110
calculate exact speed, trajectory, and vehicular

00:15:38.110 --> 00:15:41.269
data entirely through passive observation, what

00:15:41.269 --> 00:15:43.409
happens when this passive tracking capability

00:15:43.409 --> 00:15:46.049
shrinks even further? Right, because tech always

00:15:46.049 --> 00:15:48.750
gets smaller. Exactly. What happens when the

00:15:48.750 --> 00:15:51.049
processing power gets even cheaper, the lenses

00:15:51.049 --> 00:15:54.330
get even smaller, and the stereoscopic math scales

00:15:54.330 --> 00:15:58.490
down to track human movement? That's wild to

00:15:58.490 --> 00:16:00.799
think about. How does our world change when the

00:16:00.799 --> 00:16:03.100
environment around us can calculate the speed,

00:16:03.259 --> 00:16:05.440
distance, and trajectory of people moving through

00:16:05.440 --> 00:16:08.639
a public square, a shopping mall, or a simple

00:16:08.639 --> 00:16:11.299
city sidewalk just as effortlessly as it tracks

00:16:11.299 --> 00:16:13.460
a car on the interstate? It completely changes

00:16:13.460 --> 00:16:15.539
privacy, for one. It really does. Just something

00:16:15.539 --> 00:16:17.200
to think about the next time you walk past a

00:16:17.200 --> 00:16:17.419
camera.
