WEBVTT

00:00:00.000 --> 00:00:03.350
I mean, you are likely sitting, uh... mere inches

00:00:03.350 --> 00:00:05.490
away from a device right now that is capable

00:00:05.490 --> 00:00:07.870
of translating the physical pressure of the air

00:00:07.870 --> 00:00:10.429
around you into a voltage. And then, you know,

00:00:10.509 --> 00:00:13.009
it beams that to space and back in milliseconds.

00:00:13.269 --> 00:00:15.070
Right. Which is just staggering when you really

00:00:15.070 --> 00:00:16.710
pause to think about it. It is. If you're at

00:00:16.710 --> 00:00:18.710
a desk or, you know, holding your phone, you're

00:00:18.710 --> 00:00:20.629
staring right at it. It probably just looks like

00:00:20.629 --> 00:00:23.530
a tiny pinhole or maybe a sleek metal cylinder.

00:00:23.850 --> 00:00:26.170
You click unmute on a video call. The little

00:00:26.170 --> 00:00:28.670
icon turns green and instantly your voice is

00:00:28.670 --> 00:00:31.629
transported across the globe. We just expect

00:00:31.629 --> 00:00:34.719
it to work. Yeah, it feels entirely magical to

00:00:34.719 --> 00:00:37.979
us now. I mean, the translation of physical invisible

00:00:37.979 --> 00:00:40.320
air pressure into structured digital information

00:00:40.320 --> 00:00:42.859
is something modern society just treats as a

00:00:42.859 --> 00:00:44.960
basic given. But when you actually stop and look

00:00:44.960 --> 00:00:47.060
at how that invisible transfer happens, how we

00:00:47.060 --> 00:00:49.600
capture sound and turn it into electricity, it

00:00:49.600 --> 00:00:52.829
is anything but simple. It is a wild, messy story.

00:00:53.070 --> 00:00:55.810
So today we are taking a deep dive into a massive

00:00:55.810 --> 00:00:58.450
set of source materials, mostly centered around

00:00:58.450 --> 00:01:01.090
this incredibly comprehensive Wikipedia article

00:01:01.090 --> 00:01:04.489
simply titled Microphone. Which is quite the

00:01:04.489 --> 00:01:06.890
read. Oh, it really is. And our mission for you

00:01:06.890 --> 00:01:09.849
today is to trace this incredible journey. We

00:01:09.849 --> 00:01:13.650
are going to explore how humanity went from using

00:01:14.409 --> 00:01:17.969
basic acoustic cones in ancient theaters to eventually

00:01:17.969 --> 00:01:21.530
capturing the literal heartbeat of a snail. Yeah,

00:01:21.689 --> 00:01:24.549
and along the way we will uncover the invisible

00:01:24.549 --> 00:01:27.209
three-dimensional geometry of sound that is

00:01:27.209 --> 00:01:29.670
happening all around you right now. Because the

00:01:29.670 --> 00:01:32.730
technology inside that little unmute button relies

00:01:32.730 --> 00:01:36.090
on physics that really challenge our basic understanding

00:01:36.090 --> 00:01:38.609
of how the physical world operates. Okay, let's

00:01:38.609 --> 00:01:41.170
unpack this, because to really appreciate the

00:01:41.170 --> 00:01:43.450
sleek technology sitting on your desk, we have

00:01:43.450 --> 00:01:45.349
to go back to the beginning. And the beginning

00:01:45.349 --> 00:01:48.010
of capturing and projecting sound was frankly

00:01:48.010 --> 00:01:50.209
bizarre. It was highly contested, incredibly

00:01:50.209 --> 00:01:53.390
lucrative, and in some early cases, extremely

00:01:53.390 --> 00:01:55.510
dangerous. It really was. If we look way back

00:01:55.510 --> 00:01:57.510
in the historical record, the fundamental problem

00:01:57.510 --> 00:01:59.659
was always volume. Right. Like, how do you get

00:01:59.659 --> 00:02:02.019
a fragile human voice to reach a massive group

00:02:02.019 --> 00:02:04.140
of people? Without just screaming yourself hoarse.

00:02:04.400 --> 00:02:07.200
Exactly. So the earliest known attempts were

00:02:07.200 --> 00:02:10.800
just acoustic megaphones. We're talking 5th century

00:02:10.800 --> 00:02:14.580
BC Greece. Actors in these large outdoor amphitheaters

00:02:14.580 --> 00:02:17.819
wore these massive heavy theater masks, and they

00:02:17.819 --> 00:02:20.900
had specially designed horn-shaped mouth openings

00:02:20.900 --> 00:02:23.199
carved right into them. So the mask itself was

00:02:23.199 --> 00:02:25.960
basically a giant acoustic amplifier. Precisely.

00:02:26.030 --> 00:02:28.229
That makes sense, just physically funneling the

00:02:28.229 --> 00:02:30.889
air. And then, jumping ahead, you have Robert

00:02:30.889 --> 00:02:33.650
Hooke, the English physicist in the 17th century,

00:02:34.210 --> 00:02:36.569
creating essentially the first tin can telephone

00:02:36.569 --> 00:02:39.129
using stretched wire and cups. But that's still

00:02:39.129 --> 00:02:41.430
all just moving physical sound waves around through

00:02:41.430 --> 00:02:43.210
tension and funnels. Right. It's mechanical.

00:02:43.330 --> 00:02:45.569
The real leap, the moment everything completely

00:02:45.569 --> 00:02:48.330
changed, was when humanity decided to mix sound

00:02:48.330 --> 00:02:50.310
with electricity. Which is where the danger comes

00:02:50.310 --> 00:02:52.409
in. Oh, absolutely. That transition to electrical

00:02:52.409 --> 00:02:54.789
telecommunication is where things get genuinely

00:02:54.789 --> 00:02:58.159
chaotic. By 1876, you have Alexander Graham Bell

00:02:58.159 --> 00:03:00.319
and Elisha Gray, who developed what was called

00:03:00.319 --> 00:03:02.360
the liquid transmitter. OK, this is the part

00:03:02.360 --> 00:03:04.319
of the source material that absolutely blew my

00:03:04.319 --> 00:03:06.379
mind. When I was reading this, it sounded like

00:03:06.379 --> 00:03:08.639
a dangerous high school chemistry experiment

00:03:08.639 --> 00:03:12.560
gone horribly wrong because early telecommunication

00:03:12.560 --> 00:03:15.479
literally relied on talking into a cup of acid.

00:03:16.039 --> 00:03:19.439
It was a very literal, highly physical mechanism.

00:03:20.120 --> 00:03:22.819
The liquid transmitter consisted of a metal cup

00:03:22.819 --> 00:03:25.569
filled with water. And they mixed in a small

00:03:25.569 --> 00:03:28.750
amount of sulfuric acid to make that water electrically

00:03:28.750 --> 00:03:31.330
conductive. Because plain water isn't quite conductive

00:03:31.330 --> 00:03:33.550
enough for this, right? Right. And then a diaphragm,

00:03:33.689 --> 00:03:35.610
which is the membrane that vibrates when sound

00:03:35.610 --> 00:03:39.050
waves hit, was attached to a small needle. And

00:03:39.050 --> 00:03:42.189
that needle sat just barely submerged inside

00:03:42.189 --> 00:03:44.430
this acid solution. So let me get this straight.

00:03:44.610 --> 00:03:48.240
You speak. Your voice physically pushes the diaphragm,

00:03:48.340 --> 00:03:50.500
and the diaphragm pushes this needle up and down

00:03:50.500 --> 00:03:52.680
in the acid. Exactly. And as the needle moved

00:03:52.680 --> 00:03:55.699
up and down, the size of the water meniscus...

00:03:55.860 --> 00:03:58.319
The meniscus, meaning the... The little curve

00:03:58.319 --> 00:04:01.379
of liquid clinging to the metal needle. Oh, right.

00:04:01.520 --> 00:04:04.699
OK. So that meniscus would change in size. And

00:04:04.699 --> 00:04:06.840
that physical movement altered the electrical

00:04:06.840 --> 00:04:09.539
resistance in the circuit. That fluctuating resistance

00:04:09.539 --> 00:04:12.240
created a fluctuating electrical current that

00:04:12.240 --> 00:04:14.300
perfectly matched the sound waves of the voice.
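
NOTE
Editor's sketch, not part of the audio. The liquid transmitter described above is a
variable resistor in a battery circuit, so Ohm's law turns the changing resistance
into a changing current. The battery voltage, resting resistance, and swing below are
made-up illustrative numbers, not figures from the source, and loop_current is a
hypothetical helper name.
```python
import math
def loop_current(battery_volts, rest_ohms, swing_ohms, phase):
    # Assumption: resistance rises and falls sinusoidally with diaphragm position.
    resistance = rest_ohms + swing_ohms * math.sin(phase)
    # Ohm's law: a fixed voltage across a changing resistance gives a changing current.
    return battery_volts / resistance
# Eight samples across one cycle of a tone (illustrative values only).
currents = [loop_current(6.0, 300.0, 30.0, 2 * math.pi * n / 8) for n in range(8)]
```
The current dips when the resistance peaks and rises when it drops, which is the
sense in which the current follows the voice.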

00:04:14.580 --> 00:04:17.819
That is wild. It is. The famous first phone conversation

00:04:17.819 --> 00:04:20.360
between Bell and Watson, the "Mr. Watson, come

00:04:20.360 --> 00:04:22.740
here" moment, that was conducted using a sloshing

00:04:22.740 --> 00:04:25.379
water microphone just like this. It's incredible

00:04:25.379 --> 00:04:28.319
to think about, but obviously carrying around

00:04:28.319 --> 00:04:30.699
a cup of highly corrosive sulfuric acid isn't

00:04:30.699 --> 00:04:33.139
exactly practical for consumer electronics or

00:04:33.139 --> 00:04:35.620
like putting a phone in everyone's living room.

00:04:35.959 --> 00:04:39.100
Which brings us to the late 1870s and one of

00:04:39.100 --> 00:04:41.439
the most bitter patent wars in technological

00:04:41.439 --> 00:04:44.100
history, the battle over the carbon microphone.

00:04:44.420 --> 00:04:46.800
The stakes were astronomical here. Right. You

00:04:46.800 --> 00:04:49.160
have David Edward Hughes, Thomas Edison and Emile

00:04:49.160 --> 00:04:51.319
Berliner all fighting viciously for the credit.

00:04:51.819 --> 00:04:53.920
We really have to remember the context of the

00:04:53.920 --> 00:04:57.319
1870s here. Long distance communication was the

00:04:57.319 --> 00:04:59.639
space race of that era. The financial stakes

00:04:59.639 --> 00:05:02.220
were almost unimaginable. Whoever controlled

00:05:02.220 --> 00:05:04.959
the patent for a clear, reliable telephone signal

00:05:04.959 --> 00:05:07.480
was going to control global commerce. So who

00:05:07.480 --> 00:05:10.040
actually got there first? Well, Hughes actually

00:05:10.040 --> 00:05:12.759
demonstrated a working carbon microphone first.

00:05:13.500 --> 00:05:16.120
But Edison possessed a formidable legal team

00:05:16.120 --> 00:05:18.759
and eventually secured the patent after a massive

00:05:18.759 --> 00:05:21.920
dispute. And then Berliner refined the design

00:05:21.920 --> 00:05:24.319
for Bell's telephone company just to keep them

00:05:24.319 --> 00:05:27.470
in the race. But why was the carbon microphone

00:05:27.470 --> 00:05:29.990
worth fighting over in the first place? Like,

00:05:30.069 --> 00:05:32.709
what made a capsule full of crushed carbon so

00:05:32.709 --> 00:05:35.350
special compared to the acid bath? It comes down

00:05:35.350 --> 00:05:38.370
to a monumental breakthrough in physics, specifically

00:05:38.370 --> 00:05:41.470
a concept called a Brown's Relay. Okay, lay that

00:05:41.470 --> 00:05:44.449
out for us. So a carbon microphone uses a small

00:05:44.449 --> 00:05:47.029
capsule filled with loose carbon granules pressed

00:05:47.029 --> 00:05:49.949
between two metal plates. When sound waves hit

00:05:49.949 --> 00:05:52.870
the diaphragm, it physically squeezes those carbon

00:05:52.870 --> 00:05:55.050
granules together. OK. When they're compressed,

00:05:55.509 --> 00:05:57.509
the surface area contact between the millions

00:05:57.509 --> 00:06:00.069
of little granules increases, which lowers the

00:06:00.069 --> 00:06:02.629
electrical resistance. And then when the diaphragm

00:06:02.629 --> 00:06:05.230
pulls back, the granules loosen up and the resistance

00:06:05.230 --> 00:06:06.949
goes up. Wait, I want to make sure I understand

00:06:06.949 --> 00:06:08.410
this, because it sounds like it's just changing

00:06:08.410 --> 00:06:10.470
resistance again, exactly like the liquid mic

00:06:10.470 --> 00:06:12.910
did with the needle. It is changing resistance,

00:06:13.129 --> 00:06:15.930
yes. But the carbon microphone doesn't just pass

00:06:15.930 --> 00:06:20.009
a weak signal along. It acts as an active amplifier.

00:06:21.199 --> 00:06:24.079
Think of a tiny sound wave as a person turning

00:06:24.079 --> 00:06:27.199
a valve on a massive city fire hydrant. Ooh,

00:06:27.199 --> 00:06:30.240
I like this. The person's fragile voice isn't

00:06:30.240 --> 00:06:32.379
providing the immense water pressure. The city's

00:06:32.379 --> 00:06:35.259
electrical grid is doing that. The voice is just

00:06:35.389 --> 00:06:38.490
the hand turning the valve, releasing a massive

00:06:38.490 --> 00:06:41.189
surge of electricity to push the signal miles

00:06:41.189 --> 00:06:43.889
down a wire. Oh, that is a brilliant way to visualize

00:06:43.889 --> 00:06:46.129
it. It's taking a whisper and using it to turn

00:06:46.129 --> 00:06:48.550
a massive electrical faucet on and off. Exactly.

00:06:48.670 --> 00:06:50.730
I mean, before the invention of vacuum tubes

00:06:50.730 --> 00:06:53.290
or digital amps, this was the only way to make

00:06:53.290 --> 00:06:55.790
long distance telephone calls possible. The sound

00:06:55.790 --> 00:06:58.269
of your voice was physically modulating a massive

00:06:58.269 --> 00:07:00.509
electrical signal. And that amplification is

00:07:00.509 --> 00:07:02.629
literally what connected the continents. But

00:07:02.629 --> 00:07:05.279
and this is a big but, while carbon mics were

00:07:05.279 --> 00:07:07.759
incredibly durable for early phones. I mean,

00:07:07.839 --> 00:07:10.079
you could drop them, slam them down in anger.

00:07:10.300 --> 00:07:13.259
The classic angry hang up. Right. But the audio

00:07:13.259 --> 00:07:16.139
quality was famously terrible. It sounded muffled,

00:07:16.500 --> 00:07:19.240
scratchy, and really harsh. So as the world moved

00:07:19.240 --> 00:07:22.360
into the 1920s, the era of radio broadcasting

00:07:22.360 --> 00:07:25.160
and commercial music recording demanded something

00:07:25.160 --> 00:07:28.300
entirely different. The industry needed high

00:07:28.300 --> 00:07:31.189
fidelity. And that demand leads us directly to

00:07:31.189 --> 00:07:33.509
the big three microphone technologies that still

00:07:33.509 --> 00:07:35.870
completely dominate the recording industry today.

00:07:36.310 --> 00:07:39.370
If you walk into a multimillion dollar recording

00:07:39.370 --> 00:07:41.310
studio right now, you are essentially looking

00:07:41.310 --> 00:07:44.689
at technology invented in the 1910s and 1920s.

00:07:44.790 --> 00:07:47.389
Yeah, the foundation of modern studio sound began

00:07:47.389 --> 00:07:50.050
with the condenser microphone. This was invented

00:07:50.050 --> 00:07:53.529
by E. C. Wente at Western Electric in 1916. And

00:07:53.529 --> 00:07:56.310
this was the first major leap toward truly lifelike

00:07:56.310 --> 00:07:58.610
fidelity. So this one throws out the carbon entirely

00:07:58.610 --> 00:08:00.790
and relies on static electricity, right? It operates

00:08:00.790 --> 00:08:03.569
on the principle of capacitance. Think of a capacitor

00:08:03.569 --> 00:08:05.930
as two metal plates sitting very close together,

00:08:06.370 --> 00:08:09.029
separated by a tiny gap of air holding an electrical

00:08:09.029 --> 00:08:12.149
charge. OK, two charged plates. Right. In a condenser

00:08:12.149 --> 00:08:15.410
microphone, one plate is fixed and solid. The

00:08:15.410 --> 00:08:17.569
other plate is the diaphragm itself, which is

00:08:17.569 --> 00:08:19.930
made of a material so incredibly thin and lightweight

00:08:19.930 --> 00:08:22.639
it's almost microscopic. When sound waves hit

00:08:22.639 --> 00:08:25.439
that diaphragm, it vibrates, which changes the

00:08:25.439 --> 00:08:27.540
physical distance between those two charged plates.
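
NOTE
Editor's sketch, not part of the audio. The two-plate picture in this stretch of the
conversation can be put in numbers with the parallel-plate formula C = epsilon_0 * A / d.
Assuming the charge on the plates stays roughly fixed, the voltage V = Q / C moves with
the gap. The plate area, gap, and 48 V bias are illustrative assumptions, not
measurements of any real capsule.
```python
EPSILON_0 = 8.854e-12  # permittivity of free space, farads per metre
def capacitance(area_m2, gap_m):
    # Parallel-plate approximation: a smaller gap means a larger capacitance.
    return EPSILON_0 * area_m2 / gap_m
def plate_voltage(charge_c, area_m2, gap_m):
    # With the charge held fixed, voltage is charge divided by capacitance.
    return charge_c / capacitance(area_m2, gap_m)
AREA = 1.0e-4     # hypothetical 1 square centimetre diaphragm
REST_GAP = 25e-6  # hypothetical 25 micrometre resting gap
charge = 48.0 * capacitance(AREA, REST_GAP)  # charge stored at an assumed 48 V bias
v_rest = plate_voltage(charge, AREA, REST_GAP)
v_closer = plate_voltage(charge, AREA, REST_GAP - 1e-6)  # diaphragm pushed 1 micrometre in
```
Under these assumptions a one-micrometre movement shifts the voltage by roughly two
volts, which hints at why such a light diaphragm can register tiny pressure changes.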

00:08:27.740 --> 00:08:29.819
And because they are holding a charge, changing

00:08:29.819 --> 00:08:32.419
that tiny distance changes the capacitance, which

00:08:32.419 --> 00:08:34.840
generates our audio signal. I like to picture

00:08:34.840 --> 00:08:37.059
it this way for you listening. A condenser mic

00:08:37.059 --> 00:08:39.419
is like a hypersensitive, ultra-lightweight

00:08:39.419 --> 00:08:42.259
trampoline. It has virtually no mass, so it picks

00:08:42.259 --> 00:08:45.399
up every single breath, every lip smack, every

00:08:45.210 --> 00:08:48.009
tiny acoustic nuance in the room. And that lack

00:08:48.009 --> 00:08:50.929
of mass is the secret to its detail. It has zero

00:08:50.929 --> 00:08:53.230
inertia holding it back, so it reacts instantly

00:08:53.230 --> 00:08:55.690
to the smallest changes in air pressure. That's

00:08:55.690 --> 00:08:58.029
why legendary mics like the Neumann U 47, which

00:08:58.029 --> 00:09:00.750
was introduced in 1949 with a vacuum tube amplifier,

00:09:01.269 --> 00:09:04.269
they set the gold standard for those warm, incredibly

00:09:04.269 --> 00:09:07.190
intimate vocal recordings we associate with classic

00:09:07.190 --> 00:09:09.870
studio albums. Like Sinatra. Sinatra. The Beatles,

00:09:09.870 --> 00:09:12.289
they relied completely on that sensitivity. But

00:09:12.289 --> 00:09:14.289
that sensitivity comes with a heavy trade-off.

00:09:14.519 --> 00:09:17.460
Condensers are super delicate, and because they

00:09:17.460 --> 00:09:19.500
need that electrical charge across the plates

00:09:19.500 --> 00:09:22.259
before you even make a sound, they require an

00:09:22.259 --> 00:09:24.559
external power source. Right, which the industry

00:09:24.559 --> 00:09:27.600
calls phantom power. Right. Usually, 48 volts

00:09:27.600 --> 00:09:30.039
sent right up the microphone cable. So if you're

00:09:30.039 --> 00:09:32.639
in a loud, chaotic, dangerous environment, like

00:09:32.639 --> 00:09:36.080
a live heavy metal concert, a delicate condenser

00:09:36.080 --> 00:09:38.379
isn't going to survive the night. Not at all.

00:09:38.740 --> 00:09:40.919
For durability, engineers turned to the second

00:09:40.919 --> 00:09:43.700
of the big three, the dynamic microphone, invented

00:09:43.700 --> 00:09:47.639
around 1923. It abandons charged plates entirely

00:09:47.639 --> 00:09:50.500
and operates on electromagnetic induction. This

00:09:50.500 --> 00:09:52.480
is where we get the rugged tanks of the audio

00:09:52.480 --> 00:09:54.960
world. The most famous example from the sources

00:09:54.960 --> 00:09:58.340
is the Shure SM57, which was released in 1965.

00:09:59.139 --> 00:10:01.940
You can put an SM57 half an inch away from a

00:10:01.940 --> 00:10:04.320
screaming guitar amp or a snare drum, you can

00:10:04.320 --> 00:10:06.299
drop it down a flight of stairs, and it won't

00:10:06.299 --> 00:10:08.399
distort or break. It's basically indestructible.

00:10:08.480 --> 00:10:10.000
But how does it handle that much punishment?

00:10:10.220 --> 00:10:13.539
Well, instead of fragile plates, a dynamic microphone,

00:10:13.679 --> 00:10:16.080
which is also known as a moving coil microphone,

00:10:16.559 --> 00:10:18.980
uses a diaphragm attached to a small coil of

00:10:18.980 --> 00:10:22.120
copper wire, and this coil sits suspended inside

00:10:22.120 --> 00:10:25.740
a permanent magnetic field. When a massive sound

00:10:25.740 --> 00:10:28.559
wave hits the diaphragm, it violently pushes

00:10:28.559 --> 00:10:30.620
the coil back and forth through the magnetic

00:10:30.620 --> 00:10:33.139
field. It's essentially a microscopic power plant.

00:10:33.460 --> 00:10:36.500
Exactly. Just like a giant hydroelectric dam

00:10:36.500 --> 00:10:39.460
spins a turbine inside a magnet to generate a

00:10:39.460 --> 00:10:42.399
city's power, your voice acts as the wind, pushing

00:10:42.399 --> 00:10:45.299
that tiny coil through a magnet, literally generating

00:10:45.299 --> 00:10:47.779
a brand new electrical current out of thin air.
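
NOTE
Editor's sketch, not part of the audio. The moving-coil idea above follows the standard
induction relation for a wire moving across a magnetic field: emf = B * l * v. The
field strength, wire length, and coil velocity are illustrative assumptions, and the
helper names are hypothetical.
```python
import math
def coil_emf(flux_density_t, wire_length_m, velocity_m_s):
    # Motional emf for a conductor moving at right angles to the field.
    return flux_density_t * wire_length_m * velocity_m_s
def coil_velocity(peak_velocity_m_s, freq_hz, t_s):
    # Assumption: the diaphragm drives the coil back and forth sinusoidally.
    return peak_velocity_m_s * math.cos(2 * math.pi * freq_hz * t_s)
# Hypothetical capsule: 1 tesla gap field, 5 metres of wire in the coil.
emf_at_peak = coil_emf(1.0, 5.0, coil_velocity(0.001, 1000.0, 0.0))
emf_at_rest = coil_emf(1.0, 5.0, 0.0)
```
A motionless coil produces no voltage at all, which fits the point that the signal
comes entirely from the motion and no external supply is needed.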

00:10:47.909 --> 00:10:49.889
It generates its own voltage entirely, which

00:10:49.889 --> 00:10:52.789
is why it never needs phantom power. It is completely

00:10:52.789 --> 00:10:54.909
self-sufficient. And here's where it gets really

00:10:54.909 --> 00:10:57.009
interesting. Because of how a dynamic mic is

00:10:57.009 --> 00:10:59.129
built, with a membrane attached to a coil sitting

00:10:59.129 --> 00:11:01.909
in a magnet, it is functionally identical to

00:11:01.909 --> 00:11:04.669
a loudspeaker. It is the exact same machinery,

00:11:04.850 --> 00:11:07.029
just operating in reverse. Yeah, the principle

00:11:07.029 --> 00:11:09.950
of reciprocity applies perfectly here. A speaker

00:11:09.950 --> 00:11:12.350
takes an electrical current, pushes a membrane,

00:11:12.730 --> 00:11:15.909
and moves air to create sound. A dynamic mic

00:11:16.110 --> 00:11:18.909
takes moving air, pushes a membrane and creates

00:11:18.909 --> 00:11:22.059
electrical current. Because the physics are identical,

00:11:22.500 --> 00:11:25.480
you can literally wire a speaker in reverse and

00:11:25.480 --> 00:11:27.759
use it to capture sound. Which blew my mind because

00:11:27.759 --> 00:11:29.759
the source notes that audio engineers actually

00:11:29.759 --> 00:11:32.100
do this on purpose. There's a device called the

00:11:32.100 --> 00:11:34.840
Yamaha Subkick. It is literally a six and a half

00:11:34.840 --> 00:11:38.039
inch speaker woofer, the exact kind you'd find

00:11:38.039 --> 00:11:40.720
in a home stereo, mounted inside a drum shell.

00:11:40.759 --> 00:11:43.000
It looks crazy in the studio. And engineers place

00:11:43.000 --> 00:11:45.399
it right in front of a kick drum. Because the

00:11:45.399 --> 00:11:47.820
speaker membrane is so massive and heavy, it

00:11:47.820 --> 00:11:49.970
physically cannot move fast enough to pick up

00:11:49.970 --> 00:11:52.409
the high-pitched sounds of the cymbals. So it

00:11:52.409 --> 00:11:55.429
acts as a natural physical acoustic filter, only

00:11:55.429 --> 00:11:58.070
picking up those massive low-frequency thumps.

00:11:58.230 --> 00:12:00.490
It's literally just a speaker listening instead

00:12:00.490 --> 00:12:02.350
of shouting. It's a brilliant hack of physical

00:12:02.350 --> 00:12:04.909
mass. Now, the third of the big three technologies,

00:12:04.909 --> 00:12:08.370
also introduced around 1923, is the ribbon microphone.

00:12:08.509 --> 00:12:11.610
The fragile ones. Very fragile. It uses a magnetic

00:12:11.610 --> 00:12:14.169
field like the dynamic mic, but instead of a heavy

00:12:14.169 --> 00:12:17.309
coil, it suspends an incredibly thin corrugated

00:12:17.309 --> 00:12:20.070
strip of aluminum ribbon right between the magnets.

00:12:20.450 --> 00:12:22.649
And these are notorious for being the most fragile

00:12:22.649 --> 00:12:24.549
pieces of equipment in a studio. Oh, without

00:12:24.549 --> 00:12:27.509
question. Because the ribbon is mere microns

00:12:27.509 --> 00:12:31.450
thick, it has almost zero mass. This means it

00:12:31.450 --> 00:12:33.590
responds to the raw velocity of the air particles

00:12:33.590 --> 00:12:36.230
moving past it rather than just the pressure.

00:12:36.750 --> 00:12:39.570
It captures a uniquely smooth, highly natural

00:12:39.570 --> 00:12:42.990
sound. But older vintage ribbon mics could literally

00:12:42.990 --> 00:12:45.409
be torn to shreds by a careless gust of wind.

00:12:45.629 --> 00:12:49.460
Wow. Just a strong breeze. Literally. And if

00:12:49.460 --> 00:12:51.620
you accidentally routed that 48-volt phantom

00:12:51.620 --> 00:12:53.580
power we talked about into a vintage ribbon mic,

00:12:53.960 --> 00:12:55.600
the sudden jolt of electricity could stretch

00:12:55.600 --> 00:12:58.779
or snap the metal ribbon instantly. Ouch. So

00:12:58.779 --> 00:13:00.700
we've engineered mics that won't break and mics

00:13:00.700 --> 00:13:03.340
that sound incredibly lifelike. But there's a

00:13:03.340 --> 00:13:05.980
glaring problem. If you put that perfectly engineered

00:13:05.980 --> 00:13:08.500
microphone in a room with a terrible echo or

00:13:08.500 --> 00:13:11.039
on a live stage with screaming fans and blaring

00:13:11.039 --> 00:13:13.379
amplifiers, it's going to beautifully capture

00:13:13.379 --> 00:13:16.690
all that garbage, too. Exactly. Mastering the

00:13:16.690 --> 00:13:18.870
internal tech was really only half the battle.

00:13:19.889 --> 00:13:21.850
Engineers had to figure out how to control the

00:13:21.850 --> 00:13:24.470
invisible geometry around the mic to tell it

00:13:24.470 --> 00:13:27.309
what to ignore. This introduces the concept of

00:13:27.309 --> 00:13:29.889
polar patterns. These are the invisible three-dimensional

00:13:29.889 --> 00:13:31.970
shapes of sensitivity surrounding

00:13:31.970 --> 00:13:34.929
a microphone capsule. They basically dictate

00:13:34.929 --> 00:13:37.519
the spatial awareness of the device. Let's start

00:13:37.519 --> 00:13:39.960
with the baseline: omnidirectional. Right. Just

00:13:39.960 --> 00:13:43.039
like it sounds, it hears a perfect 360-degree

00:13:43.039 --> 00:13:45.879
sphere around the capsule. And the source notes

00:13:45.879 --> 00:13:48.539
that omnidirectional mics offer the purest sound

00:13:48.539 --> 00:13:51.200
with the lowest coloration, and they can capture

00:13:51.200 --> 00:13:54.340
incredibly deep bass all the way down to 20 Hz,

00:13:54.720 --> 00:13:57.179
because they are purely sensing changes in raw

00:13:57.179 --> 00:13:59.460
air pressure from any angle. They don't have

00:13:59.460 --> 00:14:02.259
to play any complex acoustic tricks or use physical

00:14:02.259 --> 00:14:05.100
blockages to filter sound, so the frequency response

00:14:05.100 --> 00:14:07.309
is naturally very flat and true to reality.

00:14:07.509 --> 00:14:08.850
Okay, but I have to push back on this. Sure.

00:14:09.009 --> 00:14:11.789
If omnidirectional microphones provide the absolute

00:14:11.789 --> 00:14:14.610
purest sound, the most accurate bass, and basically

00:14:14.610 --> 00:14:17.129
no distortion, why don't we just use them for

00:14:17.129 --> 00:14:19.070
absolutely everything? Well, we run headfirst

00:14:19.070 --> 00:14:21.570
into the messy reality of the spaces we live

00:14:21.570 --> 00:14:24.480
in. Yes, an omni mic is mathematically pure,

00:14:24.840 --> 00:14:27.059
but imagine you were recording a singer in a

00:14:27.059 --> 00:14:30.940
small square room with bare walls. An omnidirectional

00:14:30.940 --> 00:14:33.360
mic will record the singer perfectly, but it

00:14:33.360 --> 00:14:36.379
will also perfectly capture every single ugly

00:14:36.379 --> 00:14:38.720
metallic echo bouncing off the walls right behind

00:14:38.720 --> 00:14:42.879
it. Ahh. Or consider a live concert stage. If

00:14:42.879 --> 00:14:45.120
the lead singer's microphone is omnidirectional,

00:14:45.519 --> 00:14:47.879
it will pick up the drums, the guitar amps, and

00:14:47.879 --> 00:14:50.659
most dangerously, the stage monitors pointing

00:14:50.659 --> 00:14:52.879
directly back at the singer. Which creates the

00:14:52.879 --> 00:14:55.600
dreaded feedback loop. The mic hears the speaker,

00:14:55.940 --> 00:14:57.879
amplifies it, plays it out the speaker, the mic

00:14:57.879 --> 00:15:00.159
hears it again, and in a fraction of a second,

00:15:00.779 --> 00:15:02.700
everyone in the audience is covering their ears

00:15:02.700 --> 00:15:05.700
from that piercing screech. It's terrible. To

00:15:05.700 --> 00:15:07.820
solve that, engineers had to figure out how

00:15:07.820 --> 00:15:10.360
to make microphones that only listen in one specific

00:15:10.360 --> 00:15:12.620
direction. And the foundation of directional

00:15:12.620 --> 00:15:15.059
listening is the figure eight, or bidirectional

00:15:15.059 --> 00:15:17.559
pattern. Ribbon microphones actually do this

00:15:17.559 --> 00:15:19.720
naturally just by their physical design. Wait,

00:15:19.720 --> 00:15:21.059
let me see if I can picture this based on what

00:15:21.059 --> 00:15:23.960
we just talked about. The ribbon is suspended

00:15:23.960 --> 00:15:26.620
between magnets, meaning it's totally open to

00:15:26.620 --> 00:15:29.059
the air on the front and the back, but the sides

00:15:29.059 --> 00:15:31.120
are blocked by the magnets themselves. Right.

00:15:31.360 --> 00:15:33.679
So if a sound wave comes from the exact left

00:15:33.679 --> 00:15:36.980
or right, It hits both the front and back of

00:15:36.980 --> 00:15:39.799
the ribbon at the exact same millisecond. That

00:15:39.799 --> 00:15:42.259
is exactly what happens. It creates a state of

00:15:42.259 --> 00:15:45.080
acoustic tug of war. The air pressure pushes

00:15:45.080 --> 00:15:47.500
on the front and the back equally at the exact

00:15:47.500 --> 00:15:50.620
same time. The forces cancel each other out completely,

00:15:50.620 --> 00:15:53.039
and the ribbon doesn't move a single millimeter.
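
NOTE
Editor's sketch, not part of the audio. The figure-eight behaviour described above is
commonly modelled as a cosine of the arrival angle: full sensitivity at the front,
nothing at the sides, and full sensitivity with inverted polarity at the rear. This
is the idealised textbook pattern, not a measurement of any particular ribbon mic, and
figure_eight is a hypothetical helper name.
```python
import math
def figure_eight(angle_deg):
    # Idealised bidirectional pattern: the response follows the cosine of the angle.
    return math.cos(math.radians(angle_deg))
front = figure_eight(0)    # sound arriving head-on
side = figure_eight(90)    # sound arriving edge-on to the ribbon
rear = figure_eight(180)   # sound arriving from behind, polarity inverted
```
The side value comes out at zero apart from floating-point rounding, which is the
tug of war in numbers.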

00:15:53.200 --> 00:15:55.779
That is incredibly elegant. It's mathematically

00:15:55.779 --> 00:15:58.179
blind to the sides. You could place a figure-eight

00:15:58.179 --> 00:16:00.679
mic right above a drum kit, point the

00:16:00.679 --> 00:16:02.460
blind sides at the loudest cymbals, and it will

00:16:02.460 --> 00:16:04.940
just effectively ignore them. It's a lifesaver

00:16:04.940 --> 00:16:06.919
in the studio. But the sources highlight that

00:16:06.919 --> 00:16:09.240
the most important shape in all of audio engineering

00:16:09.240 --> 00:16:12.340
is the cardioid pattern. Cardioid meaning heart-shaped.

00:16:12.340 --> 00:16:15.279
It listens closely to the front, captures

00:16:15.279 --> 00:16:18.299
a little bit on the sides, and completely rejects

00:16:18.299 --> 00:16:20.639
sound coming from the rear. And this is the pattern

00:16:20.639 --> 00:16:23.480
that completely revolutionized live sound. It's

00:16:23.480 --> 00:16:26.799
why the Shure SM58 is the undisputed king of

00:16:26.799 --> 00:16:29.659
live vocals. The singer sings into the front

00:16:29.659 --> 00:16:32.539
of the heart shape, and the stage monitors blasting

00:16:32.539 --> 00:16:34.679
at the back of the microphone just fall into

00:16:34.679 --> 00:16:38.340
the rejection zone. No feedback. But how do you

00:16:38.340 --> 00:16:40.500
actually create a heart-shaped listening zone

00:16:40.500 --> 00:16:43.269
out of thin air? It involves an elegant trick

00:16:43.269 --> 00:16:46.870
of physics called superposition. Engineers designed

00:16:46.870 --> 00:16:49.070
the capsule to electronically or acoustically

00:16:49.070 --> 00:16:51.789
combine an omnidirectional pattern with a figure

00:16:51.789 --> 00:16:54.330
8 pattern. You lay those two invisible shapes

00:16:54.330 --> 00:16:56.450
directly over each other. Hold on, I need an

00:16:56.450 --> 00:16:58.750
explanation here. You're saying a positive signal

00:16:58.750 --> 00:17:01.070
from one pattern and a negative signal from another

00:17:01.070 --> 00:17:03.210
combine and the sound just completely disappears.

00:17:03.629 --> 00:17:06.450
How does that work in the actual air? Okay, imagine

00:17:06.450 --> 00:17:09.069
two identical water waves in a pool crashing

00:17:09.069 --> 00:17:11.750
into each other. If the high peak of one wave

00:17:11.750 --> 00:17:13.950
perfectly hits the low trough of the other wave,

00:17:14.309 --> 00:17:16.869
the forces neutralize and the water goes completely

00:17:16.869 --> 00:17:20.130
flat. Like they just erase each other. Exactly.

00:17:20.549 --> 00:17:22.710
Engineers are doing that, but with invisible

00:17:22.710 --> 00:17:25.269
sound pressure. The positive signal from the

00:17:25.269 --> 00:17:27.569
front of the figure 8 adds to the omni, boosting

00:17:27.569 --> 00:17:30.049
the front's sensitivity. But the rear of a figure

00:17:30.049 --> 00:17:33.650
8 is out of phase. It's a negative trough. When

00:17:33.650 --> 00:17:35.769
that negative trough hits the positive peak of

00:17:35.769 --> 00:17:38.230
the omni signal at the back, they cancel each

00:17:38.230 --> 00:17:41.660
other out entirely. Flat water. Absolute silence

00:17:41.660 --> 00:17:44.200
at the rear of the mic. That is genuinely brilliant

00:17:44.200 --> 00:17:47.140
engineering. What's fascinating here is the physical

00:17:47.140 --> 00:17:49.279
side effect of this directional listening, which

00:17:49.279 --> 00:17:51.200
is a phenomenon called the proximity effect.
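
NOTE
Editor's sketch, not part of the audio. The superposition trick described a few cues
earlier, an omni pattern laid over a figure-eight, can be written as an equal-weight
sum of the two idealised patterns. The 50/50 weighting is the textbook cardioid and
is assumed here; real capsules only approximate it. All function names are hypothetical.
```python
import math
def omni(angle_deg):
    # Idealised omnidirectional pattern: the same response from every angle.
    return 1.0
def figure_eight(angle_deg):
    # Idealised bidirectional pattern: positive at the front, negative at the rear.
    return math.cos(math.radians(angle_deg))
def cardioid(angle_deg):
    # Equal mix of the two: they add at the front and cancel at the rear.
    return 0.5 * (omni(angle_deg) + figure_eight(angle_deg))
front, side, rear = cardioid(0), cardioid(90), cardioid(180)
```
In this idealised model the side response is half the front response and the rear
response is zero, matching the description of a little pickup on the sides and
rejection at the back.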

00:17:51.319 --> 00:17:54.440
Because cardioid microphones rely on these complex

00:17:54.440 --> 00:17:56.500
pressure gradients to create their directionality,

00:17:57.000 --> 00:17:58.920
whenever a sound source gets extremely close

00:17:58.920 --> 00:18:00.940
to the capsule, like within a few centimeters,

00:18:01.440 --> 00:18:03.660
the physics of the gradient shift, causing a

00:18:03.660 --> 00:18:06.359
massive artificial boost in the lowest bass frequencies.

00:18:06.740 --> 00:18:10.039
Oh, that explains the classic radio DJ voice.

00:18:10.180 --> 00:18:12.940
That's the one. That's why podcasters and broadcasters

00:18:12.940 --> 00:18:15.940
always sound so incredibly deep, booming, and

00:18:15.940 --> 00:18:18.200
resonant. They are getting right up on the mic,

00:18:18.500 --> 00:18:20.500
intentionally triggering that physical proximity

00:18:20.500 --> 00:18:22.700
effect to boost their bass. They are playing

00:18:22.700 --> 00:18:24.920
the physics of the capsule like a musical instrument.
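
NOTE
A rough sketch of why getting close boosts the bass, under assumptions that are not in the source: a pure pressure-gradient (figure-8) element, a point source radiating spherical waves, and a speed of sound of 343 m/s. For that idealized case the gradient grows by a factor of sqrt(1 + 1/(kr)^2) relative to a distant source, where k = 2*pi*f/c and r is the distance. A cardioid is only partly a gradient capsule, so its boost would be smaller than this.
```python
import math
SPEED_OF_SOUND = 343.0  # m/s in air at roughly room temperature (assumed value)
def proximity_boost_db(freq_hz: float, distance_m: float) -> float:
    """Boost in dB of an ideal pressure-gradient capsule near a point source,
    relative to the same capsule in a distant, plane-wave sound field."""
    kr = 2 * math.pi * freq_hz * distance_m / SPEED_OF_SOUND
    return 20 * math.log10(math.sqrt(1 + 1 / kr**2))
# Compare a voice 5 cm from the capsule with one 50 cm away.
for freq in (50, 200, 1000):
    near = proximity_boost_db(freq, 0.05)
    far = proximity_boost_db(freq, 0.5)
    print(freq, round(near, 1), round(far, 1))
```
In this model the boost grows as the frequency drops and as the source moves closer, which matches the deep "radio voice" described above.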

00:18:25.339 --> 00:18:28.440
They absolutely are. Now, if you need extreme

00:18:28.440 --> 00:18:31.019
directionality, say, recording dialogue on a

00:18:31.019 --> 00:18:32.980
movie set where the boom pole is 10 feet away

00:18:32.980 --> 00:18:35.920
from the actor, you use a shotgun microphone.

00:18:36.299 --> 00:18:39.000
These are those long, futuristic metal wands,

00:18:39.119 --> 00:18:41.400
and the way they narrow their focus is amazing.

00:18:41.539 --> 00:18:44.480
They use an interference tube. The actual microphone

00:18:44.480 --> 00:18:47.200
capsule is hidden way down at the base of a long

00:18:47.200 --> 00:18:50.259
tube with tiny slots cut all down the sides.

00:18:50.839 --> 00:18:53.099
Sound coming straight from the actor goes cleanly

00:18:53.099 --> 00:18:55.799
right down the barrel to the mic. But ambient

00:18:55.799 --> 00:18:58.039
distracting sound from the side enters through

00:18:58.039 --> 00:19:00.299
those different slots. And because the side noise

00:19:00.299 --> 00:19:02.339
enters those slots at different physical distances

00:19:02.339 --> 00:19:04.940
along the tube, the sound waves reach the capsule

00:19:04.940 --> 00:19:06.940
at slightly different times. They arrive out

00:19:06.940 --> 00:19:10.160
of phase. Returning to our water analogy, the

00:19:10.160 --> 00:19:12.200
peaks and troughs crash into each other inside

00:19:12.200 --> 00:19:14.380
the tube, canceling each other out before they

00:19:14.380 --> 00:19:16.240
even hit the diaphragm. Of course, if you're

00:19:16.240 --> 00:19:18.519
using a shotgun mic outside on a film set, you

00:19:18.519 --> 00:19:21.099
have to deal with wind. And the source mentions

00:19:21.099 --> 00:19:23.619
some of the best terminology in the entire audio

00:19:23.619 --> 00:19:26.299
industry for wind screens. Because wind isn't

00:19:26.299 --> 00:19:29.079
really sound, right? It's a massive, chaotic,

00:19:29.259 --> 00:19:32.579
physical blast of air that causes violent turbulence

00:19:32.579 --> 00:19:36.599
against the mic capsule. To stop it, engineers

00:19:36.599 --> 00:19:39.819
put the mic inside large, pill-shaped foam cages

00:19:39.819 --> 00:19:43.279
called blimps or zeppelins. And if the wind is

00:19:43.279 --> 00:19:45.910
really howling, they cover the blimp in a thick

00:19:45.910 --> 00:19:49.390
layer of artificial fur. The film industry affectionately

00:19:49.390 --> 00:19:52.890
refers to these furry covers as dead cats or,

00:19:52.890 --> 00:19:55.609
for the smaller microphones on top of consumer

00:19:55.609 --> 00:19:58.630
cameras, dead kittens. It looks absurd. It looks

00:19:58.630 --> 00:20:00.849
exactly like you strapped a fuzzy gray animal

00:20:00.849 --> 00:20:03.049
to the end of a pole. It looks ridiculous. But

00:20:03.049 --> 00:20:04.910
the physics are highly effective. You have a

00:20:04.910 --> 00:20:07.269
massive gust of wind hitting the mic. The thousands

00:20:07.269 --> 00:20:09.829
of individual fur fibers break up that heavy

00:20:09.829 --> 00:20:13.589
blunt-force wind into millions of tiny microvortices.

00:20:14.210 --> 00:20:16.730
Microvortices. Right. It converts the heavy turbulence

00:20:16.730 --> 00:20:19.650
into microscopic friction, absorbing the kinetic

00:20:19.650 --> 00:20:22.549
energy completely silently without blocking the

00:20:22.549 --> 00:20:24.809
actual acoustic sound waves of the actors talking.

00:20:25.109 --> 00:20:27.789
So we've mastered high-fidelity studio recording,

00:20:28.049 --> 00:20:30.390
we've conquered feedback on live stages, and

00:20:30.390 --> 00:20:32.730
we can mathematically filter out wind on a film

00:20:32.730 --> 00:20:37.450
set. But human curiosity never stops. Once we

00:20:37.450 --> 00:20:40.230
understood the baseline physics, scientists started

00:20:40.230 --> 00:20:42.849
pushing audio capture into extreme environments

00:20:42.849 --> 00:20:45.569
where a standard moving diaphragm would be completely

00:20:45.569 --> 00:20:47.609
useless. I mean, what do you do if the thing

00:20:47.609 --> 00:20:49.269
you want to hear doesn't make a sound in the

00:20:49.269 --> 00:20:51.470
air at all? That's where contact microphones

00:20:51.470 --> 00:20:54.079
come in. Instead of picking up fluctuating air

00:20:54.079 --> 00:20:56.880
pressure, they detect solid, physical vibrations.

00:20:57.440 --> 00:20:59.640
The transducer plate is placed directly against

00:20:59.640 --> 00:21:01.859
an object. And the examples in the source are

00:21:01.859 --> 00:21:04.619
stunning. Contact mics are so hypersensitive

00:21:04.619 --> 00:21:06.480
they have been used by researchers to record

00:21:06.480 --> 00:21:09.200
the literal footsteps of ants and the muscular

00:21:09.200 --> 00:21:11.900
heartbeat of a snail. It fundamentally shifts

00:21:11.900 --> 00:21:14.859
what we even consider to be sound. We also have

00:21:14.859 --> 00:21:16.859
environments where metal and electricity are

00:21:16.859 --> 00:21:19.579
actively dangerous. Think about the inside of

00:21:19.579 --> 00:21:22.480
a hospital MRI machine. The magnetic field is

00:21:22.480 --> 00:21:25.460
so massive that a dynamic or ribbon mic would

00:21:25.460 --> 00:21:28.660
be instantly destroyed. Or worse, the magnets

00:21:28.660 --> 00:21:31.380
would turn the metal casing into a lethal projectile.

00:21:31.519 --> 00:21:33.759
Which you definitely do not want in a hospital.

00:21:33.980 --> 00:21:36.599
No. So to solve that they invented the fiber

00:21:36.599 --> 00:21:39.680
optic microphone. There are zero metal or electrical

00:21:39.680 --> 00:21:42.130
parts in the capsule at all. It shoots a tiny

00:21:42.130 --> 00:21:44.869
laser down a glass optical fiber, bounces the

00:21:44.869 --> 00:21:47.049
light off a reflective diaphragm, and sends it

00:21:47.049 --> 00:21:49.549
back up a second fiber. When sound moves that

00:21:49.549 --> 00:21:51.730
diaphragm, it changes the intensity of the reflected

00:21:51.730 --> 00:21:54.150
light. That's so clever. A computer down the hall

00:21:54.150 --> 00:21:56.130
reads the light fluctuations and turns them back

00:21:56.130 --> 00:21:59.589
into audio. Complete 100% immunity to magnetic

00:21:59.589 --> 00:22:02.099
fields. If you want to go a step further and

00:22:02.099 --> 00:22:05.200
eliminate the physical diaphragm entirely, you

00:22:05.200 --> 00:22:07.859
move into the realm of pure laser microphones.

00:22:08.579 --> 00:22:11.099
You can point an infrared laser at a distant

00:22:11.099 --> 00:22:14.079
window pane across the street, measure the microscopic

00:22:14.079 --> 00:22:16.500
vibrations of the solid glass caused by the people

00:22:16.500 --> 00:22:19.319
talking inside the room, and convert those vibrations

00:22:19.319 --> 00:22:22.250
back into perfectly intelligible speech. It's

00:22:22.250 --> 00:22:24.710
straight out of a spy movie. It really is. The

00:22:24.710 --> 00:22:26.930
source even details an experimental laser mic

00:22:26.930 --> 00:22:29.170
that bounces light off a moving stream of smoke

00:22:29.170 --> 00:22:32.890
or vapor in free air. It reads the acoustic disturbances

00:22:32.890 --> 00:22:35.869
in the smoke itself. So what does this all mean?

00:22:35.990 --> 00:22:38.390
If we can literally read the vibration of smoke

00:22:38.390 --> 00:22:40.289
in the air or pull a conversation off a window

00:22:40.289 --> 00:22:42.670
pane a mile away, doesn't this mean that true

00:22:42.670 --> 00:22:45.210
privacy is essentially obsolete? If we connect

00:22:45.210 --> 00:22:47.609
this to the bigger picture, it means our definition

00:22:47.609 --> 00:22:50.569
of the microphone has to radically evolve. They

00:22:50.569 --> 00:22:53.170
began in the 1800s as simple telecommunication

00:22:53.170 --> 00:22:55.630
tools, a brute-force way to shout across long

00:22:55.630 --> 00:22:59.190
distances. But today, microphones act as literal

00:22:59.190 --> 00:23:02.170
sensory extensions of the human body. They allow

00:23:02.170 --> 00:23:04.529
us to perceive the microscopic reality of an

00:23:04.529 --> 00:23:07.250
ant's footprint, or safely listen inside lethal

00:23:07.250 --> 00:23:09.930
radiation zones. They have completely surpassed

00:23:09.930 --> 00:23:11.809
the biological limitations of the human ear.

00:23:11.950 --> 00:23:14.890
The ultimate proof of that is the experimental

00:23:14.890 --> 00:23:17.190
plasma microphone. It doesn't use a membrane,

00:23:17.470 --> 00:23:20.130
or glass, or lasers. It uses a literal electric

00:23:20.130 --> 00:23:23.089
arc of ionized gas. Sound waves hit the plasma,

00:23:23.549 --> 00:23:25.750
change the local air pressure, which alters the

00:23:25.750 --> 00:23:27.509
temperature and the electrical conductance of

00:23:27.509 --> 00:23:30.269
the gas itself. It's capturing sound with a controlled

00:23:30.269 --> 00:23:32.789
lightning bolt. It's a long, long way from a

00:23:32.789 --> 00:23:34.950
needle bouncing in a cup of sulfuric acid. It

00:23:34.950 --> 00:23:37.410
is a profound evolution of our mastery over physics.

00:23:37.680 --> 00:23:40.200
So, the next time you are sitting at your desk

00:23:40.200 --> 00:23:42.220
and you reach out to click that unmute button

00:23:42.220 --> 00:23:44.819
on your morning video call, or you put on your

00:23:44.819 --> 00:23:47.099
favorite headphones to listen to a perfectly

00:23:47.099 --> 00:23:50.599
crisp, intimate vocal on a classic album, take

00:23:50.599 --> 00:23:53.420
a second. You are benefiting from over a century

00:23:53.420 --> 00:23:56.519
of wild, dangerous, and brilliant engineering

00:23:56.519 --> 00:24:00.539
from acid baths and crushed carbon to tiny power

00:24:00.539 --> 00:24:03.279
plants, heart-shaped invisible listening zones,

00:24:03.500 --> 00:24:06.119
and artificial fur. It really leaves you with

00:24:06.119 --> 00:24:08.039
an incredibly interesting question about where

00:24:08.039 --> 00:24:11.259
this trajectory goes next. If we have successfully

00:24:11.259 --> 00:24:13.579
turned lightning bolts, lasers, and fiber optics

00:24:13.579 --> 00:24:16.220
into microphones, what happens when we finally

00:24:16.220 --> 00:24:19.480
crack neural interfaces? In the not-so-distant

00:24:19.480 --> 00:24:21.779
future, we might bypass the physical air pressure

00:24:21.779 --> 00:24:24.700
of sound entirely. The microphone might cease

00:24:24.700 --> 00:24:26.500
to be an external piece of hardware sitting on

00:24:26.500 --> 00:24:29.200
your desk and instead become a direct brain-to-

00:24:29.200 --> 00:24:31.900
digital telepathic transducer. No diaphragm.

00:24:32.000 --> 00:24:34.579
No moving parts. Just pure thought converted

00:24:34.579 --> 00:24:36.960
directly into a signal. Now that is something

00:24:36.960 --> 00:24:37.599
to think about.
