WEBVTT

00:00:00.000 --> 00:00:02.580
Welcome to today's Deep Dive. I am really excited

00:00:02.580 --> 00:00:04.459
to get into this one with you. Yeah, me too.

00:00:04.700 --> 00:00:07.120
We have a pretty fascinating mission today. We

00:00:07.120 --> 00:00:10.019
really do. And to start us off, I want to set

00:00:10.019 --> 00:00:12.439
the scene for you. Our source material today

00:00:12.439 --> 00:00:15.619
starts with what seems like a completely standard,

00:00:15.919 --> 00:00:18.899
incredibly straightforward search. Right, just

00:00:18.899 --> 00:00:21.910
a basic... Exactly. It is a Wikipedia search

00:00:21.910 --> 00:00:25.269
for the list of NHL teams. Which sounds like

00:00:25.269 --> 00:00:28.109
we are about to do a deep dive on hockey history.

00:00:28.329 --> 00:00:30.730
Right. But here's the big twist. The source material

00:00:30.730 --> 00:00:32.929
we have today isn't actually a list of hockey

00:00:32.929 --> 00:00:35.789
teams at all. Not even close. No, it is literally

00:00:35.789 --> 00:00:39.460
the page not found text from Wikipedia, just

00:00:39.460 --> 00:00:41.979
the exact error page you hit when an article

00:00:41.979 --> 00:00:43.899
doesn't exist. And I know that sounds like we

00:00:43.899 --> 00:00:46.060
just hit a dead end, but instead of talking about

00:00:46.060 --> 00:00:48.479
sports today, we are going to take you on a behind

00:00:48.479 --> 00:00:51.439
the scenes tour of the Internet's largest encyclopedia.

00:00:51.820 --> 00:00:53.840
Yeah, because there's so much hiding in plain

00:00:53.840 --> 00:00:57.159
sight on this page. Exactly. By analyzing the

00:00:57.159 --> 00:01:00.079
exact text of this missing page, we can actually

00:01:00.079 --> 00:01:02.560
uncover the hidden architecture, the global community

00:01:02.560 --> 00:01:05.659
campaigns, and all these quirky technical rules

00:01:05.659 --> 00:01:08.700
that govern how human knowledge is cataloged

00:01:08.700 --> 00:01:12.200
online. It is a total goldmine. It really is.

00:01:12.260 --> 00:01:15.019
So let's just jump right into the anatomy of

00:01:15.019 --> 00:01:17.640
a missing page. The very first thing you see

00:01:17.640 --> 00:01:20.700
is the exact error message. Which reads, Wikipedia

00:01:20.700 --> 00:01:23.599
does not have an article with this exact name.

00:01:23.780 --> 00:01:25.780
And that phrasing is so deliberate. The key word

00:01:25.780 --> 00:01:28.280
there is exact. Right, because of the case sensitivity.

00:01:28.680 --> 00:01:31.280
Exactly. Wikipedia search titles are almost entirely

00:01:31.280 --> 00:01:34.260
case sensitive. The fascinating caveat here is

00:01:34.260 --> 00:01:36.239
that it applies to everything except for the

00:01:36.239 --> 00:01:38.219
very first character. So if you capitalize the

00:01:38.219 --> 00:01:41.459
L in List, but then accidentally capitalize the

00:01:41.459 --> 00:01:44.980
T in teams? You hit a brick wall. A simple capitalization

00:01:44.980 --> 00:01:47.140
error leads you straight to this dead end. That

00:01:47.140 --> 00:01:49.159
is wild. You think of Google where you can just

00:01:49.159 --> 00:01:50.959
smash the keyboard and it figures out what you

00:01:50.959 --> 00:01:53.620
mean? Right, but Wikipedia forces you to be precise.

00:01:53.700 --> 00:01:57.140
It is a database, not just a search engine. Okay,

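NOTE
Editor's sketch, not from the episode: the title rule described above (case matters everywhere except the first character) can be illustrated in a few lines of Python. The helper names normalize_title and same_page are hypothetical, and real title handling likely involves more than this.

NOTE
```python
def normalize_title(title: str) -> str:
    """Uppercase only the first character; leave every other character as typed."""
    return title[:1].upper() + title[1:]
def same_page(a: str, b: str) -> bool:
    """Two titles name the same page only if they match after normalization."""
    return normalize_title(a) == normalize_title(b)
# A lowercase first letter still matches, but a stray capital T does not.
first_letter_only = same_page("list of NHL teams", "List of NHL teams")
stray_capital = same_page("List of NHL Teams", "List of NHL teams")
```
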
00:01:57.180 --> 00:02:00.439
so you hit this wall. But Wikipedia doesn't just

00:02:00.439 --> 00:02:03.700
leave you stranded. It gives you a set of troubleshooting

00:02:03.700 --> 00:02:06.340
steps. It acts like a guide. Yeah. So first,

00:02:06.500 --> 00:02:08.560
it gives you the option to search for existing

00:02:08.560 --> 00:02:11.099
articles. Or look for pages that just happen

00:02:11.099 --> 00:02:13.539
to link to the title you typed. Which is pretty

00:02:13.539 --> 00:02:15.860
standard. Just trying to help you find a detour.

00:02:16.139 --> 00:02:18.659
But the second step is where it gets really interesting.

00:02:18.919 --> 00:02:21.240
It introduces this concept called the article

00:02:21.240 --> 00:02:24.400
wizard. Yes, the article wizard. And it says

00:02:24.400 --> 00:02:27.080
that to create new articles, a user must log

00:02:27.080 --> 00:02:30.599
in, create an account, and reach... auto-confirmed

00:02:30.599 --> 00:02:33.979
status. Which is a huge detail. We always think

00:02:33.979 --> 00:02:37.680
of Wikipedia as this completely open source utopia

00:02:37.680 --> 00:02:39.520
where anyone can just write whatever they want.

00:02:39.740 --> 00:02:41.500
Right. The free encyclopedia that anyone can

00:02:41.500 --> 00:02:44.419
edit. Exactly. But this auto-confirmed status

00:02:44.419 --> 00:02:46.919
requirement shows that there is actually a delicate

00:02:46.919 --> 00:02:50.080
balance between open contribution and fairly

00:02:50.080 --> 00:02:53.360
strict gatekeeping for quality control. You have

00:02:53.360 --> 00:02:55.280
to earn your right to create a whole new page.

00:02:55.639 --> 00:02:58.159
Wait, I don't follow entirely. So are all users

00:02:58.159 --> 00:03:00.000
forced to become auto-confirmed eventually?

00:03:00.360 --> 00:03:02.740
Actually, no. That is a common misconception.

00:03:03.159 --> 00:03:05.659
If you are someone who only ever fixes spelling

00:03:05.659 --> 00:03:08.560
mistakes or maybe adds a missing reference link

00:03:08.560 --> 00:03:10.560
here and there, you don't actually need to be

00:03:10.560 --> 00:03:12.580
auto-confirmed. Oh, really? So you can still

00:03:12.580 --> 00:03:15.580
edit without jumping through those hoops? Right.

00:03:15.680 --> 00:03:17.939
Though, obviously, many people will naturally

00:03:17.939 --> 00:03:20.439
hit that threshold if they are regular participants

00:03:20.439 --> 00:03:22.319
on the platform. Okay. I want to make sure that

00:03:22.319 --> 00:03:24.650
is clearly communicated. Because it is a big

00:03:24.650 --> 00:03:27.689
distinction. Editing a typo is fundamentally

00:03:27.689 --> 00:03:30.430
different to the system than minting a brand

00:03:30.430 --> 00:03:33.229
new article out of thin air. Exactly. The gatekeeping

00:03:33.229 --> 00:03:36.069
is specifically for creating new spaces, not

00:03:36.069 --> 00:03:38.289
maintaining existing ones. Okay, that makes sense.

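NOTE
Editor's sketch, not from the episode: the split described above (anyone may fix a typo, but creating a page needs a logged-in, auto-confirmed account) could be modeled roughly like this. The function names are hypothetical, and the default thresholds of 4 days and 10 edits are assumptions that the episode never states.

NOTE
```python
def is_autoconfirmed(account_age_days: int, edit_count: int, min_days: int = 4, min_edits: int = 10) -> bool:
    # Thresholds are assumed defaults, not figures taken from the episode.
    return account_age_days >= min_days and edit_count >= min_edits
def allowed(action: str, logged_in: bool, account_age_days: int = 0, edit_count: int = 0) -> bool:
    if action == "edit":
        # Maintaining an existing page needs no special status in this sketch.
        return True
    if action == "create":
        # Creating a new page needs a logged-in, auto-confirmed account.
        return logged_in and is_autoconfirmed(account_age_days, edit_count)
    raise ValueError(f"unknown action: {action}")
```
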
00:03:38.810 --> 00:03:40.990
So moving on to the third troubleshooting step.

00:03:41.189 --> 00:03:43.669
It talks about technical glitches. It says sometimes

00:03:43.669 --> 00:03:46.949
a page does exist, but there is a delay in the

00:03:46.949 --> 00:03:49.289
database. Right. The database might just be catching

00:03:49.289 --> 00:03:52.030
up. Wait, what causes a delay in updating the

00:03:52.030 --> 00:03:54.419
database for Wikipedia? Well, you have to think

00:03:54.419 --> 00:03:57.360
about the sheer scale of the site. It is handling

00:03:57.360 --> 00:04:00.240
millions of requests, so it heavily relies on

00:04:00.240 --> 00:04:03.740
caching, which basically means storing a snapshot

00:04:03.740 --> 00:04:06.419
of a page so it loads faster for the next person.

00:04:06.419 --> 00:04:09.259
Sometimes that snapshot gets a little stale. Oh,

00:04:09.259 --> 00:04:12.759
gotcha. And to fix that, the page tells the user

00:04:12.759 --> 00:04:15.240
to try the, and I love the name of this, the purge

00:04:15.240 --> 00:04:18.120
function. It sounds incredibly dramatic. It really

00:04:18.120 --> 00:04:20.540
does. So wait, what is a purge function exactly?

00:04:20.879 --> 00:04:23.040
It isn't completely clear from the text what

00:04:23.040 --> 00:04:25.939
exactly this gets you. Does it reset user input

00:04:25.939 --> 00:04:28.699
on the page if I start typing an article? Does

00:04:28.699 --> 00:04:30.639
it delete anything permanently out of Wikipedia

00:04:30.639 --> 00:04:33.220
memory entirely? No, no, nothing like that. It

00:04:33.220 --> 00:04:35.779
doesn't delete human knowledge or reset your

00:04:35.779 --> 00:04:38.120
drafts. When you trigger the purge function,

00:04:38.300 --> 00:04:40.079
you are literally just telling the Wikipedia

00:04:40.079 --> 00:04:43.300
servers to throw away that stale snapshot we

00:04:43.300 --> 00:04:45.120
just talked about. Oh, so it just forces the

00:04:45.120 --> 00:04:47.399
page to refresh from the live database? Exactly.

00:04:47.399 --> 00:04:50.540
It clears the cache for that specific page. It

00:04:50.540 --> 00:04:52.620
is basically the IT equivalent of turning it

00:04:52.620 --> 00:04:55.199
off and on again, but at a server level. That is

00:04:55.199 --> 00:04:57.399
so much less terrifying than purge function makes

00:04:57.399 --> 00:04:59.879
it sound. I know, it sounds like a sci-fi

00:04:59.879 --> 00:05:02.620
self-destruct sequence. Right. And the final troubleshooting

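NOTE
Editor's sketch, not from the episode: the caching and purge behavior described above can be mimicked with a toy snapshot cache. PageCache and its methods are hypothetical names, and a plain dict stands in for the live database. The point is only that purging discards a stale snapshot without touching the stored article.

NOTE
```python
class PageCache:
    """Hypothetical snapshot cache sitting in front of a live page store."""
    def __init__(self, database: dict) -> None:
        self.database = database
        self.snapshots = {}
    def view(self, title: str) -> str:
        # Serve the stored snapshot if there is one; otherwise read the live data.
        if title not in self.snapshots:
            self.snapshots[title] = self.database.get(title, "(no article)")
        return self.snapshots[title]
    def purge(self, title: str) -> None:
        # Discard only the snapshot; the database itself is left alone.
        self.snapshots.pop(title, None)
database = {}
cache = PageCache(database)
first = cache.view("List of NHL teams")   # snapshot of the missing page
database["List of NHL teams"] = "Article text"
stale = cache.view("List of NHL teams")   # old snapshot is still served
cache.purge("List of NHL teams")
fresh = cache.view("List of NHL teams")   # rebuilt from the live database
```
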
00:05:02.620 --> 00:05:05.839
step is the possibility of deletion. It prompts

00:05:05.839 --> 00:05:08.220
the lost user to check the deletion log just

00:05:08.220 --> 00:05:10.980
in case the article used to exist but got removed

00:05:10.980 --> 00:05:13.240
by the community. Which, again, points to that

00:05:13.240 --> 00:05:15.519
immense invisible infrastructure. Everything

00:05:15.519 --> 00:05:18.980
leaves a paper trail. So true. Now, I want to

00:05:18.980 --> 00:05:21.800
pivot from the text itself and look at the literal

00:05:21.800 --> 00:05:24.439
margins of our source material. The user interface

00:05:24.439 --> 00:05:26.459
stuff. Yeah, the sidebar and appearance settings.

00:05:26.740 --> 00:05:29.180
Yeah. It is so easy to ignore this part of a

00:05:29.180 --> 00:05:31.649
website. But there are some fascinating options

00:05:31.649 --> 00:05:34.089
here. It walks you through options for text

00:05:34.370 --> 00:05:37.310
size: small, standard, large. Highly focused

00:05:37.310 --> 00:05:39.790
on accessibility. Exactly. And width is there

00:05:39.790 --> 00:05:43.269
too, standard or wide. Plus this color beta setting

00:05:43.269 --> 00:05:46.009
where you can choose automatic, light, or dark

00:05:46.009 --> 00:05:48.829
mode. It is entirely built around utility. How

00:05:48.829 --> 00:05:50.709
can you consume this information most comfortably?

00:05:51.209 --> 00:05:53.290
But then I noticed this one highly specific,

00:05:53.449 --> 00:05:56.230
incredibly quirky detail hidden right in the

00:05:56.230 --> 00:05:58.670
middle of this utilitarian appearance menu. Oh,

00:05:58.709 --> 00:06:01.949
the birthday mode. Yes. It literally says birthday

00:06:01.949 --> 00:06:06.129
mode and in parentheses, baby globe. And you

00:06:06.129 --> 00:06:09.089
can toggle it to enabled. I love that detail

00:06:09.089 --> 00:06:13.100
so much. What even is a baby globe? Well, Wikipedia's

00:06:13.100 --> 00:06:15.459
logo is the globe made of puzzle pieces, right?

00:06:15.560 --> 00:06:18.160
So for special anniversaries or milestones, they

00:06:18.160 --> 00:06:20.399
have this whimsical little version of the logo.

00:06:20.600 --> 00:06:22.399
That is amazing. And I think it is important

00:06:22.399 --> 00:06:25.240
because these small, whimsical design choices

00:06:25.240 --> 00:06:28.399
really humanize what is otherwise a massive,

00:06:28.459 --> 00:06:30.980
almost clinical database. It reminds you that

00:06:30.980 --> 00:06:33.790
actual humans built this. Humans with a sense

00:06:33.790 --> 00:06:35.910
of humor. Exactly. And right below that quirky

00:06:35.910 --> 00:06:37.769
setting, you see all the standard background

00:06:37.769 --> 00:06:40.290
actions available on the page. Things like getting

00:06:40.290 --> 00:06:43.589
a shortened URL, downloading a QR code, or finding

00:06:43.589 --> 00:06:46.129
a printable version. Again, very utility focused.

00:06:46.589 --> 00:06:48.589
Right. The platform isn't just a place to read.

00:06:48.730 --> 00:06:50.790
It is designed to let you extract the information,

00:06:51.129 --> 00:06:53.610
share it, print it out, and take it into the

00:06:53.610 --> 00:06:55.649
real world. Which perfectly transitions into

00:06:55.649 --> 00:06:58.149
the next big thing on this missing page. Yes,

00:06:58.149 --> 00:07:00.069
the banner. So at the very top of this otherwise

00:07:00.069 --> 00:07:04.170
empty void of a page, there's this massive, prominent

00:07:04.170 --> 00:07:07.910
banner for a global campaign. It says Wiki Loves

00:07:07.910 --> 00:07:10.769
Ramadan 2026. Now, this is a perfect example

00:07:10.769 --> 00:07:13.569
of how dynamic the site is. Even when you are

00:07:13.569 --> 00:07:17.970
staring at a 404 error for NHL teams, the platform

00:07:17.970 --> 00:07:20.930
is actively campaigning. Right. It never stops

00:07:20.930 --> 00:07:22.990
working. The stated goal of this campaign is

00:07:22.990 --> 00:07:25.250
really inspiring. It is designed to tell the

00:07:25.250 --> 00:07:27.529
world about Ramadan traditions and specifically

00:07:27.529 --> 00:07:30.529
to bridge massive knowledge gaps about Islamic

00:07:30.529 --> 00:07:32.810
history and culture. Which is incredible. But

00:07:32.810 --> 00:07:34.750
wait, thinking back to our earlier conversation,

00:07:35.129 --> 00:07:37.410
if participants aren't yet auto-confirmed, can they

00:07:37.410 --> 00:07:40.410
still create a page for Wiki Loves Ramadan 2026?

00:07:40.750 --> 00:07:42.889
It doesn't clarify this in the sources. Well,

00:07:42.970 --> 00:07:45.310
our source text doesn't explicitly outline the

00:07:45.310 --> 00:07:47.430
permissions for campaign pages, but honestly,

00:07:47.569 --> 00:07:49.910
getting bogged down in those specific technical

00:07:49.910 --> 00:07:52.230
permissions misses the forest for the trees here.

00:07:52.370 --> 00:07:54.509
Fair point. Yeah. Because the larger takeaway

00:07:54.509 --> 00:07:57.269
is that Wikipedia recognizes its own blind spots.

00:07:57.610 --> 00:08:00.110
The community knows that certain topics like

00:08:00.110 --> 00:08:02.930
Western sports or pop culture are heavily documented,

00:08:03.089 --> 00:08:05.689
while rich cultural histories from other parts

00:08:05.689 --> 00:08:07.829
of the world might be missing. So they use every

00:08:07.829 --> 00:08:10.889
piece of real estate, even error pages, to recruit

00:08:10.889 --> 00:08:13.430
people. To help fill those cultural knowledge

00:08:13.430 --> 00:08:16.930
gaps. Exactly. It is a constant, active effort

00:08:16.930 --> 00:08:20.250
to make the encyclopedia truly global. That is

00:08:20.250 --> 00:08:22.709
brilliant. Yeah. And speaking of a global ecosystem,

00:08:22.889 --> 00:08:26.009
we have to talk about the In Other Projects section

00:08:26.009 --> 00:08:28.910
on the sidebar. Oh, the sister sites. Yes. I

00:08:28.910 --> 00:08:30.470
am just going to read through this list because

00:08:30.470 --> 00:08:33.330
the sheer breadth of it is staggering. Go for

00:08:33.330 --> 00:08:36.350
it. Okay. So from our source, we have Wiktionary,

00:08:36.429 --> 00:08:38.610
which is the dictionary, Wikibooks for textbooks,

00:08:39.509 --> 00:08:42.809
Wikiquote for quotations. Wikisource, the library.

00:08:43.490 --> 00:08:45.950
Wikiversity for learning resources. It just keeps

00:08:45.950 --> 00:08:49.149
going. It really does. Commons for media. Wikivoyage

00:08:49.149 --> 00:08:51.409
for travel guides. Wikinews as a news source.

00:08:51.929 --> 00:08:54.850
Wikidata as a linked database. And Wikispecies,

00:08:54.950 --> 00:08:57.750
which is a species directory. It is almost overwhelming

00:08:57.750 --> 00:08:59.710
when you list them all out like that. It really

00:08:59.710 --> 00:09:02.149
is. I had no idea half of these even existed.

00:09:02.600 --> 00:09:04.419
And I think that is the bigger picture here.

00:09:04.519 --> 00:09:07.320
When you understand this massive list of interconnected

00:09:07.320 --> 00:09:09.980
databases covering literally everything from

00:09:09.980 --> 00:09:13.360
travel itineraries to the taxonomy of biological

00:09:13.360 --> 00:09:17.159
species, you realize something profound. What's

00:09:17.159 --> 00:09:20.600
that? The absence of a list of NHL teams is just

00:09:20.600 --> 00:09:24.620
a tiny, insignificant drop in an absolute ocean

00:09:24.620 --> 00:09:27.399
of knowledge. Wow. Yeah. It really puts a missing

00:09:27.399 --> 00:09:29.700
article into perspective. The ecosystem is so

00:09:29.700 --> 00:09:32.419
much larger than just Wikipedia itself. All these

00:09:32.419 --> 00:09:34.460
projects feed into each other, creating this

00:09:34.460 --> 00:09:37.100
web of free information that we all just completely

00:09:37.100 --> 00:09:40.259
take for granted. We absolutely do. So just to

00:09:40.259 --> 00:09:43.000
kind of recap our journey today, what started

00:09:43.000 --> 00:09:45.980
out as a failed, basic search for a list of hockey

00:09:45.980 --> 00:09:48.759
teams ended up turning into this masterclass

00:09:48.759 --> 00:09:51.320
in digital infrastructure. A total rabbit hole.

00:09:51.460 --> 00:09:53.940
Totally. We uncovered the hidden rules of case

00:09:53.940 --> 00:09:55.919
sensitivity. We learned about the gatekeeping

00:09:55.919 --> 00:09:58.159
of the auto-confirmed status. And we finally

00:09:58.159 --> 00:10:00.320
figured out what a purge function actually does.

00:10:00.830 --> 00:10:02.909
And we got to appreciate the whimsical Baby Globe

00:10:02.909 --> 00:10:05.769
setting. Can't forget the Baby Globe. Plus, we

00:10:05.769 --> 00:10:08.330
saw how the platform actively campaigns to bridge

00:10:08.330 --> 00:10:11.129
global cultural gaps with things like Wiki Loves

00:10:11.129 --> 00:10:14.049
Ramadan while connecting to a massive web of

00:10:14.049 --> 00:10:17.210
sister projects. It is just amazing what keeps

00:10:17.210 --> 00:10:19.470
the Internet's ultimate library running. It really

00:10:19.470 --> 00:10:21.409
is an incredible feat of human collaboration.

00:10:21.490 --> 00:10:24.289
It is. But as we wrap up, I know you have a final

00:10:24.289 --> 00:10:26.110
thought you want to leave everyone with. I do.

00:10:26.269 --> 00:10:28.470
And it is a lingering question I really want

00:10:28.470 --> 00:10:30.649
you to ponder on your own after we finish today.

00:10:30.830 --> 00:10:32.610
Lay it on us. Think about all these technical

00:10:32.610 --> 00:10:35.309
hurdles we just discussed. If Wikipedia requires

00:10:35.309 --> 00:10:38.429
users to be auto-confirmed to easily create

00:10:38.429 --> 00:10:41.230
an article, and if it hides its complex infrastructure

00:10:41.230 --> 00:10:43.970
behind intimidating terms like purge functions

00:10:43.970 --> 00:10:47.730
and strict case sensitivity rules, how much vital

00:10:47.730 --> 00:10:49.990
human history and knowledge is currently trapped?

00:10:50.269 --> 00:10:52.460
Trapped where? Trapped in the minds of people

00:10:52.460 --> 00:10:54.419
who simply don't know how to navigate the article

00:10:54.419 --> 00:10:57.039
wizard. Think about the older generations or

00:10:57.039 --> 00:11:00.039
people from less tech-heavy backgrounds. How

00:11:00.039 --> 00:11:02.379
many invisible barriers stand between a crucial

00:11:02.379 --> 00:11:05.700
fact and its rightful place in the global encyclopedia?

00:11:06.000 --> 00:11:08.639
That is a really profound thought. How much are

00:11:08.639 --> 00:11:10.860
we missing just because the interface is a barrier?

00:11:11.179 --> 00:11:13.379
Exactly. It makes you wonder what else is out

00:11:13.379 --> 00:11:15.940
there just waiting for the right person to jump

00:11:15.940 --> 00:11:18.779
through the right hoops. Wow. Well, that gives

00:11:18.779 --> 00:11:21.190
you plenty to think about. Thanks for joining

00:11:21.190 --> 00:11:23.470
us on today's deep dive, and we will catch you

00:11:23.470 --> 00:11:24.029
on the next one.
