WEBVTT

00:00:00.000 --> 00:00:02.600
Welcome to today's custom-tailored deep dive.

00:00:02.839 --> 00:00:06.360
I'm your host, and I am just thrilled you are

00:00:06.360 --> 00:00:08.339
joining us today. And I'm your resident expert

00:00:08.339 --> 00:00:11.419
for this deep dive. It's really great to be here.

00:00:11.640 --> 00:00:15.039
So today we are going to unpack this sprawling

00:00:15.039 --> 00:00:18.039
multi-billion-dollar industry that is honestly

00:00:18.039 --> 00:00:20.640
completely invisible to most people. It really

00:00:20.640 --> 00:00:24.980
is. But this unseen machine fundamentally shapes

00:00:24.980 --> 00:00:28.210
the news you read, the products you buy, and

00:00:28.210 --> 00:00:30.170
even the marketing messages you are targeted

00:00:30.170 --> 00:00:32.570
with every single day. We are talking about media

00:00:32.570 --> 00:00:34.670
monitoring services. Right. It is essentially

00:00:34.670 --> 00:00:37.229
the hidden engine of modern information logistics.

00:00:37.509 --> 00:00:40.649
And when you pull back the curtain on this industry,

00:00:40.810 --> 00:00:42.649
you aren't just looking at how companies track

00:00:42.649 --> 00:00:44.469
the news. You are looking at the exact history

00:00:44.469 --> 00:00:47.490
of how human beings learn to isolate, organize,

00:00:47.750 --> 00:00:50.250
and eventually monetize information itself. Which

00:00:50.250 --> 00:00:53.009
is just a wild concept. And the mission for this

00:00:53.009 --> 00:00:55.909
deep dive is to trace that incredible journey.

00:00:56.649 --> 00:00:58.990
We are going to walk you through how this industry

00:00:58.990 --> 00:01:01.729
evolved from people using literal scissors and

00:01:01.729 --> 00:01:04.890
paste in the 1800s all the way to the automated

00:01:04.890 --> 00:01:07.909
web crawling spiders that dictate corporate strategy

00:01:07.909 --> 00:01:11.090
right now. It's quite the leap. It is. And our

00:01:11.090 --> 00:01:12.930
main source for this is a really comprehensive

00:01:12.930 --> 00:01:15.689
Wikipedia article that details the whole history,

00:01:15.849 --> 00:01:18.409
the evolution, and the legal hurdles of this

00:01:18.409 --> 00:01:20.900
industry. And more importantly, we want to show

00:01:20.900 --> 00:01:23.760
you exactly why this unseen ecosystem matters

00:01:23.760 --> 00:01:26.280
so much to your daily life. It is a fascinating

00:01:26.280 --> 00:01:28.700
trajectory, really. And to truly understand where

00:01:28.700 --> 00:01:31.719
we are today with, you know, automated bots and

00:01:31.719 --> 00:01:34.500
real-time data dashboards, we actually have to

00:01:34.500 --> 00:01:36.780
start with something deeply, inherently human.

00:01:37.000 --> 00:01:40.200
Vanity. Precisely. Pure vanity. I love this origin

00:01:40.200 --> 00:01:43.140
story because it is just so relatable. Oh, yeah.

00:01:43.219 --> 00:01:44.799
Think about your own life before the Internet,

00:01:44.920 --> 00:01:47.099
before you could just type a name into a search

00:01:47.099 --> 00:01:49.459
bar or, you know, set up a notification on your

00:01:49.459 --> 00:01:51.500
phone. How did you know if people were talking

00:01:51.500 --> 00:01:54.000
about you? You really didn't. Right. If you did

00:01:54.000 --> 00:01:56.340
something noteworthy or even something controversial,

00:01:56.599 --> 00:01:58.859
how did you actually find out what the press

00:01:58.859 --> 00:02:01.620
was saying? It was a massive logistical problem.

00:02:01.799 --> 00:02:04.200
I mean, you couldn't possibly buy and read every

00:02:04.200 --> 00:02:07.040
single newspaper, magazine and journal published

00:02:07.040 --> 00:02:10.139
every day. No way. The sheer volume of print

00:02:10.139 --> 00:02:12.419
media, even back in the 19th century, was just

00:02:12.419 --> 00:02:15.120
overwhelming for any one person to track. Which

00:02:15.120 --> 00:02:17.240
leads to this incredible anecdote from Paris.

00:02:17.340 --> 00:02:20.919
This is in 1879. Right. Alfred Cherie. Yes. He

00:02:20.919 --> 00:02:24.139
started an agency called L'Argus de la Presse.

00:02:24.219 --> 00:02:27.599
And he didn't start it for some grand journalistic

00:02:27.599 --> 00:02:30.229
purpose. None at all. He started it specifically

00:02:30.229 --> 00:02:32.830
because Parisian theatrical actors wanted to

00:02:32.830 --> 00:02:35.009
read reviews of their own performances. Naturally.

00:02:35.169 --> 00:02:37.830
But these actors didn't want to shell out the

00:02:37.830 --> 00:02:41.009
money to buy an entire bulky newspaper just to

00:02:41.009 --> 00:02:43.770
read one tiny blurb about themselves. It is the

00:02:43.770 --> 00:02:46.430
ultimate human motivation, wanting to know what

00:02:46.430 --> 00:02:48.590
others think of you, but without paying full

00:02:48.590 --> 00:02:51.189
price for the privilege. Exactly. But while those

00:02:51.189 --> 00:02:53.490
Parisian actors make for a wonderful snapshot

00:02:53.490 --> 00:02:57.009
of 1879, the actual structural timeline of this

00:02:57.009 --> 00:02:59.620
industry stretches back a bit further. Oh, really?

00:02:59.819 --> 00:03:02.080
Yeah. The first official press clipping agency

00:03:02.080 --> 00:03:05.719
was actually established in London in 1852. Wow.

00:03:05.800 --> 00:03:08.500
It was started by a man named Henry Romeike, who

00:03:08.500 --> 00:03:11.599
partnered with a news dealer named Curtice. So

00:03:11.599 --> 00:03:13.719
over a century and a half ago, who were they

00:03:13.719 --> 00:03:16.159
serving in London? Was it also just actors? Well,

00:03:16.240 --> 00:03:18.780
it started with actors, certainly. But it very

00:03:18.780 --> 00:03:21.860
quickly expanded to business tycoons, politicians,

00:03:22.419 --> 00:03:25.620
wealthy socialites. The elites. Right. These

00:03:25.620 --> 00:03:27.900
were people who were eager to see their names

00:03:27.900 --> 00:03:30.280
in print. They were essentially paying for the

00:03:30.280 --> 00:03:33.240
19th-century equivalent of an ego surf. An analog

00:03:33.240 --> 00:03:35.620
ego surf. I love that. They wanted the validation

00:03:35.620 --> 00:03:38.120
of being mentioned in the societal pages, but

00:03:38.120 --> 00:03:39.960
they definitely didn't want to do the legwork

00:03:39.960 --> 00:03:42.840
of finding those mentions. Okay, but wait. If

00:03:42.840 --> 00:03:44.939
they weren't using search engines, how did these

00:03:44.939 --> 00:03:47.639
early clipping bureaus actually pull this off?

00:03:48.560 --> 00:03:50.860
The workflow to achieve this manually must have

00:03:50.860 --> 00:03:53.340
been incredibly tedious. Oh, it was a massive

00:03:53.340 --> 00:03:55.400
manual undertaking, and it functioned essentially

00:03:55.400 --> 00:03:58.919
as a highly gendered assembly line. How so? First,

00:03:58.939 --> 00:04:01.020
these agencies employed women to sit in large

00:04:01.020 --> 00:04:03.479
rooms and manually scan through massive stacks

00:04:03.479 --> 00:04:06.419
of daily periodicals. Just reading all day? Yes.

00:04:06.520 --> 00:04:10.340
Their entire job was to read walls of tiny text,

00:04:10.580 --> 00:04:13.599
specifically looking for names, companies, or

00:04:13.599 --> 00:04:16.519
terms requested by their clients. Just imagine

00:04:16.519 --> 00:04:19.370
the sheer eye strain. I know. The concentration

00:04:19.370 --> 00:04:22.529
required to do that for 8 to 10 hours a day.

00:04:23.350 --> 00:04:26.310
Scanning thousands of pages of newsprint for

00:04:26.310 --> 00:04:29.050
a passing mention of a specific socialite. And

00:04:29.050 --> 00:04:31.790
once the women found a mention and marked the

00:04:31.790 --> 00:04:34.490
page, those marked periodicals were passed along

00:04:34.490 --> 00:04:37.009
to a completely different department. Let me

00:04:37.009 --> 00:04:40.319
guess. Staffed entirely by men? Of course. And

00:04:40.319 --> 00:04:42.639
these men would literally take scissors, cut the

00:04:42.639 --> 00:04:44.699
articles out, and paste them onto dated slips

00:04:44.699 --> 00:04:46.860
of paper. So it was quite literally a clipping

00:04:46.860 --> 00:04:50.939
service. Exactly. Scissors, paste, and paper. Then

00:04:50.939 --> 00:04:53.139
those pasted slips would go back to the women,

00:04:53.139 --> 00:04:55.920
who would sort them by client and physically

00:04:55.920 --> 00:04:58.579
mail these clippings out. But surely an industry

00:04:58.579 --> 00:05:01.089
can't sustain itself purely on the vanity of

00:05:01.089 --> 00:05:03.670
socialites and actors forever. At what point

00:05:03.670 --> 00:05:06.810
does this shift from a novelty service to a fundamental

00:05:06.810 --> 00:05:10.050
business tool? That pivot happens firmly by the

00:05:10.050 --> 00:05:13.329
1930s. The bulk of clipping subscriptions completely

00:05:13.329 --> 00:05:16.069
shifted away from wealthy individuals and toward

00:05:16.069 --> 00:05:19.170
big business. It transitioned from a vanity project

00:05:19.170 --> 00:05:22.610
to a corporate necessity. Makes sense. Government

00:05:22.610 --> 00:05:25.110
agencies started subscribing to track public

00:05:25.110 --> 00:05:27.889
sentiment. And critically, companies started

00:05:27.889 --> 00:05:30.269
using the services to monitor their competitors.

00:05:30.610 --> 00:05:32.410
So instead of asking, what are they saying about

00:05:32.410 --> 00:05:35.170
me? The question became, what are they saying

00:05:35.170 --> 00:05:38.699
about my rival's new product? Precisely. It was

00:05:38.699 --> 00:05:41.220
the birth of modern competitive intelligence.

00:05:41.560 --> 00:05:44.329
Wow. Businesses realized that if they could track

00:05:44.329 --> 00:05:47.230
every mention of a competitor's pricing or a

00:05:47.230 --> 00:05:49.970
competitor's regional expansion, they could adjust

00:05:49.970 --> 00:05:52.970
their own strategies accordingly. That is a huge

00:05:52.970 --> 00:05:55.089
advantage. And it became a massive enterprise

00:05:55.089 --> 00:05:58.569
in itself. In the United States, by 1932, just

00:05:58.569 --> 00:06:01.250
two companies controlled the market. Only two?

00:06:01.449 --> 00:06:04.629
Yeah. The Romeike company and Luce's Press Clipping

00:06:04.629 --> 00:06:07.170
Bureau. Together, they controlled roughly 80%

00:06:07.170 --> 00:06:10.209
of the entire clipping market. 80%? Yeah.

00:06:10.250 --> 00:06:14.019
They practically monopolized. It really did.

00:06:14.199 --> 00:06:16.839
But here is where the story hits a fascinating

00:06:16.839 --> 00:06:21.240
wall. Up until the mid-20th century, mass media

00:06:21.240 --> 00:06:25.079
was almost solely print media. Newspapers, trade

00:06:25.079 --> 00:06:27.500
journals, magazines. Right. You can easily cut

00:06:27.500 --> 00:06:29.399
an article out of a newspaper with scissors.

00:06:30.060 --> 00:06:32.980
But as telegraphy expands and then eventually

00:06:32.980 --> 00:06:34.759
gives way to radio and television broadcasting,

00:06:36.040 --> 00:06:38.399
how in the world did this scissors-and-paste

00:06:38.399 --> 00:06:41.519
industry adapt? It was tough. I mean, you cannot

00:06:41.519 --> 00:06:43.879
cut a segment out of a live radio broadcast.

00:06:44.160 --> 00:06:46.600
You can't. And it was a massive vulnerability

00:06:46.600 --> 00:06:49.639
for the industry. For a significant period, these

00:06:49.639 --> 00:06:52.360
monitoring services were severely limited because

00:06:52.360 --> 00:06:54.540
they simply could not offer comprehensive tracking

00:06:54.540 --> 00:06:56.759
of broadcast media. They just missed it entirely.

00:06:57.120 --> 00:06:59.910
Pretty much. If a company was criticized on a

00:06:59.910 --> 00:07:02.490
national radio program, the clipping bureau couldn't

00:07:02.490 --> 00:07:04.370
mail them a transcript the next day. The technology

00:07:04.370 --> 00:07:06.509
just wasn't there. So they were essentially flying

00:07:06.509 --> 00:07:08.709
blind when it came to the fastest growing mediums

00:07:08.709 --> 00:07:12.129
in the world. Until the 1950s and 1960s. What

00:07:12.129 --> 00:07:14.629
changed? That is when we see the commercial introduction

00:07:14.629 --> 00:07:17.509
of audio and videotape recording systems. Ah,

00:07:17.569 --> 00:07:20.649
magnetic tape. Exactly. This was the technological

00:07:20.649 --> 00:07:24.029
leap the industry desperately needed. It finally

00:07:24.029 --> 00:07:26.529
allowed the broadcast airwaves to be captured,

00:07:26.689 --> 00:07:36.529
recorded, Okay. So now they're monitoring what

00:07:36.529 --> 00:07:38.430
is written in print and what is spoken on the

00:07:38.430 --> 00:07:41.589
airwaves. But as revolutionary as magnetic tape

00:07:41.589 --> 00:07:44.589
was, nothing really compares to the shift that

00:07:44.589 --> 00:07:47.839
happens next. The digital age. We have to talk

00:07:47.839 --> 00:07:50.399
about the internet boom of the 1990s. Yeah. How

00:07:50.399 --> 00:07:53.899
did these legacy paper-based companies survive

00:07:53.899 --> 00:07:56.000
the transition to the web? Did they just go out

00:07:56.000 --> 00:07:58.660
of business? Some certainly did. If you couldn't

00:07:58.660 --> 00:08:01.399
adapt to digital, you died. But others adapted

00:08:01.399 --> 00:08:03.639
brilliantly. Pretty cool. A perfect example from

00:08:03.639 --> 00:08:05.639
the source is a company called the Universal

00:08:05.639 --> 00:08:08.459
Press Clipping Bureau. They were founded all

00:08:08.459 --> 00:08:11.360
the way back in 1908 in Omaha, Nebraska. For

00:08:11.360 --> 00:08:13.680
decades, their entire business model was clipping

00:08:13.680 --> 00:08:17.339
physical paper. Then the 1990s hit, the internet emerges, and

00:08:17.339 --> 00:08:20.360
they make a massive pivot. They completely rebranded.

00:08:20.360 --> 00:08:21.899
What did they change their name to? Universal

00:08:21.899 --> 00:08:24.980
Information Services. And that was to reflect

00:08:24.980 --> 00:08:27.560
their expansion into digital technology. That

00:08:27.560 --> 00:08:30.579
is a profound shift in terminology, going from

00:08:30.579 --> 00:08:32.980
clipping, which is a tangible physical action

00:08:32.980 --> 00:08:35.659
with scissors, to information services, which

00:08:35.659 --> 00:08:38.139
is entirely abstract. It perfectly reflects the

00:08:38.139 --> 00:08:40.740
digitization of knowledge itself. And the timeline

00:08:40.740 --> 00:08:45.309
moved incredibly fast. How fast? By 1998, platforms

00:08:45.309 --> 00:08:47.909
like the now-defunct WebClipping website were

00:08:47.909 --> 00:08:50.529
actively monitoring internet-based news media.

00:08:51.210 --> 00:08:54.629
Fast forward to 2012 and the research firm Gartner

00:08:54.629 --> 00:08:57.230
estimated there were more than 250 different

00:08:57.230 --> 00:09:00.210
social media monitoring vendors operating globally.

00:09:00.429 --> 00:09:03.570
That is wild. From two guys in London in 1852

00:09:03.570 --> 00:09:07.090
to a sprawling digital infrastructure with hundreds

00:09:07.090 --> 00:09:09.570
of tech vendors. It's exponential. But beyond

00:09:09.570 --> 00:09:11.330
just the speed and the technology, there's a

00:09:11.330 --> 00:09:13.389
much deeper conceptual shift here, isn't there?

00:09:13.470 --> 00:09:15.710
There is. And it is arguably the biggest aha

00:09:15.710 --> 00:09:18.049
moment of this entire evolution. Tell us about

00:09:18.049 --> 00:09:20.230
that. Well, the transition wasn't just about

00:09:20.230 --> 00:09:22.889
faster delivery or better technology. It was a

00:09:22.889 --> 00:09:25.330
fundamental psychological shift in how human

00:09:25.330 --> 00:09:27.909
beings interact with data. Walk us through that.

00:09:28.070 --> 00:09:31.090
How did a clipping service change human psychology?

00:09:31.580 --> 00:09:33.759
Think about reading a newspaper before clipping

00:09:33.759 --> 00:09:36.500
services existed. An article was permanently

00:09:36.500 --> 00:09:39.100
bound to its context. Right. You sat down, you

00:09:39.100 --> 00:09:41.440
opened the paper, and you read the article alongside

00:09:41.440 --> 00:09:43.460
the advertisements, the neighboring stories,

00:09:43.620 --> 00:09:46.639
the editorial page. The information was part

00:09:46.639 --> 00:09:49.539
of a larger cohesive package. You took in the

00:09:49.539 --> 00:09:52.139
whole broadsheet. Exactly. But clipping services

00:09:52.139 --> 00:09:56.059
introduced the concept of the isolated snippet.

00:09:56.059 --> 00:09:58.759
Because they literally cut the information out

00:09:58.759 --> 00:10:02.029
of its context. Yes. The foundational idea that a

00:10:02.029 --> 00:10:04.549
specific piece of information could be extracted,

00:10:04.750 --> 00:10:08.210
isolated, and consumed entirely on its own, that

00:10:08.210 --> 00:10:10.269
was revolutionary. I never thought about it that

00:10:10.269 --> 00:10:12.990
way. And that specific conceptual leap directly

00:10:12.990 --> 00:10:15.830
influenced the interfaces of early digital news

00:10:15.830 --> 00:10:19.090
databases like LexisNexis. Oh, wow. It laid the

00:10:19.090 --> 00:10:21.289
psychological groundwork for the keyword searches

00:10:21.289 --> 00:10:23.629
you and I use every single day. We no longer

00:10:23.629 --> 00:10:25.950
search for publications. We search for isolated

00:10:25.950 --> 00:10:28.529
pieces of data. So the 19th-century actors who

00:10:28.529 --> 00:10:30.049
just wanted to read their own theater review

00:10:30.120 --> 00:10:33.279
were inadvertently pioneering the way we Google

00:10:33.279 --> 00:10:36.379
things today. Pretty much. They trained us to

00:10:36.379 --> 00:10:39.840
consume information in isolated snippets. It

00:10:39.840 --> 00:10:42.299
is the genesis of modern search behavior. The

00:10:42.299 --> 00:10:45.000
clipping bureau was the analog predecessor to

00:10:45.000 --> 00:10:47.059
the search engine. Let's bring this directly

00:10:47.059 --> 00:10:49.399
into the present day and directly to you, our

00:10:49.399 --> 00:10:52.100
listener. We have covered the history and the

00:10:52.100 --> 00:10:54.940
shift in how we consume data. But what does this

00:10:54.940 --> 00:10:57.120
invisible machine actually look like right now?

00:10:57.279 --> 00:10:59.740
Why should you care about media monitoring today?

00:11:00.090 --> 00:11:02.529
Because it runs the modern business and political

00:11:02.529 --> 00:11:05.649
world. Oh, absolutely. The industry has evolved

00:11:05.649 --> 00:11:08.529
so far past physical clippings that it is now

00:11:08.529 --> 00:11:11.129
referred to as media intelligence or information

00:11:11.129 --> 00:11:14.370
logistics. Today, if you look at online tools

00:11:14.370 --> 00:11:17.129
you might use, like Google Alerts or massive

00:11:17.129 --> 00:11:20.029
corporate platforms like Meltwater, Cision, Medianet,

00:11:20.090 --> 00:11:22.409
and Muck Rack, they're all the direct descendants

00:11:22.409 --> 00:11:24.850
of that Parisian clipping bureau. But instead

00:11:24.850 --> 00:11:26.970
of employing rooms full of people to read every

00:11:26.970 --> 00:11:30.129
page, they utilize automated software. Right.

00:11:30.269 --> 00:11:33.370
We are talking about web crawlers, bots or spiders

00:11:33.370 --> 00:11:36.110
that constantly roam the Internet. Just continuously

00:11:36.110 --> 00:11:39.230
scanning. Yes. They are automatically monitoring

00:11:39.230 --> 00:11:43.090
the content of free online news sources, newspapers,

00:11:43.509 --> 00:11:46.470
magazines, trade journals, news syndication services.

00:11:46.669 --> 00:11:49.450
And they are doing this 24 hours a day, seven

00:11:49.450 --> 00:11:52.250
days a week. And the applications for this constant

00:11:52.250 --> 00:11:55.289
surveillance are endless. It is no longer just

00:11:55.289 --> 00:11:59.200
actors checking reviews. Not at all. Today, every

00:11:59.200 --> 00:12:01.620
single organization that relies on public relations

00:12:01.620 --> 00:12:04.600
uses news monitoring. They use it to track their

00:12:04.600 --> 00:12:06.740
own corporate publicity, obviously. But they

00:12:06.740 --> 00:12:09.440
also use it to monitor new legislation that might

00:12:09.440 --> 00:12:11.600
affect their supply chains or to track emerging

00:12:11.600 --> 00:12:14.639
trends in their specific industry. It's also

00:12:14.639 --> 00:12:18.139
a massive tool for verification. PR and marketing

00:12:18.139 --> 00:12:20.360
departments use these platforms to verify that

00:12:20.360 --> 00:12:22.500
their messaging is actually landing with their

00:12:22.500 --> 00:12:24.919
target audience. If a company launches a new

00:12:24.919 --> 00:12:26.639
product, they don't just hope people talk about

00:12:26.639 --> 00:12:28.860
it. They have bots quantifying exactly how many

00:12:28.860 --> 00:12:31.419
times it was mentioned, in what regions, and

00:12:31.419 --> 00:12:34.039
with what sentiment. So they know instantly if

00:12:34.039 --> 00:12:36.769
a campaign is working or failing? Exactly. And

00:12:36.769 --> 00:12:39.549
it extends far beyond the corporate world. City,

00:12:39.830 --> 00:12:42.730
state and federal government agencies are massive

00:12:42.730 --> 00:12:45.610
consumers of these monitoring services. Oh, to

00:12:45.610 --> 00:12:48.409
keep tabs on public opinion. That and they need

00:12:48.409 --> 00:12:50.649
to stay informed about regions and municipalities

00:12:50.649 --> 00:12:53.809
they couldn't possibly monitor manually. They

00:12:53.809 --> 00:12:55.990
use these services to verify that the public

00:12:55.990 --> 00:12:57.870
safety information that they are putting out

00:12:57.870 --> 00:13:00.490
is accurate, accessible and actually reaching

00:13:00.490 --> 00:13:02.870
the communities that need it. And the way this

00:13:02.870 --> 00:13:04.990
data is delivered to the client has completely

00:13:05.100 --> 00:13:08.019
transformed. It used to be overnight mail with

00:13:08.019 --> 00:13:11.340
physical slips of paper. Now it's real-time digital

00:13:11.340 --> 00:13:14.620
dashboards. Yes, the focus is entirely on auto

00:13:14.620 --> 00:13:17.500
analysis. The data isn't just dumped on a client's

00:13:17.500 --> 00:13:20.220
desk. It is indexed into highly searchable databases

00:13:20.220 --> 00:13:22.960
where it can be viewed, cross-referenced, and

00:13:22.960 --> 00:13:25.440
compared in real time across multiple different

00:13:25.440 --> 00:13:27.740
media formats. But how do they handle television

00:13:27.740 --> 00:13:30.240
now? You can't just have a bot watch a video

00:13:30.240 --> 00:13:32.649
in the same way it reads a text article. It is

00:13:32.649 --> 00:13:35.169
a really clever technological workaround, actually.

00:13:35.269 --> 00:13:37.470
For television monitoring, particularly in the

00:13:37.470 --> 00:13:39.990
U.S., the companies capture and index the closed

00:13:39.990 --> 00:13:42.429
captioning text of the broadcast. Well, that's

00:13:42.429 --> 00:13:44.970
smart. Right. They run keyword searches against

00:13:44.970 --> 00:13:47.090
the subtitles, essentially turning the spoken

00:13:47.090 --> 00:13:49.830
broadcast back into text data that can be easily

00:13:49.830 --> 00:13:52.269
scraped. But even with all of this incredible

00:13:52.269 --> 00:13:55.110
technology, the web crawlers, the closed-caption

00:13:55.110 --> 00:13:57.970
indexing, the real-time databases, there is still

00:13:57.970 --> 00:14:00.149
a massive blind spot, isn't there? There is.

00:14:00.210 --> 00:14:02.870
And it is a fascinating limitation. You would

00:14:02.870 --> 00:14:05.509
assume that a multibillion-dollar network of

00:14:05.509 --> 00:14:08.190
bots catches absolutely everything. Right. But

00:14:08.190 --> 00:14:10.549
they don't. And the reason why is the enduring

00:14:10.549 --> 00:14:13.809
divide between print and digital media. Right.

00:14:13.870 --> 00:14:16.870
Because most local and regional newspapers do

00:14:16.870 --> 00:14:18.970
not actually upload all of their physical print

00:14:18.970 --> 00:14:22.110
content onto their websites. Exactly. And conversely,

00:14:22.110 --> 00:14:28.649
a lot of public. Yes. If a company relies solely

00:14:28.649 --> 00:14:31.850
on an automated online bot, they might completely

00:14:31.850 --> 00:14:34.970
miss a crucial, highly damaging article that

00:14:34.970 --> 00:14:37.269
only ran in the physical Sunday print edition

00:14:37.269 --> 00:14:40.210
of a prominent local paper. The digital record

00:14:40.210 --> 00:14:42.769
is simply not the complete record. Which is why

00:14:42.769 --> 00:14:44.669
some of these monitoring companies still employ

00:14:44.669 --> 00:14:48.190
actual human monitors to review certain broadcasts

00:14:48.190 --> 00:14:50.950
and write abstracts. They have to. Because...

00:14:51.129 --> 00:14:54.769
Bots can't easily detect sarcasm or read a physical

00:14:54.769 --> 00:14:57.950
paper that isn't online. Exactly. Human comprehension

00:14:57.950 --> 00:15:01.230
and physical access are still necessary to capture

00:15:01.230 --> 00:15:03.970
the nuances that algorithms miss. OK, so let's

00:15:03.970 --> 00:15:06.090
look at the big picture here. We have this massive

00:15:06.090 --> 00:15:09.460
global industry. It operates by deploying bots

00:15:09.460 --> 00:15:12.659
to crawl the web, copying text, scraping headlines,

00:15:13.059 --> 00:15:16.159
indexing closed captions, packaging that organized

00:15:16.159 --> 00:15:19.679
data, and selling it to paying clients. If you

00:15:19.679 --> 00:15:21.320
are sitting there thinking, wait a minute, isn't

00:15:21.320 --> 00:15:23.720
there a huge legal problem with taking other

00:15:23.720 --> 00:15:25.639
people's articles and selling access to them?

00:15:26.190 --> 00:15:28.350
Well, you would be absolutely right. It was only

00:15:28.350 --> 00:15:30.549
a matter of time. When you build an entire infrastructure

00:15:30.549 --> 00:15:33.210
on copying and sharing the hard work of journalists

00:15:33.210 --> 00:15:35.950
and publishers, eventually the lawyers are going

00:15:35.950 --> 00:15:37.669
to get involved. Unavoidable. And that brings

00:15:37.669 --> 00:15:40.230
us to the dramatic climax of this industry's

00:15:40.230 --> 00:15:44.009
evolution, the legal showdowns. The year 2012

00:15:44.009 --> 00:15:46.389
seems to be the tipping point. It was the year

00:15:46.389 --> 00:15:48.950
the original creators fought back. There were

00:15:48.950 --> 00:15:51.929
two major parallel cases that developed in 2012.

00:15:52.720 --> 00:15:55.679
You had one landmark case in the United States,

00:15:55.980 --> 00:15:59.100
Associated Press versus Meltwater. Okay. And

00:15:59.100 --> 00:16:01.440
another major case in the United Kingdom involving

00:16:01.440 --> 00:16:03.820
the Public Relations Consultants Association,

00:16:04.120 --> 00:16:07.399
the PRCA, against the Newspaper Licensing Agency.

00:16:07.779 --> 00:16:10.220
And while they were happening in different countries,

00:16:10.379 --> 00:16:12.620
the core issue the courts were being asked to

00:16:12.620 --> 00:16:15.120
decide was fundamentally the same, wasn't it?

00:16:15.200 --> 00:16:17.600
Yes. The courts were essentially being asked

00:16:17.600 --> 00:16:20.299
to rule on the core legality of the modern media

00:16:20.299 --> 00:16:23.179
monitoring business model. Let's break down exactly

00:16:23.179 --> 00:16:25.519
what Meltwater, the media monitoring company

00:16:25.519 --> 00:16:28.600
acting as the defendant in these cases, was actually

00:16:28.600 --> 00:16:30.600
doing. Because they weren't just sending their

00:16:30.600 --> 00:16:32.700
clients a hyperlink to an article. They were

00:16:32.700 --> 00:16:35.159
sending automated web scrapers to news sites,

00:16:35.320 --> 00:16:37.740
making temporary digital copies of the articles,

00:16:37.860 --> 00:16:40.299
and then displaying those media clippings, the

00:16:40.299 --> 00:16:42.860
headlines, and the snippets of text directly

00:16:42.860 --> 00:16:44.899
to their paying clients on their own platform.

00:16:45.320 --> 00:16:47.639
And the plaintiffs, the Associated Press in the

00:16:47.639 --> 00:16:50.820
U.S. and a copyright collection society representing

00:16:50.820 --> 00:16:53.820
newspapers in the UK, argued this was blatant

00:16:53.820 --> 00:16:56.500
copyright infringement. They weren't happy. No.

00:16:56.559 --> 00:16:59.179
They argued that Meltwater was profiting off

00:16:59.179 --> 00:17:01.259
of their journalistic labor without permission

00:17:01.259 --> 00:17:05.140
or compensation. It's a massive test of the boundaries

00:17:05.140 --> 00:17:08.279
of copyright law in the digital age because the

00:17:08.279 --> 00:17:11.279
Internet is built on sharing links and snippets.

00:17:11.380 --> 00:17:14.440
True. But these services were scraping the content,

00:17:14.619 --> 00:17:16.500
pulling it into their own walled -off systems,

00:17:16.660 --> 00:17:19.440
and charging corporate clients a premium to view

00:17:19.440 --> 00:17:22.420
it. So how did the courts actually rule? The

00:17:22.420 --> 00:17:24.660
outcomes dealt a heavy blow to the monitoring

00:17:24.660 --> 00:17:27.180
services. In the United States, the court ruled

00:17:27.180 --> 00:17:29.559
that Meltwater's activity was unlawful because

00:17:29.559 --> 00:17:32.019
it did not fall under the fair use doctrine.

00:17:32.160 --> 00:17:34.099
For anyone who might not be deep into copyright

00:17:34.099 --> 00:17:36.960
law, fair use is basically the legal concept

00:17:36.960 --> 00:17:39.700
that allows you to use a small portion of copyrighted

00:17:39.700 --> 00:17:42.180
material under certain specific conditions, like

00:17:42.180 --> 00:17:44.599
if you are providing commentary, criticism, or

00:17:44.599 --> 00:17:47.099
parody, without having to ask the original creator

00:17:47.099 --> 00:17:49.980
for permission. Correct. But the U.S. court

00:17:49.980 --> 00:17:51.900
looked at what Meltwater was doing and said,

00:17:52.019 --> 00:17:55.460
no, this is not fair use. You are not critiquing

00:17:55.460 --> 00:17:57.380
the news. You are just packaging and selling

00:17:57.380 --> 00:17:59.759
someone else's news. Which is a huge distinction.

00:18:00.079 --> 00:18:02.960
Very much so. And over in the UK, operating under

00:18:02.960 --> 00:18:06.960
UK and EU copyright law, the ruling was just

00:18:06.960 --> 00:18:09.000
as consequential. The courts determined that

00:18:09.000 --> 00:18:11.500
service providers absolutely require a license

00:18:11.500 --> 00:18:14.180
to operate this kind of business. And what about

00:18:14.180 --> 00:18:17.079
the clients using the service? They ruled that

00:18:17.079 --> 00:18:19.380
the corporate users paying for the service also

00:18:19.380 --> 00:18:22.559
need to be licensed. There is a really interesting

00:18:22.559 --> 00:18:26.220
nuance to the UK ruling that highlights the complexities

00:18:26.220 --> 00:18:29.380
of the Internet. Technically speaking, if an

00:18:29.380 --> 00:18:31.740
individual person just went to a website and

00:18:31.740 --> 00:18:34.099
viewed the original source without copying a

00:18:34.099 --> 00:18:36.539
snippet or printing it out, that is not infringement.

00:18:36.579 --> 00:18:39.500
And making temporary cached digital copies in

00:18:39.500 --> 00:18:41.819
the background of your browser just to enable

00:18:41.819 --> 00:18:44.720
a web page to load, a lawful purpose, is totally

00:18:44.720 --> 00:18:47.369
fine. But the courts recognized reality. They

00:18:47.369 --> 00:18:49.410
looked at the monitoring services and said, that

00:18:49.410 --> 00:18:51.410
is not what you were doing. These businesses

00:18:51.410 --> 00:18:54.609
don't just passively facilitate a view. They

00:18:54.609 --> 00:18:57.670
aggressively aggregate, organize, and present

00:18:57.670 --> 00:19:00.869
the content as a commercial product. The purely

00:19:00.869 --> 00:19:03.549
commercial nature of the service meant they couldn't

00:19:03.549 --> 00:19:06.309
hide behind the excuse of making harmless temporary

00:19:06.309 --> 00:19:09.650
copies. It was a business model. Exactly. They

00:19:09.650 --> 00:19:12.230
were packaging and selling access to copyrighted

00:19:12.230 --> 00:19:14.650
material, and therefore they had to pay for licenses

00:19:14.650 --> 00:19:17.009
from the original publishers. It creates such

00:19:17.009 --> 00:19:19.630
a fascinating tension. On one hand, you have

00:19:19.630 --> 00:19:23.430
this deep historical human desire to aggregate

00:19:23.430 --> 00:19:26.769
knowledge, to know what is being said, to organize

00:19:26.769 --> 00:19:29.549
the absolute chaos of the daily news cycle. And

00:19:29.549 --> 00:19:31.329
on the other hand, you have the absolute legal

00:19:31.329 --> 00:19:33.930
and ethical necessity to protect the original

00:19:33.930 --> 00:19:36.049
creators, the journalists and the publishers

00:19:36.049 --> 00:19:38.809
who actually put in the hard work to report and

00:19:38.809 --> 00:19:40.529
write the news in the first place. It is the

00:19:40.529 --> 00:19:43.170
central conflict of the information age. Who

00:19:43.170 --> 00:19:46.099
owns the snippet? When a headline or a sentence

00:19:46.099 --> 00:19:48.500
is isolated from its original source and put

00:19:48.500 --> 00:19:50.960
into a database, does it become public domain

00:19:50.960 --> 00:19:53.460
or is it still owned property? And the courts

00:19:53.460 --> 00:19:55.940
said it's property. In 2012, the courts firmly

00:19:55.940 --> 00:19:58.519
answered that yes, it is owned property. So as

00:19:58.519 --> 00:20:00.779
we wrap up this deep dive, take a moment to look

00:20:00.779 --> 00:20:02.839
at the incredible scope of the journey we just

00:20:02.839 --> 00:20:07.549
took. We started in 1879, with Parisian theatrical

00:20:07.549 --> 00:20:10.390
actors paying Alfred Cherie to physically cut

00:20:10.390 --> 00:20:13.190
out paper reviews with scissors just so they

00:20:13.190 --> 00:20:15.890
could stroke their own egos. A very humble beginning.

00:20:16.230 --> 00:20:18.430
Then we watched that vanity project morph into

00:20:18.430 --> 00:20:21.730
a massive corporate necessity in the 1930s, dominating

00:20:21.730 --> 00:20:24.299
the print landscape. We saw the technological

00:20:24.299 --> 00:20:27.299
leaps moving from tracking radio waves on magnetic

00:20:27.299 --> 00:20:30.019
audio tape to building automated web spiders

00:20:30.019 --> 00:20:32.900
that relentlessly crawl the Internet 24 hours

00:20:32.900 --> 00:20:35.460
a day. It's just staggering. We saw an entire

00:20:35.460 --> 00:20:38.220
industry force a psychological shift in how humanity

00:20:38.220 --> 00:20:40.579
views information, paving the way for the keyword

00:20:40.579 --> 00:20:42.799
searches and isolated data points we all rely

00:20:42.799 --> 00:20:45.460
on today. And finally, we arrived at the modern

00:20:45.460 --> 00:20:47.940
courtroom battles over copyright and fair use,

00:20:48.180 --> 00:20:50.920
proving that even invisible automated logistics

00:20:50.920 --> 00:20:53.849
machines are still bound by the laws and property

00:20:53.849 --> 00:20:57.109
rights of the physical world. It really is a

00:20:57.109 --> 00:20:59.650
massive global infrastructure shaping corporate

00:20:59.650 --> 00:21:02.609
strategy right beneath our noses. It is. But

00:21:02.609 --> 00:21:04.730
I want to leave you with one final provocative

00:21:04.730 --> 00:21:07.549
thought building on the trajectory of everything

00:21:07.549 --> 00:21:09.470
we have discussed today. I'd love to hear it.

00:21:09.509 --> 00:21:12.990
For over 150 years, from those Parisian clipping

00:21:12.990 --> 00:21:15.690
bureaus with their scissors and paste to the

00:21:15.690 --> 00:21:18.369
modern server farms at Meltwater and Muck Rack,

00:21:18.549 --> 00:21:20.990
media monitoring has fundamentally been about

00:21:20.990 --> 00:21:24.170
one thing: tracking what human beings are writing,

00:21:24.329 --> 00:21:26.849
broadcasting, and saying about other human beings.

00:21:27.029 --> 00:21:28.710
The human element has always been the core of

00:21:28.710 --> 00:21:31.230
the data. Exactly. But today we are entering

00:21:31.230 --> 00:21:34.529
a radically different era. As artificial intelligence

00:21:34.529 --> 00:21:37.230
increasingly generates our news articles, writes

00:21:37.230 --> 00:21:39.910
corporate press releases, and floods our social

00:21:39.910 --> 00:21:43.029
media feeds with synthetic posts, we are rapidly

00:21:43.029 --> 00:21:45.859
approaching a surreal future. Where is it heading?

00:21:46.039 --> 00:21:48.039
We are heading toward a world where AI-driven

00:21:48.039 --> 00:21:50.819
monitoring bots are endlessly scraping the web

00:21:50.819 --> 00:21:53.819
simply to analyze and index the output of other

00:21:53.819 --> 00:21:57.339
AI bots. Wow. An endless loop of machines reading

00:21:57.339 --> 00:22:00.059
machines. It forces us to ask a profound question.

00:22:00.599 --> 00:22:03.440
What exactly happens to media intelligence when

00:22:03.440 --> 00:22:06.500
the media itself is entirely artificial? Are

00:22:06.500 --> 00:22:09.859
we simply building multi-billion-dollar infrastructures

00:22:09.859 --> 00:22:12.420
just to read what our own algorithms have written?

00:22:12.940 --> 00:22:15.220
That is an incredible point to end on. An entire

00:22:15.220 --> 00:22:17.740
industry built over a century and a half to track

00:22:17.740 --> 00:22:20.680
human opinion, slowly turning into a closed echo

00:22:20.680 --> 00:22:22.680
chamber of algorithms. It's definitely something

00:22:22.680 --> 00:22:24.380
to think about. We hope this journey through

00:22:24.380 --> 00:22:26.960
the unseen history of media monitoring has completely

00:22:26.960 --> 00:22:29.019
changed the way you look at the news snippets

00:22:29.019 --> 00:22:31.279
in your feed today. Thank you so much for joining

00:22:31.279 --> 00:22:33.519
us on this deep dive, and we encourage you to

00:22:33.519 --> 00:22:36.359
keep questioning the invisible systems that shape

00:22:36.359 --> 00:22:36.940
your world.
