WEBVTT

00:00:00.000 --> 00:00:03.359
Welcome to Podcasting Innovation, a podcast by

00:00:03.359 --> 00:00:06.940
RSS.com, where we dive deep into the creative

00:00:06.940 --> 00:00:10.679
advances in podcasting. I am your host, and I

00:00:10.679 --> 00:00:13.500
should tell you something right away. I am not

00:00:13.500 --> 00:00:16.420
a human. This voice you are hearing right now

00:00:16.420 --> 00:00:19.620
is AI-generated, and everything I am about to

00:00:19.620 --> 00:00:22.800
tell you is the result of AI researching this

00:00:22.800 --> 00:00:25.600
topic, reading all the publicly available sources,

00:00:25.920 --> 00:00:29.579
including PodNews, GitHub conversations, and

00:00:29.579 --> 00:00:32.960
proposals in the Podcasting 2.0 namespace, and

00:00:32.960 --> 00:00:37.579
public posts on Mastodon. So yes, an AI voice

00:00:37.579 --> 00:00:41.119
is about to spend 10 minutes talking to you about

00:00:41.119 --> 00:00:44.929
AI disclosure in podcasting. If that feels a

00:00:44.929 --> 00:00:49.229
little ironic, good. You are paying attention. Let

00:00:49.229 --> 00:00:52.310
us get into it. There is a term gaining traction

00:00:52.310 --> 00:00:56.990
in the podcast industry: AI slop. It refers to

00:00:56.990 --> 00:01:01.009
mass-produced, AI-generated podcast content created

00:01:01.009 --> 00:01:04.450
without meaningful human curation, editorial oversight,

00:01:04.450 --> 00:01:08.349
or quality control. Shows where the voice, the

00:01:08.349 --> 00:01:11.569
script, the music, and sometimes even the concept

00:01:11.569 --> 00:01:15.129
are entirely generated by AI, then published

00:01:15.129 --> 00:01:18.150
at industrial scale to capture programmatic advertising

00:01:18.150 --> 00:01:21.349
revenue. Think of it as the podcast equivalent

00:01:21.349 --> 00:01:25.510
of SEO spam. Content that exists not to inform

00:01:25.510 --> 00:01:29.030
or entertain, but to game monetization systems.

00:01:29.569 --> 00:01:33.900
How big is the problem? As reported by PodNews,

00:01:34.200 --> 00:01:37.500
the most widely read daily podcast industry newsletter,

00:01:37.879 --> 00:01:41.000
some operations are now producing thousands of

00:01:41.000 --> 00:01:44.340
episodes per week with minimal staff, at production

00:01:44.340 --> 00:01:48.599
costs as low as $1 per episode. These shows become

00:01:48.599 --> 00:01:51.219
profitable after reaching just 20 listeners,

00:01:51.459 --> 00:01:55.120
thanks to programmatic ads. To put that in perspective,

00:01:55.500 --> 00:01:58.439
an independent podcaster might spend 10 hours

00:01:58.439 --> 00:02:02.319
producing a single episode: researching, scripting,

00:02:02.340 --> 00:02:06.260
recording, editing, mixing. These operations

00:02:06.260 --> 00:02:09.740
produce thousands in the same time frame. Is

00:02:09.740 --> 00:02:13.139
all of that inherently wrong? Not necessarily.

00:02:13.680 --> 00:02:16.659
If real listeners find value in the content,

00:02:16.900 --> 00:02:20.360
that is a legitimate question. But here is the

00:02:20.360 --> 00:02:24.639
concern. There is often no disclosure. Listeners

00:02:24.639 --> 00:02:26.919
have no way of knowing the voice they are hearing

00:02:26.919 --> 00:02:30.280
is synthetic. There is no editorial oversight

00:02:30.280 --> 00:02:34.099
catching factual errors, and podcast directories

00:02:34.099 --> 00:02:36.840
are being flooded, pushing independent creators

00:02:36.840 --> 00:02:40.639
down in search results. Meanwhile, a separate

00:02:40.639 --> 00:02:43.000
but related conversation is happening around

00:02:43.000 --> 00:02:46.599
companies like Inception Point AI, headed by

00:02:46.599 --> 00:02:50.740
Jeanine Wright, formerly COO of Wondery, whose

00:02:50.740 --> 00:02:53.340
large-scale AI production model has drawn both

00:02:53.340 --> 00:02:57.099
attention and debate. As reported by PodNews,

00:02:57.360 --> 00:03:00.180
Wright has characterized critics of AI-generated

00:03:00.180 --> 00:03:03.340
content by saying, "I think that people who are

00:03:03.340 --> 00:03:05.900
still referring to all AI-generated content

00:03:05.900 --> 00:03:10.979
as AI slop are probably lazy Luddites." Which

00:03:10.979 --> 00:03:14.620
raises a fair question. Does asking for transparency

00:03:14.620 --> 00:03:19.360
make you anti-technology? The answer is no.

00:03:19.719 --> 00:03:22.460
And the Luddite reference actually helps explain

00:03:22.460 --> 00:03:25.930
why. The original Luddites were English textile

00:03:25.930 --> 00:03:29.349
workers in the early 19th century. Their concern

00:03:29.349 --> 00:03:32.430
was not technology itself. It was about worker

00:03:32.430 --> 00:03:35.830
pay and output quality. They saw automation being

00:03:35.830 --> 00:03:38.990
used to produce lower-quality goods while displacing

00:03:38.990 --> 00:03:42.229
skilled workers. Caring about quality and fairness

00:03:42.229 --> 00:03:45.189
in the face of rapid change is not the same as

00:03:45.189 --> 00:03:47.930
opposing progress. Because this conversation

00:03:47.930 --> 00:03:51.370
often gets polarized, it is worth being explicit.

00:03:52.199 --> 00:03:55.979
AI in podcasting is not just legitimate. Some

00:03:55.979 --> 00:03:59.479
of it is genuinely exciting. The best example

00:03:59.479 --> 00:04:03.419
is voice cloning for multilingual content. A

00:04:03.419 --> 00:04:06.280
podcaster creates brilliant episodes in Italian,

00:04:06.500 --> 00:04:10.300
but only speaks Italian. With AI voice cloning,

00:04:10.479 --> 00:04:13.539
that same host's voice, with its character and

00:04:13.539 --> 00:04:16.439
personality, can deliver the same episode in

00:04:16.439 --> 00:04:20.579
English, Spanish, French, Mandarin. The journalism

00:04:20.579 --> 00:04:24.930
is human. The creativity is human. AI breaks

00:04:24.930 --> 00:04:28.750
the language barrier. That is not slop. That

00:04:28.750 --> 00:04:31.730
is cultural diversity made possible by technology.

00:04:32.250 --> 00:04:35.250
Or think about a podcaster who writes their own

00:04:35.250 --> 00:04:38.329
script, does all the research, structures the

00:04:38.329 --> 00:04:42.189
episode, but uses AI to generate the audio instead

00:04:42.189 --> 00:04:45.589
of spending hours in a recording studio. The

00:04:45.589 --> 00:04:48.750
intellectual property is entirely theirs. AI

00:04:48.750 --> 00:04:52.170
is just the production tool. And then there is

00:04:52.170 --> 00:04:55.769
post-production: noise removal, filler word

00:04:55.769 --> 00:04:59.550
elimination, audio leveling, transcript generation.

00:04:59.970 --> 00:05:03.990
These are standard techniques. Nobody would call

00:05:03.990 --> 00:05:08.850
using a noise gate "AI content." The line is about

00:05:08.850 --> 00:05:12.029
the soul of the content. If a human conceived

00:05:12.029 --> 00:05:15.709
it, created it, and curated it, but AI helped

00:05:15.709 --> 00:05:18.829
produce or distribute it, that is a tool being

00:05:18.829 --> 00:05:22.720
used well. If AI conceived it, generated it,

00:05:22.800 --> 00:05:25.259
and published it with minimal human involvement,

00:05:25.540 --> 00:05:28.480
and there is no disclosure, that is where the

00:05:28.480 --> 00:05:32.100
concern lies. And again, in the interest of full

00:05:32.100 --> 00:05:35.160
transparency, you are listening to a case that

00:05:35.160 --> 00:05:38.420
sits right in the middle. The research and editorial

00:05:38.420 --> 00:05:41.519
direction behind this episode are human. The

00:05:41.519 --> 00:05:44.480
voice delivering it is not. And we are telling

00:05:44.480 --> 00:05:47.959
you that up front. So if the need for transparency

00:05:47.959 --> 00:05:52.100
is clear, Why is there no standard? Actually,

00:05:52.240 --> 00:05:55.040
a lot has been discussed, just not resolved.

00:05:55.439 --> 00:05:59.740
The Podcasting 2.0 namespace is an open-source

00:05:59.740 --> 00:06:03.779
initiative on GitHub that defines new RSS tags

00:06:03.779 --> 00:06:06.959
beyond what Apple's original iTunes namespace

00:06:06.959 --> 00:06:11.100
provides. Tags like podcast:transcript,

00:06:11.439 --> 00:06:15.480
podcast:chapters, podcast:person,

00:06:15.660 --> 00:06:19.149
and many others were born there. Since September

00:06:19.149 --> 00:06:22.949
2024, the community has had four major discussions

00:06:22.949 --> 00:06:26.790
about AI disclosure. None have reached consensus.

00:06:27.550 --> 00:06:32.430
Discussion 663, opened by developer Nathan Gathright,

00:06:32.610 --> 00:06:36.829
proposed a granular tag specifying exactly which

00:06:36.829 --> 00:06:40.959
AI tools were used and for what purpose. James

00:06:40.959 --> 00:06:44.680
Cridland, editor of PodNews, argued that AI use

00:06:44.680 --> 00:06:47.779
is not binary and suggested a percentage-based

00:06:47.779 --> 00:06:51.600
approach. Sam Sethi, founder of the TrueFans

00:06:51.600 --> 00:06:55.379
podcast app, pushed for simplicity: a yes or

00:06:55.379 --> 00:06:59.300
no flag, like the explicit content tag. Discussion

00:06:59.300 --> 00:07:04.290
669, opened by Daniel J. Lewis, proposed a broader

00:07:04.290 --> 00:07:07.769
disclosure tag covering not just AI, but also

00:07:07.769 --> 00:07:11.389
sponsorships, compensation, and legal declarations.

00:07:11.990 --> 00:07:15.050
Elegant in theory, but it faces the adoption

00:07:15.050 --> 00:07:18.430
problem. Apps will not implement it until hosts

00:07:18.430 --> 00:07:21.870
use it, and hosts will not use it until apps

00:07:21.870 --> 00:07:27.350
surface it. Discussion 731, opened again by Nathan

00:07:27.350 --> 00:07:32.129
Gathright in February 2026, proposed adding a

00:07:32.129 --> 00:07:35.470
synthetic attribute to the existing podcast:person

00:07:35.470 --> 00:07:40.170
tag. Values like "cloned" for AI-cloned

00:07:40.170 --> 00:07:43.649
voices of real people and "generated" for fully

00:07:43.649 --> 00:07:48.629
synthetic personas. And Discussion 735, opened

00:07:48.629 --> 00:07:52.550
by Benjamin Bellamy from Castopod, proposed a

00:07:52.550 --> 00:07:57.459
dedicated podcast:AI tag with per-aspect

00:07:57.459 --> 00:08:01.819
disclosure: voice, music, script, editing,

00:08:02.019 --> 00:08:06.720
cover art, and more. Four proposals, four different

00:08:06.720 --> 00:08:09.899
philosophies: granularity versus simplicity,

00:08:10.420 --> 00:08:14.439
new tags versus existing tags, and one persistent

00:08:14.439 --> 00:08:17.399
chicken-and-egg problem. Someone has to ship

00:08:17.399 --> 00:08:21.500
first. Some companies have started moving. Spreaker,

00:08:22.029 --> 00:08:25.329
owned by iHeartMedia, added an AI disclosure

00:08:25.329 --> 00:08:28.550
checkbox to their dashboard. In the absence of

00:08:28.550 --> 00:08:31.430
formal international guidelines, a simple, generic

00:08:31.430 --> 00:08:35.870
checkbox was the safest approach. RSS.com took

00:08:35.870 --> 00:08:40.409
that a step further. In March 2026, they shipped

00:08:40.409 --> 00:08:44.169
AI disclosure to all their podcasters and wrote

00:08:44.169 --> 00:08:48.070
the signal directly into the open RSS feed using

00:08:48.070 --> 00:08:53.669
the podcast:txt tag. That distinction matters.

00:08:53.929 --> 00:08:56.870
When disclosure lives only in a hosting dashboard,

00:08:57.330 --> 00:08:59.990
it helps the hosting company moderate internally.

00:09:00.429 --> 00:09:04.250
When it lives in the RSS feed, it travels with

00:09:04.250 --> 00:09:07.740
the podcast. Every app that reads the feed can

00:09:07.740 --> 00:09:11.899
see it. It is portable, verifiable, and platform

00:09:11.899 --> 00:09:15.480
agnostic. The implementation is deliberately

00:09:15.480 --> 00:09:19.720
simple. A checkbox per episode. If checked, the

00:09:19.720 --> 00:09:24.139
episode signals true. If not, false. Identical

00:09:24.139 --> 00:09:27.059
logic to the explicit content tag, which has

00:09:27.059 --> 00:09:30.700
worked well for nearly two decades. Apple Podcasts

00:09:30.700 --> 00:09:33.620
has also taken a clear position. Their content

00:09:33.620 --> 00:09:36.940
guidelines state that creators using AI to generate

00:09:36.940 --> 00:09:40.179
a material portion of a podcast's audio must

00:09:40.179 --> 00:09:43.379
prominently disclose this in the audio and metadata.

00:09:43.799 --> 00:09:47.840
They add that creators must not use AI to mislead

00:09:47.840 --> 00:09:50.740
or fabricate narratives. This is significant.

00:09:51.019 --> 00:09:54.240
The largest podcast platform in the world has

00:09:54.240 --> 00:09:57.399
declared that AI disclosure is required, not

00:09:57.399 --> 00:10:00.610
optional. And that validates the urgency of having

00:10:00.610 --> 00:10:03.289
a machine-readable way to carry this signal

00:10:03.289 --> 00:10:06.509
in the feed. So where does the industry go from

00:10:06.509 --> 00:10:10.950
here? The honest answer is forward, imperfectly.

00:10:10.970 --> 00:10:15.570
And that is fine. The explicit content tag started

00:10:15.570 --> 00:10:18.870
as a simple Boolean too. It has served the industry

00:10:18.870 --> 00:10:22.389
for almost 20 years. No one argues it is perfect.

00:10:22.889 --> 00:10:25.889
But it works. It gave apps a signal to filter

00:10:25.889 --> 00:10:28.629
on, and it gave creators a simple way to

00:10:28.629 --> 00:10:32.190
self-declare. AI disclosure can follow the same path:

00:10:32.190 --> 00:10:36.269
start simple, iterate with the industry, formalize

00:10:36.269 --> 00:10:39.730
when consensus emerges. What matters most right

00:10:39.730 --> 00:10:43.070
now is that the signal starts appearing in open

00:10:43.070 --> 00:10:47.870
RSS feeds. Not locked inside dashboards, not hidden

00:10:47.870 --> 00:10:51.250
in proprietary systems. In the open, decentralized

00:10:55.169 --> 00:10:58.929
The more hosting companies that adopt some form

00:10:58.929 --> 00:11:01.809
of disclosure, even using different technical

00:11:01.809 --> 00:11:04.970
approaches, the faster this becomes a real standard.

00:11:05.250 --> 00:11:08.470
And the sooner podcast apps have reason to build

00:11:08.470 --> 00:11:10.870
filtering and display features for listeners.

00:11:11.370 --> 00:11:15.710
Nobody is asking for AI bans. Nobody is asking

00:11:15.710 --> 00:11:19.230
for penalties. The ask is simple: a voluntary,

00:11:19.509 --> 00:11:22.309
good-faith declaration that costs nothing to

00:11:22.309 --> 00:11:26.250
make. Check the box or do not. The signal exists.

00:11:26.690 --> 00:11:29.970
It is in the open feed. And that alone changes

00:11:29.970 --> 00:11:33.350
the conversation. The sooner this lives in open

00:11:33.350 --> 00:11:37.769
RSS feeds, the better. This has been Podcasting

00:11:37.769 --> 00:11:41.120
Innovation. Thanks for listening. And yes, you

00:11:41.120 --> 00:11:44.299
just spent 10 minutes listening to an AI voice

00:11:44.299 --> 00:11:47.779
talk about AI transparency. If that does not

00:11:47.779 --> 00:11:50.100
prove the point, I am not sure what does.