WEBVTT

00:00:00.000 --> 00:00:03.520
In 2026, we've handed AI the keys to our most

00:00:03.520 --> 00:00:07.660
sensitive databases. But what if a single harmless-

00:00:07.660 --> 00:00:09.919
looking smiley face emoji could bring the whole

00:00:09.919 --> 00:00:13.679
system down? Welcome to this

00:00:13.679 --> 00:00:15.179
deep dive. I'm really glad you could join us

00:00:15.179 --> 00:00:17.940
today. It's great to be here. We have some incredibly

00:00:17.940 --> 00:00:20.399
fascinating material to cover. We really do.

00:00:20.519 --> 00:00:22.719
Our source material today is Max Anne's March

00:00:22.719 --> 00:00:27.129
2026 AI Security Guide. It's an extensive look

00:00:27.129 --> 00:00:29.469
at modern pen testing and defense strategies.

00:00:29.730 --> 00:00:33.030
Right. It basically maps out the completely untamed

00:00:33.030 --> 00:00:36.350
Wild West of 2026 AI security. And Wild West

00:00:36.350 --> 00:00:38.409
is honestly the perfect term for it. Yeah, it

00:00:38.409 --> 00:00:40.310
really is. We're going to uncover why prompt

00:00:40.310 --> 00:00:42.450
injection is essentially the new SQL injection.

00:00:42.670 --> 00:00:44.969
We'll examine the massive vulnerabilities hiding

00:00:44.969 --> 00:00:47.770
inside overprivileged AI agents. We'll see how

00:00:47.770 --> 00:00:50.530
autonomous AI hackers are currently winning bug

00:00:50.530 --> 00:00:52.850
bounties. Let's start by unpacking this massive

00:00:52.850 --> 00:00:55.759
adoption gap we're seeing. The guide highlights a staggering

00:00:55.759 --> 00:00:58.259
statistic right up front. Yeah, that 72% number.

00:00:58.380 --> 00:01:02.399
Exactly. About 72% of enterprises currently

00:01:02.399 --> 00:01:05.319
use AI agents in their daily workflows, but only

00:01:05.319 --> 00:01:09.939
29% actually have AI-specific security in place.

00:01:10.180 --> 00:01:13.010
It's a genuinely terrifying gap. I mean, the

00:01:13.010 --> 00:01:15.629
vast majority of companies are flying completely

00:01:15.629 --> 00:01:17.750
blind right now. Developers are under immense

00:01:17.750 --> 00:01:20.049
pressure. They have to ship code quickly. Right.

00:01:20.129 --> 00:01:21.829
They have to meet market demands. So they're

00:01:21.829 --> 00:01:24.430
largely ignoring the underlying security implications.

00:01:24.469 --> 00:01:27.430
They just bolt an AI onto their database. Yeah,

00:01:27.430 --> 00:01:30.090
call it a day. Exactly. It feels exactly like

00:01:30.090 --> 00:01:32.709
the early web era with SQL injections. Back then,

00:01:32.790 --> 00:01:35.069
people were building dynamic websites incredibly

00:01:35.069 --> 00:01:37.349
fast. But they were leaving massive database

00:01:37.349 --> 00:01:40.450
backdoors wide open. And that exact same historical

00:01:40.450 --> 00:01:43.689
moment is repeating itself today. Except today,

00:01:43.989 --> 00:01:46.870
the primary target isn't just a static database.

00:01:47.170 --> 00:01:50.689
Right. It's the automated decision-making AI

00:01:50.689 --> 00:01:53.750
system itself. Jason Haddix from Arcanum Information

00:01:53.750 --> 00:01:55.769
Security makes a brilliant point about this.

00:01:55.930 --> 00:01:58.489
He states that AI pen testing is the most critical

00:01:58.489 --> 00:02:01.519
skill of 2026. Right. And he makes a really important

00:02:01.519 --> 00:02:04.120
distinction there. People constantly mix up AI

00:02:04.459 --> 00:02:08.080
red teaming with actual AI pen testing. Red

00:02:08.080 --> 00:02:09.759
teaming is what we always hear about in the news.

00:02:09.879 --> 00:02:12.300
Yeah, exactly. That involves attacking the model

00:02:12.300 --> 00:02:15.259
itself. You try to force a bad output or make

00:02:15.259 --> 00:02:17.500
the chatbot say bad words. We're checking the

00:02:17.500 --> 00:02:20.400
brain of the AI system. Right. Those jailbreaks

00:02:20.400 --> 00:02:22.379
definitely still matter for brand reputation.

00:02:22.659 --> 00:02:27.000
But red teaming only tests one single isolated

00:02:27.000 --> 00:02:30.810
layer. True AI pen testing is a holistic assessment.

00:02:31.129 --> 00:02:35.500
Yes. It checks the APIs, the pipelines. The infrastructure.

00:02:35.800 --> 00:02:38.139
So if red teaming checks the brain, pen testing

00:02:38.139 --> 00:02:40.819
checks the whole body. That analogy is spot on.

00:02:40.900 --> 00:02:42.900
The brain is just the language model generating

00:02:42.900 --> 00:02:46.960
text. The body includes the network of APIs fetching

00:02:46.960 --> 00:02:49.340
external data. It includes the cloud storage

00:02:49.340 --> 00:02:51.599
buckets holding private company documents. If

00:02:51.599 --> 00:02:53.680
you only test the brain, you're missing 90%

00:02:53.680 --> 00:02:56.860
of the actual risk. Exactly. Why do so many companies

00:02:56.860 --> 00:02:59.539
only bother testing the brain? Mostly because

00:02:59.539 --> 00:03:02.139
the brain is the public-facing part. PR disasters

00:03:02.139 --> 00:03:04.659
from bad outputs are immediate. They're embarrassing.

00:03:04.840 --> 00:03:07.139
Companies rush to prevent the AI from saying

00:03:07.139 --> 00:03:09.800
offensive things. They completely forget that

00:03:09.800 --> 00:03:12.099
the silent background data pipes hold the real

00:03:12.099 --> 00:03:14.599
value. Attackers don't care if your AI uses bad

00:03:14.599 --> 00:03:17.379
words. They care about the private customer data

00:03:17.379 --> 00:03:19.639
flowing through its back-end connections. So

00:03:19.639 --> 00:03:21.659
they secure the chatbot but leave the back door

00:03:21.659 --> 00:03:23.860
wide open. Right. And that brings us directly

00:03:23.860 --> 00:03:26.419
to the rapidly expanding attack surface. Most

00:03:26.419 --> 00:03:29.060
people think AI security just means tricking

00:03:29.060 --> 00:03:32.219
a text box. But a real production-ready AI

00:03:32.219 --> 00:03:35.460
system has six distinct layers of vulnerability.

00:03:35.860 --> 00:03:38.099
First, you have the underlying model itself,

00:03:38.319 --> 00:03:40.439
including the system prompts. Right, that's the

00:03:40.439 --> 00:03:42.900
foundational layer. Next, you have the API connections

00:03:42.900 --> 00:03:45.360
and internal webhooks. Then you have the data

00:03:45.360 --> 00:03:47.259
aggregators running quietly in the background.

00:03:47.460 --> 00:03:49.580
Meaning your private databases and your complex

00:03:49.580 --> 00:03:52.800
RAG pipelines. Then we hit the critical integrations

00:03:52.800 --> 00:03:55.680
layer, like Zapier connections or your internal

00:03:55.680 --> 00:03:58.219
CRM. You also have the actual application sitting

00:03:58.219 --> 00:04:00.840
directly on top. The web or mobile interface.

00:04:01.180 --> 00:04:03.620
Exactly. And finally, the foundational cloud

00:04:03.620 --> 00:04:06.699
infrastructure layer. I still wrestle with trusting

00:04:06.699 --> 00:04:10.969
basic API integrations myself. Yeah. It

00:04:10.969 --> 00:04:12.849
always feels like you're leaving a digital window

00:04:12.849 --> 00:04:15.349
unlocked. You are absolutely right to feel a

00:04:15.349 --> 00:04:17.470
little paranoid about that. Yeah. The source

00:04:17.470 --> 00:04:19.910
guide shares a brilliant real-world warning

00:04:19.910 --> 00:04:22.149
sign about this. This was the company that rushed

00:04:22.149 --> 00:04:24.670
an internal AI assistant into production, right?

00:04:24.730 --> 00:04:27.610
Yeah. They just wanted to help their sales team

00:04:27.610 --> 00:04:30.329
draft emails faster. They didn't consult their

00:04:30.329 --> 00:04:32.589
security team at all. And months later, they

00:04:32.589 --> 00:04:36.129
discovered a massive silent data leak. It was

00:04:36.129 --> 00:04:38.829
devastating. The AI assistant had been quietly

00:04:38.829 --> 00:04:41.930
sending sensitive sales data out. Directly to

00:04:41.930 --> 00:04:44.930
third-party AI providers. Right. Storing it

00:04:44.930 --> 00:04:47.529
on external servers. Nobody meant for it to happen.

00:04:47.589 --> 00:04:49.449
It happened because security wasn't involved

00:04:49.449 --> 00:04:52.230
during the integrations phase. Exactly. This

00:04:52.230 --> 00:04:54.850
is why professional AI pen testers follow a strict

00:04:54.850 --> 00:04:57.490
seven-step methodology. They move through the

00:04:57.490 --> 00:04:59.939
attack surface logically. Like a burglar checking

00:04:59.939 --> 00:05:02.639
every single entry point. Right. Step one involves

00:05:02.639 --> 00:05:05.180
testing the basic external system inputs. Step

00:05:05.180 --> 00:05:07.660
two maps out the entire connected digital ecosystem.

00:05:07.959 --> 00:05:10.480
Step three is finally attacking the actual AI

00:05:10.480 --> 00:05:14.079
model itself. Step four focuses heavily on advanced

00:05:14.079 --> 00:05:17.339
prompt engineering. Step five targets the underlying

00:05:17.339 --> 00:05:20.939
data layer and vector stores. Step six is exploiting

00:05:20.939 --> 00:05:23.920
the application front end. And step seven is

00:05:23.920 --> 00:05:26.339
pivoting to move laterally across the network.

00:05:26.759 --> 00:05:28.860
What actually happens when attackers decide to

00:05:28.860 --> 00:05:31.500
skip the model entirely? They look for weaknesses

00:05:31.500 --> 00:05:34.699
in the surrounding infrastructure instead. They

00:05:34.699 --> 00:05:38.160
might find an unsecured API endpoint that the

00:05:38.160 --> 00:05:41.480
AI uses. They send commands directly to that

00:05:41.480 --> 00:05:43.779
endpoint, completely ignoring the language model.

00:05:43.959 --> 00:05:46.920
They bypass the AI entirely and just exploit

00:05:46.920 --> 00:05:49.279
the connected data pipes. Precisely. And this

00:05:49.279 --> 00:05:51.300
brings us to the biggest threat of all. Prompt

00:05:51.300 --> 00:05:55.079
injection. It is essentially the new SQL injection

00:05:55.079 --> 00:05:58.029
of our era. It's the defining vulnerability right

00:05:58.029 --> 00:06:01.069
now. Attackers hide malicious instructions inside

00:06:01.069 --> 00:06:04.230
natural language inputs. The AI processes these

00:06:04.230 --> 00:06:06.250
hidden instructions as completely normal input.

00:06:06.430 --> 00:06:09.110
Exactly. There are four distinct attack primitives

00:06:09.110 --> 00:06:11.310
we need to understand here. Think of them like

00:06:11.310 --> 00:06:13.709
stacking Lego blocks of data. First, you have

00:06:13.709 --> 00:06:16.629
the actual intent of the attack. Right. The intent

00:06:16.629 --> 00:06:19.149
could be extracting highly sensitive private

00:06:19.149 --> 00:06:22.329
emails. Second, you have the specific delivery

00:06:22.329 --> 00:06:24.779
technique. Disguising the attack as a harmless

00:06:24.779 --> 00:06:27.439
role-playing scenario. Yep. Third, you have

00:06:27.439 --> 00:06:30.279
evasion techniques to bypass standard safety

00:06:30.279 --> 00:06:34.420
filters. Like using complex encoding or obscure

00:06:34.420 --> 00:06:36.920
foreign languages. And finally, you have the

00:06:36.920 --> 00:06:40.300
specific utility add-ons. Small additions designed

00:06:40.300 --> 00:06:42.740
to bypass system guardrails. When you combine

00:06:42.740 --> 00:06:45.399
these four Lego blocks, you get incredibly complex

00:06:45.399 --> 00:06:47.879
attack paths. This is where it gets really wild.

00:06:48.379 --> 00:06:50.920
Attackers are using something called emoji smuggling.

00:06:51.019 --> 00:06:53.160
This one is so clever. Computers don't actually

00:06:53.160 --> 00:06:55.819
read emojis as little pictures. No, they read

00:06:55.819 --> 00:06:58.300
the underlying Unicode characters. So an attacker

00:06:58.300 --> 00:07:01.139
encodes malicious textual instructions inside

00:07:01.139 --> 00:07:04.259
the Unicode metadata of an emoji. They hide a

00:07:04.259 --> 00:07:06.779
command to delete files inside a harmless smiley

00:07:06.779 --> 00:07:09.420
face. The human user just sees a normal smiley

00:07:09.420 --> 00:07:12.290
face. But the AI reads the encoded text data

00:07:12.290 --> 00:07:14.529
and follows the malicious instructions. They're

00:07:14.529 --> 00:07:17.410
also using link smuggling. Right. Crafting deceptive

00:07:17.410 --> 00:07:19.790
URLs that steal conversation data when clicked.

00:07:19.990 --> 00:07:23.310
But the most dangerous risk of 2026 is indirect

00:07:23.310 --> 00:07:26.470
injection via retrieval. Yes. That's when an

00:07:26.470 --> 00:07:29.269
attacker poisons a document inside a database.

00:07:29.589 --> 00:07:31.709
Or a web page the internal agent will eventually

00:07:31.709 --> 00:07:33.790
read. The attacker doesn't even interact with

00:07:33.790 --> 00:07:36.089
the AI directly. They just plant a booby-trapped

00:07:36.089 --> 00:07:39.069
resume in an HR database. Why is this indirect

00:07:39.069 --> 00:07:41.730
injection considered such a devastating threat?

00:07:42.009 --> 00:07:44.750
Because it operates completely silently. You

00:07:44.750 --> 00:07:47.209
poison one document, and it waits patiently for

00:07:47.209 --> 00:07:49.670
months. When the AI eventually pulls it, it gets

00:07:49.670 --> 00:07:52.379
hijacked mid -execution. One poison file sits

00:07:52.379 --> 00:07:55.540
in memory and infects every future query. Exactly.

00:07:55.800 --> 00:07:58.480
This is why beginners desperately need safe places

00:07:58.480 --> 00:08:01.060
to practice. You have to start practicing offensive

00:08:01.060 --> 00:08:04.620
AI security to build real intuition. The guide

00:08:04.620 --> 00:08:06.620
recommends a very specific practice progression.

00:08:07.259 --> 00:08:11.100
Step one is Gandalf by Lakera. That focuses

00:08:11.100 --> 00:08:13.720
on teaching basic prompt manipulation. Yeah,

00:08:13.720 --> 00:08:15.699
learning how these models behave under persistent

00:08:15.699 --> 00:08:17.819
pressure. Then you step up to Agent Breaker.

00:08:17.980 --> 00:08:20.170
Right. This dramatically raises the overall

00:08:20.170 --> 00:08:22.769
difficulty. You're facing multi -step AI agents

00:08:22.769 --> 00:08:25.829
that have memory and actual tools. Why does practicing

00:08:25.829 --> 00:08:28.029
on these multi-step agents completely change

00:08:28.029 --> 00:08:31.069
the game? Single-turn chatbots just reply with

00:08:31.069 --> 00:08:33.950
text. Multi-step agents actually browse the

00:08:33.950 --> 00:08:36.889
web, read documents, and call APIs. You have

00:08:36.889 --> 00:08:38.809
to understand how each separate step connects.

00:08:39.169 --> 00:08:42.090
Agents take actual actions, meaning you can hijack

00:08:42.090 --> 00:08:45.000
them mid -task. You nailed it. And that leads

00:08:45.000 --> 00:08:48.159
to the final practice stage. The auto parts CTF.

00:08:48.279 --> 00:08:50.580
The pro-level, self-hosted capture-the-flag.

00:08:50.679 --> 00:08:52.980
It's built from a real client pen testing engagement.

00:08:53.320 --> 00:08:56.240
It features deep business logic flaws and actual

00:08:56.240 --> 00:08:59.179
data exfiltration paths. It perfectly highlights

00:08:59.179 --> 00:09:01.779
the next massive vulnerability we must discuss.

00:09:02.039 --> 00:09:05.700
The MCP security blind spot. Let's define MCP

00:09:05.700 --> 00:09:08.480
plainly. It's the digital bridge connecting AI

00:09:08.480 --> 00:09:11.620
agents to real world tools. Exactly. It connects

00:09:11.620 --> 00:09:14.700
the AI to databases, emails, and CRM platforms.

00:09:14.899 --> 00:09:17.440
It turns a chatbot into a worker. But it introduces

00:09:17.440 --> 00:09:20.159
a massive security gap. There is no standard

00:09:20.159 --> 00:09:22.440
role-based access control right now. The default

00:09:22.440 --> 00:09:24.740
is giving the AI far more access than needed.

00:09:24.980 --> 00:09:27.860
Right. Why do developers give AI full write access

00:09:27.860 --> 00:09:30.240
initially during setup? Developers are under

00:09:30.240 --> 00:09:34.220
immense pressure to ship incredibly fast. Configuring

00:09:34.220 --> 00:09:36.480
granular read-only permissions takes time and

00:09:36.480 --> 00:09:39.820
testing. Giving full access just works immediately.

00:09:40.120 --> 00:09:42.500
It's just faster to grant full access than configure

00:09:42.500 --> 00:09:45.799
strict rules. Unfortunately, yes. And this creates

00:09:45.799 --> 00:09:48.059
a truly catastrophic attack chain. It starts

00:09:48.059 --> 00:09:50.299
with a simple prompt injection attack. Which

00:09:50.299 --> 00:09:53.480
leads directly to a compromised internal AI agent.

00:09:53.850 --> 00:09:56.710
That compromised agent uses its overprivileged

00:09:56.710 --> 00:09:59.990
MCP connection to wreak havoc. Suddenly, the

00:09:59.990 --> 00:10:02.929
attacker has full write access to highly sensitive

00:10:02.929 --> 00:10:05.669
medical data. Or secure financial records. This

00:10:05.669 --> 00:10:08.470
is why the golden rule of 2026 is zero trust.

00:10:08.809 --> 00:10:11.710
Never give an AI agent more access than it absolutely

00:10:11.710 --> 00:10:14.570
needs.

00:10:14.710 --> 00:10:17.029
We need to

00:10:17.029 --> 00:10:19.789
talk about AI hacking itself. Yeah. Something

00:10:19.789 --> 00:10:21.529
important is happening right now, and it's making

00:10:21.529 --> 00:10:24.850
people very uneasy. Autonomous AI tools are actively

00:10:24.850 --> 00:10:27.769
competing in human bug bounty programs. And they

00:10:27.769 --> 00:10:29.590
aren't just participating, they're winning them.

00:10:29.750 --> 00:10:33.549
Against top-tier human hackers. Tools like XBOW

00:10:33.549 --> 00:10:36.330
and Arachne are topping the global leaderboards

00:10:36.330 --> 00:10:40.730
today. They're finding real, complex flaws remarkably

00:10:40.730 --> 00:10:44.750
fast. Whoa, imagine an automated AI hacker scaling

00:10:44.750 --> 00:10:47.450
to a billion queries a second against unsecured

00:10:47.450 --> 00:10:50.620
networks. It is genuinely a terrifying thought.

00:10:50.620 --> 00:10:53.120
The sheer speed of these tools is unprecedented.

00:10:54.019 --> 00:10:56.220
Automation is coming for routine pen testing

00:10:56.220 --> 00:10:58.940
first. Automated vulnerability scanning is already

00:10:58.940 --> 00:11:01.559
standard. This creates a truly chilling gap for

00:11:01.559 --> 00:11:03.940
many organizations. Companies with zero security

00:11:03.940 --> 00:11:06.500
testing are severely exposed. They're about to

00:11:06.500 --> 00:11:09.220
face attackers wielding automated AI tools. The

00:11:09.220 --> 00:11:11.740
baseline level of offensive capability is rising

00:11:11.740 --> 00:11:14.419
incredibly fast. What happens to human security

00:11:14.419 --> 00:11:18.000
experts now that AI automates this? Routine vulnerability

00:11:18.000 --> 00:11:20.440
scanning is going to be handed over to AI. It's

00:11:20.440 --> 00:11:22.539
just faster. Human experts will need to focus

00:11:22.539 --> 00:11:25.250
on understanding complex business contexts. Humans

00:11:25.250 --> 00:11:28.029
move to complex judgment calls while AI handles

00:11:28.029 --> 00:11:30.669
the routine scanning. That is the rapidly approaching

00:11:30.669 --> 00:11:32.929
future of the security industry. So what does

00:11:32.929 --> 00:11:34.789
this all mean for us moving forward? We're seeing

00:11:34.789 --> 00:11:37.809
a massive shift from chatbots to active workers.

00:11:38.169 --> 00:11:40.549
AI security is no longer about preventing offensive

00:11:40.549 --> 00:11:43.350
language generation. It's entirely about preventing

00:11:43.350 --> 00:11:46.309
unauthorized action execution. You are defending

00:11:46.309 --> 00:11:48.830
the tools, the databases, and the infrastructure.

00:11:49.470 --> 00:11:52.250
The primary focus must be on zero trust principles

00:11:52.250 --> 00:11:55.750
for all AI agents. You absolutely need strict

00:11:55.750 --> 00:11:58.350
access control at the MCP integration layer.

00:11:58.429 --> 00:12:00.289
You need to thoroughly audit your integrations

00:12:00.289 --> 00:12:02.450
right now. Assume prompts will eventually be

00:12:02.450 --> 00:12:05.429
extracted by clever attackers and run a real

00:12:05.429 --> 00:12:07.990
pentest before users ever touch the product.

00:12:08.230 --> 00:12:10.049
Before we wrap up, I want to leave you with a

00:12:10.049 --> 00:12:12.299
final provocative thought. Let's hear it. If

00:12:12.299 --> 00:12:16.639
AI agents like XBOW and Arachne are autonomously

00:12:16.639 --> 00:12:19.259
finding zero-day vulnerabilities faster than

00:12:19.259 --> 00:12:22.700
humans, how long until we have to rely entirely

00:12:22.700 --> 00:12:25.639
on AI to write the defensive patches? Wow. And

00:12:25.639 --> 00:12:28.500
if an AI is writing the patch to defend against

00:12:28.500 --> 00:12:31.919
an AI attacker, who is really in control of the

00:12:31.919 --> 00:12:34.460
security ecosystem? That is a fascinating question

00:12:34.460 --> 00:12:36.320
to leave off on. Thank you so much for joining

00:12:36.320 --> 00:12:38.399
us on this deep dive. We will catch you next

00:12:38.399 --> 00:12:39.960
time.
