WEBVTT

00:00:00.000 --> 00:00:03.220
Imagine trying to, like, actually bottle human

00:00:03.220 --> 00:00:06.879
intuition, just capturing the exact thought process

00:00:06.879 --> 00:00:10.160
of a world-class doctor diagnosing some rare

00:00:10.160 --> 00:00:13.380
blood disease. Or a master chemist, you know,

00:00:13.500 --> 00:00:15.699
deducing the structure of an unknown organic

00:00:15.699 --> 00:00:18.719
molecule from just a massive pile of raw data.

00:00:18.940 --> 00:00:21.320
Yeah, exactly. Taking those rapid subconscious

00:00:21.320 --> 00:00:23.679
leaps of logic that happen inside an expert's

00:00:23.679 --> 00:00:26.879
brain, freezing them into code, and then letting

00:00:26.879 --> 00:00:29.079
a machine just replicate that exact brilliance

00:00:29.079 --> 00:00:31.179
on command. And doing it decades before anyone

00:00:31.179 --> 00:00:33.420
had ever even heard of deep learning or massive

00:00:33.420 --> 00:00:35.320
language models. Right. Well, welcome to the

00:00:35.320 --> 00:00:37.380
deep dive. Today we are exploring the history,

00:00:37.759 --> 00:00:40.039
the mechanics, and the kind of hidden legacy

00:00:40.039 --> 00:00:42.500
of expert systems. Yeah. If you're a learner

00:00:42.500 --> 00:00:44.679
who's obsessing over modern artificial neural

00:00:44.679 --> 00:00:46.700
networks, you really need to hear this because

00:00:46.700 --> 00:00:49.780
we are looking at what is arguably the first

00:00:49.780 --> 00:00:53.259
truly successful form of AI software. It's just

00:00:53.259 --> 00:00:55.039
a fascinating journey. We're going to look at

00:00:55.039 --> 00:00:57.359
how early computer scientists pioneered this

00:00:57.359 --> 00:00:59.719
concept, why it completely revolutionized the

00:00:59.719 --> 00:01:02.359
corporate world in the 1980s, and well, whether

00:01:02.359 --> 00:01:04.780
these systems actually died out during the infamous

00:01:04.780 --> 00:01:07.560
AI winter. Or if they just went undercover. Exactly.

00:01:07.980 --> 00:01:10.239
Okay, let's unpack this. Before we get into the

00:01:10.239 --> 00:01:12.280
corporate espionage and the AI winter and all

00:01:12.280 --> 00:01:14.459
that, we really need to understand the anatomy

00:01:14.459 --> 00:01:17.439
of these mechanical minds. The core concept here

00:01:17.439 --> 00:01:20.650
is emulating human decision-making using

00:01:20.650 --> 00:01:23.250
if-then rules, right, instead of conventional procedural

00:01:23.250 --> 00:01:26.310
programming, right? Yeah, so how does that architecture

00:01:26.310 --> 00:01:28.430
actually work under the hood? So fundamentally

00:01:28.430 --> 00:01:31.689
an expert system is divided into two main subsystems.

00:01:31.689 --> 00:01:33.430
First you have what's called the knowledge base.

00:01:33.549 --> 00:01:35.870
This is where all the facts and the rules about

00:01:35.870 --> 00:01:38.709
the specific world are stored. Okay, and early

00:01:38.709 --> 00:01:42.579
on, these were just, you know, flat standalone assertions.

00:01:42.939 --> 00:01:46.420
But as the systems grew, they evolved into

00:01:46.420 --> 00:01:49.659
object-oriented concepts. So think of classes, subclasses,

00:01:49.859 --> 00:01:51.799
instances. Wait, hold on. Unpack that

00:01:51.799 --> 00:01:53.959
object-oriented shift for me. Why do they need classes

00:01:53.959 --> 00:01:56.239
instead of just like a big list of facts? Well,

00:01:56.280 --> 00:01:58.920
because a flat list takes up massive amounts

00:01:58.920 --> 00:02:01.519
of memory and processing power. Oh, sure. If

00:02:01.519 --> 00:02:03.939
you have to read a specific separate rule stating

00:02:03.939 --> 00:02:06.599
that a sedan has four wheels and then another

00:02:06.599 --> 00:02:09.379
rule that a truck has four wheels and a van has

00:02:09.379 --> 00:02:11.280
four wheels. You're just wasting space. You're

00:02:11.280 --> 00:02:14.379
wasting so much space. By creating a vehicle

00:02:14.379 --> 00:02:18.240
class with the baseline rule of has four wheels,

00:02:18.599 --> 00:02:21.400
you can just create subclasses that inherit those

00:02:21.400 --> 00:02:23.759
traits automatically. Right. It made the knowledge

00:02:23.759 --> 00:02:26.819
base infinitely more efficient and, critically,

00:02:26.819 --> 00:02:29.319
scalable. OK, so the knowledge base holds the

00:02:29.319 --> 00:02:31.960
rules and the facts. What's the second subsystem,

00:02:32.120 --> 00:02:34.460
then? That would be the inference engine. This

00:02:34.460 --> 00:02:37.020
is the automated reasoning system. It basically

00:02:37.020 --> 00:02:38.800
looks at the current state of the knowledge base,

00:02:39.319 --> 00:02:41.560
applies the relevant rules to the known facts,

00:02:41.960 --> 00:02:44.580
and deduces entirely new facts. It sounds like,

00:02:44.919 --> 00:02:47.080
well, if I'm picturing this, traditional code

00:02:47.080 --> 00:02:49.020
is like a recipe you have to follow step by step,

00:02:49.159 --> 00:02:51.139
right? Yeah, imperative instructions. Right.

00:02:51.610 --> 00:02:53.849
But an expert system is more like a detective

00:02:53.849 --> 00:02:56.509
walking into a room full of clues and just applying

00:02:56.509 --> 00:02:59.349
logic. Or, actually, maybe it's like a giant

00:02:59.349 --> 00:03:01.909
Plinko board. A Plinko board? Yeah, like the

00:03:01.909 --> 00:03:03.810
knowledge base forms all the pegs, the facts,

00:03:03.870 --> 00:03:06.490
and the rules. And the inference engine is the

00:03:06.490 --> 00:03:08.710
puck dropping through. Every time it hits a peg,

00:03:09.129 --> 00:03:12.689
it's forced down a specific if-then branch until

00:03:12.689 --> 00:03:14.750
it eventually lands in a conclusion at the bottom.

00:03:14.960 --> 00:03:17.740
That is actually a really brilliant way to visualize

00:03:17.740 --> 00:03:20.419
it. And that puck, the inference engine, can

00:03:20.419 --> 00:03:22.259
actually drop through the board in two different

00:03:22.259 --> 00:03:24.719
ways, forward chaining and backward chaining.

00:03:24.919 --> 00:03:27.300
Okay, break those down for me. Let's use a classic

00:03:27.300 --> 00:03:29.500
logic puzzle from the source material, the Socrates

00:03:29.500 --> 00:03:33.509
example. In forward chaining, the system is totally

00:03:33.509 --> 00:03:36.710
data -driven. You feed it a fact, like Socrates

00:03:36.710 --> 00:03:39.449
is a man, the inference engine hits the rule

00:03:39.449 --> 00:03:42.990
peg that says, if man, then mortal, and it asserts

00:03:42.990 --> 00:03:46.210
a new conclusion. Socrates is mortal. So it moves

00:03:46.210 --> 00:03:48.330
forward from the data to the conclusion. Data

00:03:48.330 --> 00:03:50.949
in, conclusion out. Very straightforward. Exactly.
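
NOTE
A minimal sketch of the forward chaining described in the cues above, assuming a hypothetical encoding
in which facts are (predicate, subject) pairs and each rule is a (premise, conclusion) pair.
None of these names come from the episode or from a real expert system shell.
```python
# Hypothetical fact and rule encodings, chosen only for illustration.
def forward_chain(facts, rules):
    """Apply if-then rules to the known facts until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            # Look at every known fact whose predicate matches the rule's premise.
            for predicate, subject in list(known):
                derived = (conclusion, subject)
                if predicate == premise and derived not in known:
                    known.add(derived)
                    changed = True
    return known
facts = {("man", "Socrates")}
rules = [("man", "mortal")]  # "if man, then mortal"
print(sorted(forward_chain(facts, rules)))
```
The loop is data-driven: it starts from what is known and keeps asserting conclusions until nothing new appears.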

00:03:51.030 --> 00:03:52.689
But backward chaining sounds like the system

00:03:52.689 --> 00:03:55.889
is working in reverse. Wait, I'm actually a bit

00:03:55.889 --> 00:03:57.870
confused. How does the system even know what

00:03:57.870 --> 00:03:59.750
goal to start with? Does the user have to prompt

00:03:59.750 --> 00:04:03.650
it? Yes, backward chaining is goal -driven. The

00:04:03.650 --> 00:04:05.909
user, or maybe another part of the program, starts

00:04:05.909 --> 00:04:09.449
with a specific question, like, is Socrates mortal?

00:04:09.810 --> 00:04:12.569
The system looks at its rules and says, well,

00:04:12.750 --> 00:04:15.169
to prove he is mortal, I first need to establish

00:04:15.169 --> 00:04:18.110
if he is a man. So it queries its knowledge base.

00:04:18.269 --> 00:04:20.370
And if that fact isn't already recorded, the

00:04:20.370 --> 00:04:22.509
system does something incredibly powerful. What

00:04:22.509 --> 00:04:25.470
does it do? It basically pauses its processing,

00:04:25.790 --> 00:04:28.529
generates a prompt, and actively interrogates

00:04:28.529 --> 00:04:31.629
the user. It will literally ask, is Socrates

00:04:31.629 --> 00:04:35.269
a man? That is wild. It realizes what it doesn't

00:04:35.269 --> 00:04:38.189
know and actively questions you to fill in the

00:04:38.189 --> 00:04:40.269
blank so it can solve the puzzle. Right. And

00:04:40.269 --> 00:04:42.870
the source mentions a massive advantage to this

00:04:42.870 --> 00:04:45.069
rigid rule structure, the explanation facility.
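
NOTE
A minimal sketch of the goal-driven backward chaining described in the preceding cues, including the step
where the system asks the user for a missing fact. The RULES table, the ask callback and scripted_user
are hypothetical names introduced here for illustration, not part of any real system.
```python
# Hypothetical rule format: each goal maps to the premise that would prove it.
RULES = {"mortal": "man"}  # to prove "mortal", first establish "man"
def backward_chain(goal, subject, facts, ask):
    """Try to prove (goal, subject), querying the user through ask for unknown base facts."""
    if (goal, subject) in facts:
        return True
    if goal in RULES:
        # Reduce the goal to the sub-goal named in the rule's premise.
        if backward_chain(RULES[goal], subject, facts, ask):
            facts.add((goal, subject))
            return True
        return False
    # No rule concludes this goal and it is not recorded, so interrogate the user.
    if ask(f"Is {subject} a {goal}?"):
        facts.add((goal, subject))
        return True
    return False
def scripted_user(question):
    """Stand-in for an interactive prompt; prints the question and answers yes."""
    print(question)
    return True
known = set()
print(backward_chain("mortal", "Socrates", known, scripted_user))
```
Starting from the question "is Socrates mortal?", the sketch works back to the unknown fact and asks about it.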

00:04:45.389 --> 00:04:47.949
Oh yes, the why factor. This is huge. Because

00:04:47.949 --> 00:04:49.949
the system is bouncing down that Plinko board

00:04:49.949 --> 00:04:52.089
of explicit if-then rules, it can trace its

00:04:52.089 --> 00:04:54.470
own steps backward. If it diagnoses a problem

00:04:54.470 --> 00:04:57.689
and you ask it why, it doesn't spit out some

00:04:57.689 --> 00:05:00.370
opaque mathematical probability. It translates

00:05:00.370 --> 00:05:03.449
its logic trail into natural English. It will

00:05:03.449 --> 00:05:06.850
reply: I concluded this because Rule 42 states

00:05:06.850 --> 00:05:10.170
all men are mortal, and Fact B states Socrates

00:05:10.170 --> 00:05:13.379
is a man. I mean, if it can explain itself and

00:05:13.379 --> 00:05:15.740
build that kind of trust with the user, it must

00:05:15.740 --> 00:05:17.800
have jumped out of the lab pretty quickly. Who

00:05:17.800 --> 00:05:20.319
actually trusted this enough to put it into practice

00:05:20.319 --> 00:05:22.740
back then? Well, to put it in context, in the

00:05:22.740 --> 00:05:25.660
late 1950s and 60s, researchers were trying to

00:05:25.660 --> 00:05:28.060
build diagnostic systems, especially in medicine.

00:05:28.139 --> 00:05:30.420
But they were hitting walls using traditional

00:05:30.420 --> 00:05:32.639
flow charts or statistical pattern matching.

00:05:32.980 --> 00:05:35.019
Those early attempts were just simply too rigid.

00:05:35.100 --> 00:05:38.839
Right. Then around 1965, Edward Feigenbaum, he's

00:05:38.839 --> 00:05:40.889
often called the father of expert

00:05:41.449 --> 00:05:43.569
and his team at Stanford introduced the formal

00:05:43.569 --> 00:05:46.750
concept. And Feigenbaum had this really vital

00:05:46.750 --> 00:05:49.269
insight. Really was. He realized that the true

00:05:49.269 --> 00:05:51.470
power of an intelligent system comes from the

00:05:51.470 --> 00:05:53.730
specific knowledge it possesses, not just from

00:05:53.730 --> 00:05:56.629
general problem-solving math. Knowing a lot

00:05:56.629 --> 00:05:59.189
about a highly specific domain is better than

00:05:59.189 --> 00:06:01.790
being generally smart but totally ignorant of

00:06:01.790 --> 00:06:03.990
the details. Exactly. And they proved it with

00:06:03.990 --> 00:06:07.399
a system called MYCIN. Oh yeah, the sources detail

00:06:07.399 --> 00:06:11.079
how they built MYCIN to diagnose infectious blood

00:06:11.079 --> 00:06:14.160
diseases. But it wasn't just using basic true

00:06:14.160 --> 00:06:17.800
or false logic, was it? Not at all. MYCIN introduced

00:06:17.800 --> 00:06:20.060
certainty factors. Because in medicine, things

00:06:20.060 --> 00:06:22.259
are rarely absolute, right? Yeah, of course.

00:06:22.420 --> 00:06:24.899
A doctor might observe a symptom and be, say,

00:06:25.100 --> 00:06:29.220
70% sure it indicates a specific bacteria. MYCIN

00:06:29.220 --> 00:06:31.519
allowed the rules to incorporate that fuzzy logic.

00:06:31.660 --> 00:06:33.300
That's fascinating. It would ask the physician

00:06:33.300 --> 00:06:36.459
questions about the patient, calculate the compounding

00:06:36.459 --> 00:06:38.540
probabilities of different infections based on

00:06:38.540 --> 00:06:41.620
those certainty factors, and then recommend an

00:06:41.620 --> 00:06:44.040
antibiotic therapy. Wow. And it was actually

00:06:44.040 --> 00:06:48.339
outperforming doctors at Stanford in the 1970s. The 1970s.
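
NOTE
A minimal sketch of how two pieces of supporting evidence might be compounded with certainty factors.
The episode does not give MYCIN's arithmetic; the formula below is the commonly cited combination rule
for two positive certainty factors, and the function name and numbers are hypothetical.
```python
def combine_positive(cf_a, cf_b):
    """Combine two supporting certainty factors, each in the range 0..1."""
    if not (0.0 <= cf_a <= 1.0 and 0.0 <= cf_b <= 1.0):
        raise ValueError("this sketch only handles supporting evidence in [0, 1]")
    # The second factor closes part of the gap the first one left open.
    return cf_a + cf_b * (1.0 - cf_a)
# Hypothetical numbers: two independent rules point at the same organism.
combined = combine_positive(0.7, 0.4)
print(round(combined, 2))
```
Each extra supporting rule raises the combined belief but can never push it past 1.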

00:06:48.360 --> 00:06:50.839
And it wasn't just medicine. The source lists

00:06:50.839 --> 00:06:53.720
Dendral identifying organic molecules in chemistry

00:06:53.720 --> 00:06:57.800
and GARVAN-ES1, which was the first medical

00:06:57.800 --> 00:07:00.620
expert system used for daily routine diagnosis

00:07:00.620 --> 00:07:02.459
internationally in Australia, right? Yes, running

00:07:02.459 --> 00:07:05.680
on 661 rules, actually. That is incredible. But

00:07:05.680 --> 00:07:07.639
the one that really caught my eye was the APES

00:07:07.639 --> 00:07:09.819
system. They used a programming language called

00:07:09.819 --> 00:07:12.360
Prolog to encode the British Nationality Act

00:07:12.360 --> 00:07:15.240
of 1981. Yeah, that's a famous one. They literally

00:07:15.240 --> 00:07:18.160
turned international law into executable code.

00:07:19.079 --> 00:07:22.439
But why use Prolog? Why not just write it in

00:07:22.439 --> 00:07:25.379
standard languages like C or COBOL that everyone

00:07:25.379 --> 00:07:28.149
in the business world was already using? So standard

00:07:28.149 --> 00:07:31.069
languages like C or COBOL are imperative. You

00:07:31.069 --> 00:07:33.550
have to give the computer step -by -step instructions,

00:07:33.850 --> 00:07:36.490
go to this memory address, add two, move to the

00:07:36.490 --> 00:07:39.389
next step. Right, very procedural. Exactly. Prolog

00:07:39.389 --> 00:07:41.750
is declarative. It is literally a language built

00:07:41.750 --> 00:07:44.610
on logic statements. You declare a truth, like

00:07:44.610 --> 00:07:47.189
a parent of a parent is a grandparent. OK. You

00:07:47.189 --> 00:07:49.209
don't tell the computer how to find the grandparent.

00:07:49.389 --> 00:07:51.730
You just define the logic. And the language's

00:07:51.730 --> 00:07:54.110
built-in inference engine figures out the rest.
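
NOTE
A minimal sketch of the declarative idea above. In Prolog the rule would read roughly
grandparent(G, C) :- parent(G, P), parent(P, C).
The Python below only imitates that by joining a relation with itself; the family names are hypothetical.
```python
# Hypothetical family facts, written as (parent, child) pairs.
PARENT = {("Alice", "Bob"), ("Bob", "Carol"), ("Bob", "Dave")}
def grandparents():
    """A parent of a parent is a grandparent: join the parent relation with itself."""
    return {(g, c) for (g, p) in PARENT for (q, c) in PARENT if p == q}
print(sorted(grandparents()))
```
The rule states what a grandparent is; nothing in it says which fact to look up first.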

00:07:54.350 --> 00:07:56.990
It made encoding complex legal statutes just

00:07:56.990 --> 00:07:59.350
incredibly natural. Okay, so we have systems

00:07:59.350 --> 00:08:01.350
outperforming doctors and parsing international

00:08:01.350 --> 00:08:04.750
law. By the 1980s, two-thirds of Fortune 500

00:08:04.750 --> 00:08:06.689
companies were applying this technology in their

00:08:06.689 --> 00:08:09.850
daily business. But I have to ask, if these systems

00:08:09.850 --> 00:08:13.029
are essentially just giant, complex webs of

00:08:13.029 --> 00:08:15.470
if-then rules, were they actually intelligent or

00:08:15.470 --> 00:08:17.709
were they just incredibly fast filing cabinets

00:08:17.709 --> 00:08:20.259
cross-referencing data? What's fascinating here

00:08:20.259 --> 00:08:22.120
is how the source answers that, specifically

00:08:22.120 --> 00:08:25.199
with a story from 1982 about a software program

00:08:25.199 --> 00:08:28.920
called SID, Synthesis of Integral Design. Right,

00:08:28.920 --> 00:08:32.379
the SID controversy. Yeah. SID was an expert

00:08:32.379 --> 00:08:35.000
system written in Lisp, and it was tasked with

00:08:35.000 --> 00:08:38.759
designing CPU logic gates for the VAX 9000 computer.

00:08:39.399 --> 00:08:41.460
OK. Human expert logic designers input a set

00:08:41.460 --> 00:08:44.639
of foundational rules. But SID took those rules

00:08:44.639 --> 00:08:47.539
and expanded on them. It generated logic synthesis

00:08:47.539 --> 00:08:49.779
routines that were vastly more complex than the

00:08:49.779 --> 00:08:51.860
original inputs. It beat the masters at their

00:08:51.860 --> 00:08:54.019
own game. Yeah. It ended up designing, what,

00:08:54.220 --> 00:08:57.519
93% of the CPU's logic gates? Yes. And the design

00:08:57.519 --> 00:08:59.740
actually exceeded the capabilities of the human

00:08:59.740 --> 00:09:01.659
experts who wrote the rules in the first place.

00:09:01.679 --> 00:09:04.620
Yeah. But the ending of that story is crazy.

00:09:05.320 --> 00:09:07.120
The source notes that the program was highly

00:09:07.120 --> 00:09:09.539
controversial. And despite its massive success,

00:09:09.639 --> 00:09:12.600
the human logic designers terminated the AI after

00:09:12.600 --> 00:09:14.259
the project was finished. They pulled the plug.

00:09:14.559 --> 00:09:18.090
They did. It brings up questions of ego, certainly,

00:09:18.429 --> 00:09:22.009
but also trust. When a machine's output is so

00:09:22.009 --> 00:09:24.450
complex that the humans can't fully grasp how

00:09:24.450 --> 00:09:26.870
it optimized everything, even if it technically

00:09:26.870 --> 00:09:29.649
works flawlessly, people get really uncomfortable.

00:09:29.730 --> 00:09:32.690
It felt alien to them. Completely alien. So if

00:09:32.690 --> 00:09:35.330
SID is outperforming the masters and Fortune

00:09:35.330 --> 00:09:38.470
500s are eating this up, why did it hit a wall?

00:09:40.309 --> 00:09:42.360
Because, spoiler alert for you listening, you

00:09:42.360 --> 00:09:44.620
don't hear the phrase expert system much anymore.

00:09:45.039 --> 00:09:47.440
We don't see massive server farms dedicated to

00:09:47.440 --> 00:09:49.820
Prolog logic puzzles. No, we don't. The collapse

00:09:49.820 --> 00:09:52.259
really came down to the fatal flaws of the

00:09:52.259 --> 00:09:55.779
if-then architecture itself. The first major roadblock

00:09:55.779 --> 00:09:58.799
was what researchers called the knowledge acquisition

00:09:58.799 --> 00:10:02.100
problem. The human bottleneck. Exactly. I mean,

00:10:02.100 --> 00:10:03.620
think about your own job for a second if you're

00:10:03.620 --> 00:10:05.960
listening to this. How hard would it be to sit

00:10:05.960 --> 00:10:07.879
down and write out every single subconscious

00:10:07.879 --> 00:10:10.580
decision you make in a day without ever contradicting

00:10:10.580 --> 00:10:12.659
yourself? It's nearly impossible. You would have

00:10:12.659 --> 00:10:14.320
a nervous breakdown trying to map out your own

00:10:14.320 --> 00:10:17.299
intuition. And to build these systems, companies

00:10:17.299 --> 00:10:19.820
had to hire these specialized knowledge engineers

00:10:19.820 --> 00:10:23.480
whose entire job was to extract these intuitive

00:10:23.480 --> 00:10:26.980
rules from domain experts. But these experts

00:10:26.980 --> 00:10:30.240
are, by definition, highly valued and incredibly

00:10:30.240 --> 00:10:33.360
busy people. Getting hours of their time and

00:10:33.360 --> 00:10:35.500
then trying to translate their gut feelings into

00:10:35.500 --> 00:10:39.179
rigid logic trees, it was agonizingly slow and

00:10:39.179 --> 00:10:41.620
incredibly expensive. And even if you could extract

00:10:41.620 --> 00:10:44.559
all those rules, the computers themselves started

00:10:44.559 --> 00:10:46.539
choking on them, didn't they? The math simply

00:10:46.539 --> 00:10:49.159
caught up with the ambition. Yeah, the scaling

00:10:49.159 --> 00:10:51.519
nightmare. Researchers started envisioning the

00:10:51.519 --> 00:10:54.120
ultimate expert system with, like, 100 million

00:10:54.120 --> 00:10:56.600
rules. Wow. But they hit a hard mathematical

00:10:56.600 --> 00:10:59.779
wall. When you have that many rules, verifying

00:10:59.779 --> 00:11:01.799
that they don't contradict each other turns into

00:11:01.799 --> 00:11:05.200
a massive computational issue known as the SAT

00:11:05.200 --> 00:11:08.500
problem, Boolean Satisfiability. OK, let's break

00:11:08.500 --> 00:11:11.330
that down practically. Why is checking for contradictions

00:11:11.330 --> 00:11:13.889
so hard for a computer? I mean, it's just

00:11:13.889 --> 00:11:15.710
cross-referencing text, right? Well, it's not just

00:11:15.710 --> 00:11:18.509
text. It's a combinatorial explosion. Imagine

00:11:18.509 --> 00:11:20.549
you have a system with a hundred thousand rules.

00:11:21.049 --> 00:11:24.610
If you introduce one new rule, you have to ensure

00:11:24.610 --> 00:11:27.070
it doesn't create some hidden paradox with rule

00:11:27.070 --> 00:11:31.009
number 4000. Oh, right. To verify absolute consistency

00:11:31.009 --> 00:11:33.090
across hundreds of interconnected variables,

00:11:33.570 --> 00:11:36.129
the computer has to check every single possible

00:11:36.129 --> 00:11:39.289
combination of true and false states. So the

00:11:39.289 --> 00:11:41.610
search space just grows exponentially. It's like

00:11:41.610 --> 00:11:44.210
a butterfly effect of logic. Precisely. It's

00:11:44.210 --> 00:11:46.870
2 to the power of n. So if you have just 10 variables,

00:11:46.990 --> 00:11:49.350
that's over a thousand combinations. Which is

00:11:49.350 --> 00:11:51.620
fine for a computer. Right. But once you reach

00:11:51.620 --> 00:11:54.179
hundreds of variables, the number of combinations

00:11:54.179 --> 00:11:56.620
quickly exceeds the number of atoms in the known

00:11:56.620 --> 00:12:00.419
universe. Wow. It literally paralyzes the processor.
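
NOTE
A minimal sketch of why checking a rule set for contradictions blows up, assuming a naive brute-force
checker that tries every true/false assignment. The function name and the two toy rules are hypothetical;
the 10^80 figure is a commonly quoted rough estimate for atoms in the observable universe.
```python
from itertools import product
def is_consistent(variables, constraints):
    """Brute force: True if some true/false assignment satisfies every constraint."""
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(check(assignment) for check in constraints):
            return True
    return False
# Hypothetical rule set: "if man then mortal", then a new rule asserting man and not mortal.
variables = ["man", "mortal"]
rules = [lambda a: (not a["man"]) or a["mortal"]]
contradiction = rules + [lambda a: a["man"] and not a["mortal"]]
print(is_consistent(variables, rules))
print(is_consistent(variables, contradiction))
# The loop above visits up to 2**n assignments for n variables.
print(2 ** 10)
print(2 ** 300 > 10 ** 80)
```
With 10 variables the search covers 1024 assignments; with 300 it would exceed that atom estimate.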

00:12:01.100 --> 00:12:04.399
Massive systems became fundamentally computationally

00:12:04.399 --> 00:12:06.379
unworkable. And on top of that, they suffered

00:12:06.379 --> 00:12:09.159
from severe overfitting, right? Yes. Because

00:12:09.159 --> 00:12:12.100
the system only knows exactly what is explicitly

00:12:12.100 --> 00:12:14.759
described in its knowledge base, it struggles

00:12:14.759 --> 00:12:17.840
to handle novel situations. If a case comes in

00:12:17.840 --> 00:12:21.080
that is just slightly outside the pre-programmed

00:12:21.080 --> 00:12:24.139
rules, the system doesn't have, like, common sense

00:12:24.139 --> 00:12:27.039
to fall back on. It just breaks. And then you

00:12:27.039 --> 00:12:29.960
add all of that to what the source calls integration

00:12:29.960 --> 00:12:33.019
woes. You have these incredible AI systems built

00:12:33.019 --> 00:12:37.159
in exotic languages like Lisp and Prolog running

00:12:37.159 --> 00:12:40.580
on highly specialized, very expensive Lisp machines.

00:12:41.019 --> 00:12:43.799
Which corporate IT departments absolutely hated.

00:12:44.080 --> 00:12:45.620
I can imagine. They were sitting there running

00:12:45.620 --> 00:12:48.000
their entire massive businesses on imperative

00:12:48.000 --> 00:12:51.340
languages like COBOL on IBM mainframes, and then

00:12:51.340 --> 00:12:55.639
1981 hits, the IBM PC is introduced, and suddenly

00:12:55.639 --> 00:12:58.360
we have this huge paradigm shift to the

00:12:58.360 --> 00:13:00.399
client-server model. Corporate IT departments wanted

00:13:00.399 --> 00:13:03.259
AI on their affordable PCs and standard servers.

00:13:03.539 --> 00:13:05.779
They didn't want it on weird, isolated hardware

00:13:05.779 --> 00:13:07.779
that couldn't even talk to their legacy databases.

00:13:08.039 --> 00:13:10.519
It was just totally incompatible with the business

00:13:10.519 --> 00:13:12.860
infrastructure of the time. So between the human

00:13:12.860 --> 00:13:15.179
bottlenecks, the mathematical limits of the SAT

00:13:15.179 --> 00:13:17.720
problem, and the IT departments just rejecting

00:13:17.720 --> 00:13:20.340
the hardware, we entered the AI winter of the

00:13:20.340 --> 00:13:23.519
1990s. The term expert system practically vanished

00:13:23.519 --> 00:13:25.919
from the IT lexicon. It really did. Which brings

00:13:25.919 --> 00:13:27.700
us to the ultimate question of this deep dive.

00:13:28.279 --> 00:13:30.580
Did this technology actually die, or is it just

00:13:30.580 --> 00:13:33.639
in disguise? Well, the source outlines two totally

00:13:33.639 --> 00:13:35.820
opposite interpretations of what happened next.

00:13:36.059 --> 00:13:39.179
Theory one is the pessimistic view. Expert systems

00:13:39.179 --> 00:13:42.259
failed. They simply couldn't deliver on the overhyped

00:13:42.259 --> 00:13:45.259
promises of the 1980s. The scaling problems were

00:13:45.259 --> 00:13:47.600
totally insurmountable and the tech world basically

00:13:47.600 --> 00:13:49.940
just cut its losses and moved on. But theory

00:13:49.940 --> 00:13:52.860
two is the stealth resurrection. Yes. They didn't

00:13:52.860 --> 00:13:55.039
die at all. They were victims of their own massive

00:13:55.039 --> 00:13:58.039
success. The IT world just swallowed them whole.

00:13:58.279 --> 00:14:01.360
Exactly. As IT professionals really grasped the

00:14:01.360 --> 00:14:03.399
concept of separating the rules from the inference

00:14:03.399 --> 00:14:06.299
engine, they stopped treating them as magical

00:14:06.299 --> 00:14:08.500
standalone artificial intelligence. Right. The

00:14:08.500 --> 00:14:10.679
mystique wore off. Yeah. They just became another

00:14:10.679 --> 00:14:13.460
standard tool in the software toolbox. Major

00:14:13.460 --> 00:14:17.259
vendors like SAP, Siebel, and Oracle quietly

00:14:17.259 --> 00:14:19.899
integrated expert system abilities right into

00:14:19.899 --> 00:14:22.019
their standard business. application suites.

00:14:22.220 --> 00:14:24.039
And they became known as business rule engines.

00:14:24.279 --> 00:14:27.080
Exactly. Business rule engines. So they migrated

00:14:27.080 --> 00:14:30.840
from being sci-fi AI in a Stanford lab to becoming

00:14:30.840 --> 00:14:33.960
the invisible plumbing of corporate logic. I

00:14:33.960 --> 00:14:36.879
mean, they are still out there right now, quietly

00:14:36.879 --> 00:14:40.039
automating business processes, approving mortgages,

00:14:40.279 --> 00:14:42.980
and running diagnostics behind the scenes. They

00:14:42.980 --> 00:14:45.600
are. And if we connect this to the bigger picture,

00:14:46.360 --> 00:14:49.519
the definition of AI simply shifted. How so?

00:14:50.190 --> 00:14:52.730
Modern artificial intelligence, like the machine

00:14:52.730 --> 00:14:54.809
learning models everyone is obsessed with today,

00:14:55.210 --> 00:14:57.110
operates very differently from these classic

00:14:57.110 --> 00:15:00.289
expert systems. Modern AI, like recurrent neural

00:15:00.289 --> 00:15:03.389
networks, relies on big data and feedback mechanisms.

00:15:04.129 --> 00:15:06.049
Instead of a human trying to write a million

00:15:06.049 --> 00:15:08.350
rigid rules, you just feed the neural network

00:15:08.350 --> 00:15:10.789
millions of examples and it learns to recognize

00:15:10.789 --> 00:15:13.009
the patterns itself. It generalizes far better

00:15:13.009 --> 00:15:15.210
than if-then rules ever could. Tremendously

00:15:15.210 --> 00:15:17.230
better. There is a great example of this in the

00:15:17.230 --> 00:15:19.110
source material regarding voice recognition.

00:15:19.450 --> 00:15:22.009
Early researchers built an expert system called

00:15:22.009 --> 00:15:24.870
Hearsay. Ah, yes. They tried to recognize speech

00:15:24.870 --> 00:15:27.289
using explicit if-then rules about audio phones.

00:15:27.429 --> 00:15:30.110
And it struggled immensely. Because human speech

00:15:30.110 --> 00:15:33.629
is noisy, it is messy, it relies heavily on context.

00:15:34.049 --> 00:15:36.289
You cannot capture the difference between a Boston

00:15:36.289 --> 00:15:38.769
accent and a Texas accent with a strict set of

00:15:38.769 --> 00:15:41.669
logic rules. It's too fuzzy. Right. Hearsay

00:15:41.669 --> 00:15:43.929
proved that interpretation problems, like messy

00:15:43.929 --> 00:15:46.629
pattern matching, were far better suited for

00:15:46.629 --> 00:15:49.049
neural networks than rule-based expert systems.

00:15:48.970 --> 00:15:52.970
So what does this all mean? If we zoom out, we

00:15:52.970 --> 00:15:56.250
see this incredible historical arc. Expert systems

00:15:56.250 --> 00:15:59.309
may not be the shiny face of AI anymore, but

00:15:59.309 --> 00:16:02.629
they completely pioneered the idea that capturing

00:16:02.629 --> 00:16:05.389
human knowledge was inherently valuable. Yes,

00:16:05.389 --> 00:16:07.429
they absolutely did. They pushed the computing

00:16:07.429 --> 00:16:09.610
world forward, they paved the way for

00:16:09.610 --> 00:16:12.009
client-server environments, and they still quietly

00:16:12.009 --> 00:16:14.330
run the business logic of the world today. But

00:16:14.330 --> 00:16:16.309
they also offer something that we have largely

00:16:16.309 --> 00:16:19.210
lost in the modern AI boom. What's that? Today

00:16:19.210 --> 00:16:22.309
we live in a world of complex black box AI. When

00:16:22.309 --> 00:16:24.990
a neural network denies your loan or misidentifies

00:16:24.990 --> 00:16:27.909
a medical scan and you ask it why, it cannot

00:16:27.909 --> 00:16:29.950
really tell you. No, it just points to billions

00:16:29.950 --> 00:16:32.269
of statistical weights scattered across a massive

00:16:32.269 --> 00:16:35.799
matrix. Exactly. The absolute transparency of

00:16:35.799 --> 00:16:38.440
the old expert system's explanation facility,

00:16:38.679 --> 00:16:41.539
that ability to reply in plain English, I decided

00:16:41.539 --> 00:16:44.240
this because of Rule A and Fact B. That is something

00:16:44.240 --> 00:16:47.580
we actually deeply miss today. We traded transparency

00:16:47.580 --> 00:16:50.220
for scale. We got the answers, but we lost the

00:16:50.220 --> 00:16:52.200
explanation. Which brings up a fascinating thought

00:16:52.200 --> 00:16:54.740
to leave you with. When an old expert system

00:16:54.740 --> 00:16:57.220
made a terrible mistake, you could trace the

00:16:57.220 --> 00:16:59.779
exact rule that caused it, find the human knowledge

00:16:59.779 --> 00:17:01.940
engineer who wrote it, and hold them accountable.

00:17:02.220 --> 00:17:06.460
But today, when a black box neural network hallucinates

00:17:06.460 --> 00:17:08.740
a legal precedent or misdiagnoses a patient,

00:17:09.220 --> 00:17:12.970
the logic is entirely invisible. As these modern

00:17:12.970 --> 00:17:15.190
systems take over our most critical infrastructure,

00:17:15.849 --> 00:17:18.049
next time you make a complex decision at work,

00:17:18.490 --> 00:17:21.089
ask yourself, could your thought process be broken

00:17:21.089 --> 00:17:23.930
down into a massive tree of if-then rules? Probably

00:17:23.930 --> 00:17:26.970
not. And if it can't, what exactly is the secret

00:17:26.970 --> 00:17:28.849
ingredient of human intuition that a machine

00:17:28.849 --> 00:17:31.670
can't capture? And when the machine gets it wrong,

00:17:32.089 --> 00:17:35.130
without being able to explain why, who exactly

00:17:35.130 --> 00:17:36.210
do we hold accountable?
