WEBVTT

00:00:00.000 --> 00:00:02.580
If you ask someone on the street for directions

00:00:02.580 --> 00:00:05.419
to the nearest coffee shop, they don't usually

00:00:05.419 --> 00:00:07.559
look at you and say, well, put your left foot

00:00:07.559 --> 00:00:10.699
forward, shift your weight, now flex your knee

00:00:10.699 --> 00:00:12.740
20 degrees. Right, that would be terrifying.

00:00:13.099 --> 00:00:14.900
And, you know, completely exhausting. They just

00:00:14.900 --> 00:00:17.579
say, take a left. It's the brick building. Exactly.

00:00:18.140 --> 00:00:19.940
They give you the layout and they just trust

00:00:19.940 --> 00:00:22.210
your brain to figure out the actual physical

00:00:22.210 --> 00:00:23.829
mechanics of getting there. Yeah, they don't

00:00:23.829 --> 00:00:26.250
micromanage your joints. But for decades, I mean,

00:00:26.309 --> 00:00:29.429
we forced computers to do exactly that. We communicated

00:00:29.429 --> 00:00:32.369
with them using that exhausting foot-by-foot

00:00:32.369 --> 00:00:35.530
method. But today, we are looking at a completely

00:00:35.530 --> 00:00:38.149
different paradigm. So welcome to another deep

00:00:38.149 --> 00:00:40.710
dive specially tailored for you, the learner.

00:00:41.450 --> 00:00:44.009
Today, we're skipping the dense academic textbooks

00:00:44.009 --> 00:00:46.630
and jumping straight into a really fascinating

00:00:46.630 --> 00:00:49.689
source, a comprehensive Wikipedia article on

00:00:49.689 --> 00:00:52.490
logic programming. It's a great piece. It really

00:00:52.490 --> 00:00:55.369
is. And the mission of this deep dive is to extract

00:00:55.369 --> 00:00:58.490
the absolute best nuggets from this text. We're

00:00:58.490 --> 00:01:00.810
going to demystify what logic programming is,

00:01:01.210 --> 00:01:04.590
explore this dramatic history behind it, and

00:01:04.590 --> 00:01:07.409
uncover how it actually mimics human thought

00:01:07.409 --> 00:01:09.870
and human law. Yeah, and the goal today is really

00:01:09.870 --> 00:01:12.250
to look at the why behind the code, not just

00:01:12.250 --> 00:01:15.010
how it works, but why this approach even exists

00:01:15.010 --> 00:01:17.549
and why it matters. OK, let's unpack this. Yeah.

00:01:17.569 --> 00:01:19.230
Because before we can understand why this sparked

00:01:19.230 --> 00:01:21.989
an actual academic war, we need to know how it

00:01:21.989 --> 00:01:24.430
actually works. Right. So the fundamental shift

00:01:24.430 --> 00:01:28.230
here is moving from procedural to declarative

00:01:28.230 --> 00:01:30.390
programming. Declarative, meaning? Meaning you

00:01:30.390 --> 00:01:32.469
don't give the computer step-by-step instructions.

00:01:32.909 --> 00:01:34.329
That foot-by-foot method, that's procedural.

00:01:34.469 --> 00:01:36.989
You're dictating the control flow. But logic

00:01:36.989 --> 00:01:39.200
programming is declarative. You just give the

00:01:39.200 --> 00:01:41.379
computer a set of sentences in logical form,

00:01:41.519 --> 00:01:43.120
facts and rules, and you just let it figure

00:01:43.120 --> 00:01:45.000
out the solution. So you're just describing the

00:01:45.000 --> 00:01:47.659
problem, basically. Exactly. You describe what

00:01:47.659 --> 00:01:50.400
the problem is, not how to solve it. If you look

00:01:50.400 --> 00:01:52.560
at the structure of a rule in this system, it

00:01:52.560 --> 00:01:55.280
looks sort of backwards. You write the conclusion

00:01:55.280 --> 00:01:58.540
first. So let's say conclusion A. Then you write

00:01:58.540 --> 00:02:01.040
a colon and a dash, followed by the conditions.

00:02:01.099 --> 00:02:04.640
Let's call them B and C. Wait, a colon and a

00:02:04.640 --> 00:02:07.140
dash, like a sideways smiley face. Yeah, a lot

00:02:07.140 --> 00:02:09.680
of people see that. But think of that colon and

00:02:09.680 --> 00:02:11.919
dash as the computer's way of saying the word

00:02:11.919 --> 00:02:16.560
if. So it translates to: A is true if B and

00:02:16.560 --> 00:02:19.000
C are true. Oh, I see. Yeah. And A is the head

00:02:19.000 --> 00:02:21.639
of the rule, the goal. B and C are the body.

00:02:21.840 --> 00:02:25.020
And these are known as Horn clauses. Horn clauses.

00:02:25.199 --> 00:02:27.479
Right. Which is just a formal way to describe

00:02:27.479 --> 00:02:30.159
rules that have one clear, definite conclusion.

00:02:30.479 --> 00:02:32.520
And if you just have A with no conditions, that's

00:02:32.520 --> 00:02:34.840
just a fact. Then you just ask the system queries,

00:02:35.060 --> 00:02:36.840
and it does a deduction. OK, let me try to make

00:02:36.840 --> 00:02:39.300
this concrete using the family tree example from

00:02:39.300 --> 00:02:41.039
the source text. Sure, go ahead. So if I tell

00:02:41.039 --> 00:02:44.340
the computer, Charles is William's parent. That's

00:02:44.340 --> 00:02:46.620
a fact, right? Yep, a base fact. And then I give

00:02:46.620 --> 00:02:48.800
it a rule, like a grandparent is a parent of

00:02:48.800 --> 00:02:52.219
a parent. In a normal procedural language, I

00:02:52.219 --> 00:02:54.460
have to write a loop to search the whole database,

00:02:54.560 --> 00:02:57.259
hold it in memory, cross-reference everything.

00:02:57.419 --> 00:02:59.539
Which takes a lot of manual coding, yeah. Right.

00:02:59.900 --> 00:03:01.659
But here, I don't need to write a loop. I just

00:03:01.659 --> 00:03:04.580
ask the query, who is William's grandparent?

00:03:05.180 --> 00:03:07.199
And the computer does the logical heavy lifting.
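
NOTE
Editor's sketch, not part of the episode: a minimal Python illustration of the grandparent rule described in the conversation. Only "Charles is William's parent" comes from the discussion; the second fact and every name in the code are assumptions made for illustration.
```python
# Facts as (parent, child) pairs. The first pair is from the discussion;
# the second is an assumed extra fact so the rule has something to derive.
PARENT = {("charles", "william"), ("elizabeth", "charles")}
def grandparents_of(person):
    # Mirrors the rule: grandparent(X, Z) :- parent(X, Y), parent(Y, Z).
    return {x for (x, y) in PARENT for (y2, z) in PARENT if y == y2 and z == person}
print(grandparents_of("william"))
```
The function only restates the relation; there is no hand-written search strategy beyond joining the facts, which is roughly the point the speakers are making.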

00:03:07.520 --> 00:03:09.840
It just connects the dots. That is the exact

00:03:09.840 --> 00:03:12.520
elegance of it. It searches its database of Horn

00:03:12.520 --> 00:03:16.340
clauses and figures it out. So if it's that elegant...

00:03:16.460 --> 00:03:18.860
I mean, why wasn't this the only way we programmed

00:03:18.860 --> 00:03:21.939
computers from the start? Because it caused a

00:03:21.939 --> 00:03:24.500
massive academic rivalry. Really? Oh, yeah. In

00:03:24.500 --> 00:03:27.479
the late 1960s and early 1970s, the AI community

00:03:27.479 --> 00:03:30.520
was completely split. On one side, you had the

00:03:30.520 --> 00:03:32.819
declarative camp. Which our sources say was mostly

00:03:32.819 --> 00:03:34.460
Stanford and Edinburgh, right? Right. With guys

00:03:34.460 --> 00:03:36.479
like John McCarthy, Cordell Green, Robert Kowalski.

00:03:36.599 --> 00:03:38.800
Exactly. They believed knowledge should be represented

00:03:38.800 --> 00:03:41.039
logically. But on the other side, you had the

00:03:41.039 --> 00:03:45.349
procedural camp, mainly at MIT. Ah, MIT. Marvin

00:03:45.349 --> 00:03:47.889
Minsky and Seymour Papert. Right, and they wanted

00:03:47.889 --> 00:03:50.169
step-by-step control. They looked at pure logic

00:03:50.169 --> 00:03:52.509
and thought it was way too inefficient for real-world

00:03:52.509 --> 00:03:54.789
computing. Wait, so you're telling me

00:03:54.789 --> 00:03:57.169
that one of the most important programming languages

00:03:57.169 --> 00:04:01.689
in AI history, Prolog, basically emerged because

00:04:01.689 --> 00:04:04.789
two rival academic factions couldn't agree on

00:04:04.789 --> 00:04:06.750
whether to give computers rules or instructions.

00:04:06.889 --> 00:04:09.280
That is exactly what happened. The declarative

00:04:09.280 --> 00:04:12.000
camp wanted mathematical purity, and the procedural

00:04:12.000 --> 00:04:14.979
camp wanted control, using things like backtracking,

00:04:15.080 --> 00:04:18.220
where the computer tries a path and rewinds if

00:04:18.220 --> 00:04:20.980
it hits a dead end. Like a maze. Yeah, exactly

00:04:20.980 --> 00:04:24.319
like a maze. But the two sides finally reconciled,

00:04:24.420 --> 00:04:26.899
thanks to Alain Colmerauer and Robert Kowalski.

00:04:26.959 --> 00:04:29.339
This was in Marseille, right? Around 1971 or

00:04:29.339 --> 00:04:32.040
72? Yes. Colmerauer was working on a French

00:04:32.040 --> 00:04:34.439
question answering system. He invited Kowalski

00:04:34.439 --> 00:04:36.959
over, and together they realized you could actually

00:04:36.959 --> 00:04:39.740
execute pure logic as a controlled series of

00:04:39.740 --> 00:04:42.420
steps. They combined the two philosophies. Wow.
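
NOTE
Editor's sketch, not part of the episode: one way to picture logic being executed as a controlled series of steps is a tiny backward-chaining solver that backtracks when a rule body fails. The propositional rule table and the names RULES and prove are hypothetical, and this is far simpler than a real Prolog engine (no variables, no unification).
```python
# Each goal maps to a list of alternative rule bodies; an empty body is a fact.
RULES = {
    "a": [["b", "c"], ["d"]],  # a :- b, c.   a :- d.
    "b": [[]],                 # b.
    "d": [[]],                 # d.
}
def prove(goal, trace):
    # Try each rule for the goal in order; if a body fails, fall through
    # to the next alternative (this is the backtracking step).
    for body in RULES.get(goal, []):
        trace.append((goal, tuple(body)))
        if all(prove(sub, trace) for sub in body):
            return True
    return False
trace = []
print(prove("a", trace))
print(trace)
```
The trace records the dead end at the first rule for "a" (because "c" cannot be proved) before the second rule succeeds, which is the maze-style rewind mentioned in the conversation.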

00:04:42.540 --> 00:04:45.019
And that reconciliation led directly to the creation

00:04:45.019 --> 00:04:48.259
of Prolog in 1972. Which stands for programming

00:04:48.259 --> 00:04:52.220
in logic. You got it. And this whole reconciliation

00:04:52.220 --> 00:04:55.060
led to a really fundamental realization about

00:04:55.060 --> 00:04:57.420
problem solving, which Kowalski actually captured

00:04:57.420 --> 00:05:01.060
in an equation. Algorithm equals logic plus control.

00:05:01.519 --> 00:05:03.860
The equation of the century, some call it. It

00:05:03.860 --> 00:05:06.759
means any algorithm has two parts. The logic

00:05:06.759 --> 00:05:10.139
is the knowledge, the rules and facts, and the

00:05:10.139 --> 00:05:13.319
control is the strategy the machine uses to process

00:05:13.319 --> 00:05:16.779
it. So logic is knowing the recipe for a cake.

00:05:17.019 --> 00:05:19.480
And control is whether you start by gathering

00:05:19.480 --> 00:05:21.839
the ingredients, which the text calls forward

00:05:21.839 --> 00:05:24.720
reasoning, or visualizing the finished cake and

00:05:24.720 --> 00:05:26.879
working backward to find out what you need, which

00:05:26.879 --> 00:05:29.040
is backward reasoning. That's a perfect analogy.

00:05:29.439 --> 00:05:31.720
Backward reasoning is top-down, like Prolog

00:05:31.720 --> 00:05:35.439
uses. It reduces goals down to base facts. Forward

00:05:35.439 --> 00:05:37.879
reasoning is bottom-up, deriving all possible

00:05:37.879 --> 00:05:39.759
facts from the base. Does it really make a big

00:05:39.759 --> 00:05:41.560
difference which one you use? Oh, absolutely.

00:05:41.899 --> 00:05:43.860
The text uses the Fibonacci sequence to show

00:05:43.860 --> 00:05:45.579
this. Oh, right, where each number is the sum

00:05:45.579 --> 00:05:47.680
of the two before it. Yeah. If you use backward

00:05:47.680 --> 00:05:49.740
reasoning to find a large Fibonacci number, the

00:05:49.740 --> 00:05:51.720
computer starts at the top and says, OK, I need

00:05:51.720 --> 00:05:54.920
n minus 1 and n minus 2. And then for n minus

00:05:54.920 --> 00:05:57.319
1, it needs n minus 2 and n minus 3. So it's

00:05:57.319 --> 00:06:00.600
calculating n minus 2 twice. Exactly. It redundantly

00:06:00.600 --> 00:06:03.160
calculates the same numbers over and over. It's

00:06:03.160 --> 00:06:05.319
exponential complexity; the computer will just

00:06:05.319 --> 00:06:08.019
crash. Yikes. But if you just change the control

00:06:08.019 --> 00:06:10.459
to forward reasoning, keeping the exact same

00:06:10.459 --> 00:06:12.839
logic, it starts at 0 and 1 and just adds them

00:06:12.839 --> 00:06:15.620
up linearly. It takes almost no time at all.
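
NOTE
Editor's sketch, not part of the episode: the same Fibonacci definition run with two control strategies. The function names and the call counter are assumptions for illustration; the naive top-down version stands in for backward reasoning without any caching.
```python
calls = 0
def fib_backward(n):
    # Top-down: reduce the goal fib(n) to the subgoals fib(n-1) and fib(n-2),
    # recomputing shared subgoals every time they appear.
    global calls
    calls += 1
    if n < 2:
        return n
    return fib_backward(n - 1) + fib_backward(n - 2)
def fib_forward(n):
    # Bottom-up: start from the base facts 0 and 1 and derive each value once.
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
print(fib_backward(20), "computed with", calls, "calls")
print(fib_forward(20), "computed with 20 additions")
```
Both functions encode the same logic; only the control differs, and the call counter makes the redundant work of the top-down version visible.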

00:06:15.740 --> 00:06:18.139
Wow, so no recomputation. It just walks up the

00:06:18.139 --> 00:06:22.180
sequence. But real life isn't a neat math problem

00:06:22.180 --> 00:06:25.060
with perfect facts. How does it handle missing

00:06:25.060 --> 00:06:27.240
information? That is where we get to a concept

00:06:27.240 --> 00:06:30.420
called negation as failure. Negation as failure.

00:06:30.519 --> 00:06:33.060
Right. In this paradigm, if the computer cannot

00:06:33.060 --> 00:06:36.120
prove a positive condition is true, it just assumes

00:06:36.120 --> 00:06:38.800
it's false. Here's where it gets really interesting.

00:06:39.459 --> 00:06:41.199
Doesn't this mean the computer is just jumping

00:06:41.199 --> 00:06:44.000
to conclusions based on ignorance? Is that safe?

00:06:44.170 --> 00:06:47.430
Well, it makes it a non-monotonic logic, meaning

00:06:47.430 --> 00:06:49.930
conclusions can be withdrawn when new information

00:06:49.930 --> 00:06:52.509
appears. It's highly adaptable. Can we use the

00:06:52.509 --> 00:06:54.490
Tom the thief example from the article? Let's

00:06:54.490 --> 00:06:58.009
do it. So say we have a rule. A person is punished

00:06:58.009 --> 00:07:00.709
if they're a thief unless they are a minor. OK.

00:07:01.089 --> 00:07:04.569
We tell the system, Tom is a thief. It checks

00:07:04.569 --> 00:07:06.790
if he's a minor, finds no evidence of it, so

00:07:06.790 --> 00:07:09.930
it assumes he's not. Verdict. Tom is punished.

00:07:10.230 --> 00:07:12.709
So we jump to that conclusion, but then what

00:07:12.709 --> 00:07:15.569
if we learn he's 15? We input that fact. The

00:07:15.569 --> 00:07:17.850
system immediately withdraws the punishment conclusion

00:07:17.850 --> 00:07:20.470
because now the minor condition is met. The new

00:07:20.470 --> 00:07:23.970
verdict is Tom is rehabilitated. And if we learn

00:07:23.970 --> 00:07:27.089
he's a violent offender? We add a rule that violence

00:07:27.089 --> 00:07:30.410
defeats rehabilitation. The system reevaluates,

00:07:30.550 --> 00:07:32.810
the exception triggers, and the punishment is

00:07:32.810 --> 00:07:35.189
reinstated. It's just arguing with itself. It's

00:07:35.189 --> 00:07:37.230
bouncing between rules and exceptions. Exactly.
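
NOTE
Editor's sketch, not part of the episode: the Tom example with negation as failure modeled as "not found in the known facts". The tuple encoding of facts and the function names are assumptions for illustration.
```python
def rehabilitated(person, facts):
    # Rehabilitation applies to a thief who is a minor, unless known to be violent.
    return (("thief", person) in facts and ("minor", person) in facts
            and ("violent", person) not in facts)
def punished(person, facts):
    # Punishment applies to a thief unless rehabilitation can be shown.
    return ("thief", person) in facts and not rehabilitated(person, facts)
facts = {("thief", "tom")}
print(punished("tom", facts))       # no evidence Tom is a minor
facts.add(("minor", "tom"))
print(punished("tom", facts), rehabilitated("tom", facts))
facts.add(("violent", "tom"))
print(punished("tom", facts))       # the exception to the exception applies
```
Each added fact can flip an earlier conclusion, which is what makes the reasoning non-monotonic.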

00:07:37.310 --> 00:07:39.829
And that exact tendency to jump to conclusions

00:07:39.829 --> 00:07:42.870
and revise them later isn't a flaw. It actually

00:07:42.870 --> 00:07:45.009
perfectly aligns with how the human brain works.

00:07:45.209 --> 00:07:47.290
Really? Yeah. Cognitive scientists like Paul

00:07:47.290 --> 00:07:49.149
Thagard and Keith Stenning, who are cited in

00:07:49.149 --> 00:07:51.610
the source, argue that this is psychologically

00:07:51.610 --> 00:07:54.269
plausible. Because humans use defaults and exceptions

00:07:54.269 --> 00:07:56.860
all the time. Right. If I say bird, you assume

00:07:56.860 --> 00:07:59.199
it flies. If I say penguin, you pull up the exception.

00:07:59.379 --> 00:08:02.019
You don't crash. And Stenning even says this

00:08:02.019 --> 00:08:04.620
kind of backward reasoning has an appealing neural

00:08:04.620 --> 00:08:07.500
implementation. That makes total sense. So instead

00:08:07.500 --> 00:08:10.060
of writing endless if-then loops to cover every

00:08:10.060 --> 00:08:12.620
single edge case, we are basically giving the

00:08:12.620 --> 00:08:15.420
computer a law degree, like feeding it the Civil

00:08:15.420 --> 00:08:17.439
Code and letting it argue with itself to find

00:08:17.439 --> 00:08:19.379
the verdict. That is quite literally what they

00:08:19.379 --> 00:08:21.839
did. In the 1980s, they digitized the British

00:08:21.839 --> 00:08:24.459
Nationality Act using logic programming. Oh,

00:08:24.459 --> 00:08:27.319
wow. And today, Japan has a modern system called

00:08:27.319 --> 00:08:30.740
PROLEG. Right, the Prolog-based legal reasoning

00:08:30.740 --> 00:08:33.940
system. Yes. And it uses roughly 2,500 rules

00:08:33.940 --> 00:08:36.659
from the Japanese Civil Code. It models human

00:08:36.659 --> 00:08:39.299
legal arguments perfectly. But wait, if it's

00:08:39.299 --> 00:08:42.480
this good... Why did it kind of fade out? The

00:08:42.480 --> 00:08:44.360
text talks about the Japanese Fifth Generation

00:08:44.360 --> 00:08:47.519
Computer Systems project, the FGCS, in the '80s.

00:08:47.620 --> 00:08:49.820
Yeah, they poured massive money into trying to

00:08:49.820 --> 00:08:52.200
use concurrent logic programming for parallel

00:08:52.200 --> 00:08:54.700
AI, meaning multiple processors working at the

00:08:54.700 --> 00:08:57.620
same time. But it failed. It did. It turns out,

00:08:58.000 --> 00:09:00.139
forcing logic to commit to choices in parallel

00:09:00.139 --> 00:09:03.159
hardware ruined the mathematical purity. It introduced

00:09:03.159 --> 00:09:06.080
timing bugs. And meanwhile, regular, general-purpose

00:09:06.080 --> 00:09:08.419
computers just got incredibly fast.

00:09:08.779 --> 00:09:10.840
So standard procedural programming just outpaced

00:09:10.840 --> 00:09:13.500
it in raw speed. Exactly. It couldn't compete.

00:09:13.820 --> 00:09:16.299
But the paradigm didn't die, right? Because the

00:09:16.299 --> 00:09:18.220
source text goes into how it splintered into

00:09:18.220 --> 00:09:22.440
these modern flavors. Like, uh... Datalog. Right.

00:09:22.519 --> 00:09:25.139
Datalog is huge for databases. Standard relational

00:09:25.139 --> 00:09:28.139
databases queried with plain SQL struggle with recursive queries

00:09:28.139 --> 00:09:30.820
like finding an ancestor 15 generations back.

00:09:30.940 --> 00:09:32.639
Because it doesn't know how many times to loop.
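
NOTE
Editor's sketch, not part of the episode: recursive ancestor rules evaluated bottom-up until nothing new can be derived, which is one simple (naive) way a Datalog-style engine can avoid knowing the recursion depth in advance. The family facts and the function name are hypothetical.
```python
def ancestors(parent_facts):
    # ancestor(X, Y) :- parent(X, Y).
    # ancestor(X, Z) :- ancestor(X, Y), parent(Y, Z).
    derived = set(parent_facts)
    while True:
        new = {(x, z) for (x, y) in derived
               for (y2, z) in parent_facts if y == y2} - derived
        if not new:
            return derived  # fixpoint reached: no rule adds anything further
        derived |= new
chain = {("ann", "bob"), ("bob", "cat"), ("cat", "dan")}
print(sorted(ancestors(chain)))
```
The loop stops on its own once a fixpoint is reached, however many generations the data happens to contain.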

00:09:33.159 --> 00:09:36.340
Exactly. But Datalog handles recursive rules

00:09:36.340 --> 00:09:39.580
natively and efficiently. And then there's ASP,

00:09:39.740 --> 00:09:42.320
answer set programming. The text describes it

00:09:42.320 --> 00:09:45.960
as a generate-and-test method. Yes. ASP is amazing

00:09:45.960 --> 00:09:48.899
for constraint problems. Take map coloring. You

00:09:48.899 --> 00:09:51.559
need to color a map red, green, or blue, but

00:09:51.559 --> 00:09:53.500
no touching countries can be the same color.

00:09:53.759 --> 00:09:55.500
Procedurally, that's a nightmare of checking

00:09:55.500 --> 00:09:58.279
every single border. But in ASP, you just generate

00:09:58.279 --> 00:10:00.759
every possible color combination for the universe

00:10:00.759 --> 00:10:03.399
and then apply a constraint test to filter out

00:10:03.399 --> 00:10:05.700
any maps where touching countries share a color.
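
NOTE
Editor's sketch, not part of the episode: a brute-force Python rendering of the generate-and-test idea for map coloring. The four regions and their borders are hypothetical, and real answer set solvers search far more cleverly than enumerating every combination.
```python
from itertools import product
COLORS = ("red", "green", "blue")
REGIONS = ("a", "b", "c", "d")
BORDERS = {("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")}
def colorings():
    # Generate: every assignment of a color to each region.
    for combo in product(COLORS, repeat=len(REGIONS)):
        assignment = dict(zip(REGIONS, combo))
        # Test: discard assignments where neighbours share a color.
        if all(assignment[x] != assignment[y] for x, y in BORDERS):
            yield assignment
valid = list(colorings())
print(len(valid), "valid colorings, for example", valid[0])
```
The constraint is written once as a test over candidate solutions rather than as a border-by-border procedure.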

00:10:05.840 --> 00:10:07.899
Generate the universe, then destroy the ones

00:10:07.899 --> 00:10:10.419
that break the rules. That is wild. It's incredible.

00:10:10.379 --> 00:10:13.200
powerful. But the last one mentioned is inductive

00:10:13.200 --> 00:10:16.700
logic programming or ILP, and this one flips

00:10:16.700 --> 00:10:19.960
the script completely. It does. ILP is a form

00:10:19.960 --> 00:10:22.360
of machine learning. Instead of giving it the

00:10:22.360 --> 00:10:24.559
rules, you give it the facts and the answers,

00:10:24.720 --> 00:10:26.860
positive and negative examples, and you make

00:10:26.860 --> 00:10:29.500
the machine learn the rules. If inductive logic

00:10:29.500 --> 00:10:32.580
programming is about inducing the general rules

00:10:32.580 --> 00:10:34.679
from a bunch of positive and negative examples,

00:10:35.340 --> 00:10:38.039
isn't that essentially exactly how you or I learned

00:10:38.039 --> 00:10:40.620
what a dog or a car was when we were toddlers?

00:10:40.919 --> 00:10:42.980
It's exactly the same mechanism. You show a kid

00:10:42.980 --> 00:10:46.259
a dog, say dog. Show a cat, say not dog. Eventually,

00:10:46.480 --> 00:10:48.500
they induce the rule of what a dog is. That's

00:10:48.500 --> 00:10:51.179
fascinating. And Stuart Russell, a major AI researcher,

00:10:51.299 --> 00:10:53.419
notes in the text that this kind of concept invention

00:10:53.419 --> 00:10:57.539
is key to reaching true human-level AI. Wow.

00:10:58.379 --> 00:11:01.399
We have covered so much ground today, from the simple

00:11:01.399 --> 00:11:03.639
idea of just giving computers declarative rules,

00:11:04.080 --> 00:11:06.820
to the academic turf wars at MIT and Stanford,

00:11:07.259 --> 00:11:09.879
all the way to modeling the Japanese civil code

00:11:09.879 --> 00:11:12.600
and mimicking toddler cognition. So what does

00:11:12.600 --> 00:11:14.019
this all mean? Where does this leave us today?

00:11:14.240 --> 00:11:15.919
Well, it leaves us with a really important thought

00:11:15.919 --> 00:11:18.779
to mull over. Right now, our modern AI, like

00:11:18.779 --> 00:11:21.340
massive neural networks, they're incredibly powerful,

00:11:21.639 --> 00:11:23.940
but they're completely unpredictable black boxes.

00:11:24.179 --> 00:11:26.830
Right, they just... guess based on statistics.

00:11:27.370 --> 00:11:30.389
Exactly. So if logic programming so closely mirrors

00:11:30.389 --> 00:11:33.690
human cognitive rules, defaults, and the legal

00:11:33.690 --> 00:11:36.750
system, might the future of AI require us to

00:11:36.750 --> 00:11:38.950
combine our modern neural networks with this

00:11:38.950 --> 00:11:41.509
classic rule-based logic programming? Just to

00:11:41.509 --> 00:11:44.429
keep the AI's reasoning explainable. Explainable

00:11:44.429 --> 00:11:47.019
and actually aligned with human law, blending

00:11:47.019 --> 00:11:49.220
the statistical guessing with an actual logical

00:11:49.220 --> 00:11:51.720
law degree. Love that. Well, to you, the learner:

00:11:52.019 --> 00:11:53.559
Thank you so much for joining us on this deep

00:11:53.559 --> 00:11:55.519
dive. Stay curious, keep questioning how things

00:11:55.519 --> 00:11:56.500
work, and we'll see you next time.