WEBVTT

00:00:03.517 --> 00:00:10.517
This is the Convergent Science Network podcast. Leading researchers in the domain

00:00:10.517 --> 00:00:16.797
of neuroscience, brain theory and technology are interviewed by Paul Verschure and Tony Prescott.

00:00:19.457 --> 00:00:25.297
This is Paul Verschure with Tony Prescott and the Convergent Science Network podcast.

00:00:26.077 --> 00:00:30.997
Today, we're here with Gary Marcus, who's one of the speakers at our BCBT summer school.

00:00:32.757 --> 00:00:40.037
Gary, you really tried to come up with an alternative view on how we can think

00:00:40.037 --> 00:00:45.037
about, if you want, cortical computation, contrasting your proposal to this

00:00:45.037 --> 00:00:46.877
notion of a canonical microcircuit.

00:00:47.397 --> 00:00:51.297
Right. So what

00:00:51.297 --> 00:00:54.037
do you find problematic with this notion

00:00:54.037 --> 00:00:57.557
of a canonical microcircuit? And how do we define a canonical microcircuit?

00:00:57.557 --> 00:01:03.517
It looks like a search for the one true ring that will rule them all, and searches

00:01:03.517 --> 00:01:07.397
for one true ring to rule them all usually fail. And I have some specific reasons

00:01:07.397 --> 00:01:11.837
why I'm skeptical about this one. Like I said, at the broadest level, I don't think

00:01:11.837 --> 00:01:14.217
we're going to find one true ring. I don't think we're going to find a silver

00:01:14.217 --> 00:01:16.157
bullet. I think that the brain is really complicated,

00:01:16.317 --> 00:01:20.237
and I think you can see that complication at any level at which you try to analyze

00:01:20.237 --> 00:01:23.577
the brain, whether you're talking about connectivity between different areas

00:01:23.577 --> 00:01:27.697
or whether you're looking at the number of neuron types or all of the different

00:01:27.697 --> 00:01:31.277
proteins that are trafficked in synapses. There's enormous complexity.

00:01:31.777 --> 00:01:35.817
I think that in physics, people seek a kind of parsimony. They're trying to

00:01:35.817 --> 00:01:38.937
find a grand unified theory in a few tiny principles.

00:01:39.157 --> 00:01:43.077
I don't think that that's plausible for biology. There was a quote I used to

00:01:43.077 --> 00:01:44.357
like from Francis Crick.

00:01:44.397 --> 00:01:46.577
I may not get the words exactly right, but it's something like,

00:01:46.617 --> 00:01:52.197
parsimony is a valuable tool in physics.

00:01:52.477 --> 00:01:54.217
In biology, it's a dangerous implement.

00:01:54.837 --> 00:01:57.577
And then I met him once, and I told him that this was my favorite quote.

00:01:57.757 --> 00:02:02.597
And he said, yeah, in physics, we have laws. In biology, there are gadgets.

00:02:04.697 --> 00:02:08.457
And hoping that you're going to find the entire answer in a single gadget is,

00:02:08.457 --> 00:02:12.277
I think, unrealistic. But now, in just celebrating complexity,

00:02:12.517 --> 00:02:14.017
you might not gain understanding either.

00:02:14.357 --> 00:02:18.657
No, I'm not saying that, for example, the right way forward is to just have

00:02:18.657 --> 00:02:22.257
a big massive computer simulation where we don't know any of the operating principles

00:02:22.257 --> 00:02:24.477
and we just sort of worship the complexity.

00:02:24.637 --> 00:02:27.697
I think what we want are intermediate levels of explanation,

00:02:28.117 --> 00:02:33.917
intermediate levels of components that map on to other levels.

00:02:34.715 --> 00:02:38.115
In understanding a computer, you want to understand a series of levels starting

00:02:38.115 --> 00:02:40.715
from the transistor and moving up to AND gates and OR gates,

00:02:40.935 --> 00:02:44.655
microprocessors, operating system, software, etc.

00:02:45.015 --> 00:02:49.015
And I think that we probably want to find something similar in understanding

00:02:49.015 --> 00:02:50.715
probably any biological system.

00:02:50.955 --> 00:02:54.775
There are going to be low-level components that contribute to higher-level components.

00:02:54.775 --> 00:02:57.695
But then what are we trying to explain?

00:02:57.955 --> 00:03:03.475
Is it like, in the Marr sense, some overall functionality that we can push into

00:03:03.475 --> 00:03:07.835
a computational level of description, whose implementation we have to identify?

00:03:08.535 --> 00:03:12.655
Or is it another set of functionalities that will decompose along a different

00:03:12.655 --> 00:03:14.815
set of levels of description?

00:03:15.315 --> 00:03:18.735
Well, I think that Marr's sense is a really good starting place where you want to

00:03:18.735 --> 00:03:23.775
understand the relation between computation and specific algorithms and how

00:03:23.775 --> 00:03:25.515
those specific algorithms are realized.

00:03:25.855 --> 00:03:30.115
And I think you might ask for even more than that, but I think that that's a

00:03:30.115 --> 00:03:31.195
really good starting point.

00:03:31.335 --> 00:03:36.495
I think sometimes in neuroscience, people lose sight of the mapping between these things.

00:03:36.535 --> 00:03:39.295
And I think that that's really what the field should be about. So

00:03:39.295 --> 00:03:42.215
if you're talking about, like, mapping the whole brain, that's a great thing

00:03:42.215 --> 00:03:45.075
to do, but mapping the whole brain isn't by itself a question about

00:03:45.075 --> 00:03:47.775
how that relates to computation, and I think that we

00:03:47.775 --> 00:03:50.655
need to keep our eye on that ball. What we're trying to do is to find,

00:03:50.655 --> 00:03:54.575
let's say, motifs that are represented neurally, that are repeated over and over

00:03:54.575 --> 00:03:59.315
again, and how those connect in turn to the computational. Right, but so let's

00:03:59.315 --> 00:04:04.135
set up the context of the discussion. If we then want to, let's say, declare

00:04:04.135 --> 00:04:08.035
some higher-level functions that we want to explain in implementational terms,

00:04:08.235 --> 00:04:11.095
what are the higher level functions that we need to explain?

00:04:11.850 --> 00:04:16.590
Well, we don't fully know that. I mean, I think we're, I use this metaphor in

00:04:16.590 --> 00:04:19.830
the talk about we're working from two ends of the tunnel and we're trying to meet in the middle.

00:04:19.870 --> 00:04:25.190
The truth is we don't have a well-defined starting point on the higher level cognition side.

00:04:25.350 --> 00:04:28.210
We have guesses. The place where

00:04:28.210 --> 00:04:31.250
we best have answers is, like, we know some very specific

00:04:31.250 --> 00:04:34.230
things about individual channels and things like that

00:04:34.230 --> 00:04:36.990
on the neural side. And then on the other side we

00:04:36.990 --> 00:04:39.690
have some guesses about what's going on in cognition. We don't know for

00:04:39.690 --> 00:04:43.050
sure. I can give you my own set of guesses. So, for example,

00:04:43.050 --> 00:04:47.610
I wrote a book called The Algebraic Mind, and I laid out basically what I would

00:04:47.610 --> 00:04:51.410
take to be the tenets of symbol manipulation and said these are non-negotiable,

00:04:51.410 --> 00:04:54.810
these are things that, as far as I can tell, really have to be part of any theory of

00:04:54.810 --> 00:04:59.650
mind. And that included the ability to represent variables that you can instantiate

00:04:59.650 --> 00:05:01.650
with particular values at particular times,

00:05:01.770 --> 00:05:03.810
to have operations over those variables,

00:05:04.070 --> 00:05:09.330
to have structured representations, so that A, B is not the same as B, A,

00:05:09.650 --> 00:05:14.710
to represent a type-token distinction, and ultimately to represent tree structures.

00:05:14.910 --> 00:05:19.390
Now, I wrote that book almost 15 years ago. I recant one of those claims to

00:05:19.390 --> 00:05:22.710
some degree, which is I don't think that the human mind actually has the ability

00:05:22.710 --> 00:05:24.450
to represent arbitrary tree structures.

00:05:24.510 --> 00:05:28.230
It'd be really handy if we had it and computers make use of that all the time.

00:05:28.350 --> 00:05:31.310
So the directory structure for your files, for example, is a tree structure.

00:05:32.050 --> 00:05:35.710
I don't think that we can do that in unbounded ways. And there's some psycholinguistic

00:05:35.710 --> 00:05:37.810
phenomena that suggest to me

00:05:37.810 --> 00:05:41.350
that we can't, sort of in the mind's eye, apprehend the full sentence.

00:05:41.530 --> 00:05:44.610
It's analogous to these change blindness experiments where, you know,

00:05:44.630 --> 00:05:49.050
you see a parking lot full of soldiers and there's a jet plane, and in between

00:05:49.050 --> 00:05:52.650
frames, with a mask, the jet engine comes and goes, and you don't even notice because

00:05:52.650 --> 00:05:54.470
you're watching the soldiers and the plane and so forth.

00:05:54.590 --> 00:05:57.930
So you have the illusion of representing the image as a whole,

00:05:58.010 --> 00:06:01.230
but we don't really. We represent pieces of it and we can reconstruct some of

00:06:01.230 --> 00:06:02.530
it, hoping that the world is stable.

00:06:02.930 --> 00:06:06.950
I think that in representing sentences, we don't really have the full sentence in our head.

00:06:07.070 --> 00:06:11.690
And so you're subject to kind of linguistic illusions that are like optical illusions.

00:06:11.850 --> 00:06:15.350
Like I can say the sentence, more people have been to Russia than I have.

00:06:15.710 --> 00:06:18.610
And ostensibly it's a reasonable sentence, but you sit there,

00:06:18.710 --> 00:06:20.670
you realize it's what's called ellipsis. Something's missing.

00:06:20.710 --> 00:06:22.310
More people have been to Russia than I have

00:06:22.650 --> 00:06:25.450
been to Russia: that doesn't actually make sense. But you don't immediately

00:06:25.450 --> 00:06:30.730
notice the problematic nature of that sentence because you don't really have

00:06:30.730 --> 00:06:31.910
the full sentence in your head.

00:06:32.555 --> 00:06:37.475
Take another example. If I say it was the boxer that the sailor loved or something

00:06:37.475 --> 00:06:40.315
like that, you can get confused pretty quickly about what the relations are

00:06:40.315 --> 00:06:43.135
between the particular actors in the scene.

00:06:43.295 --> 00:06:47.155
If you really had a full tree structure in your head, you probably wouldn't get confused.

00:06:47.415 --> 00:06:52.295
Instead, what I argued in my book Kluge is that we don't have location-addressable

00:06:52.295 --> 00:06:55.495
memory in our brains the way that computers do. So in a computer,

00:06:55.535 --> 00:06:56.915
you have essentially a set of

00:06:56.915 --> 00:06:59.915
safe deposit boxes that are numbered and you store things in those boxes.

00:07:00.075 --> 00:07:04.315
And once you put something in box 113, you can expect to get it back out again.

00:07:04.675 --> 00:07:08.375
And if you have that, then you can build tree structures in a very straightforward

00:07:08.375 --> 00:07:10.975
way where you just map the boxes onto the trees.

00:07:11.375 --> 00:07:15.475
But if you don't have location-addressable memory, it's very difficult to actually

00:07:15.475 --> 00:07:19.195
represent a tree structure. So I think that one was actually an incorrect claim

00:07:19.195 --> 00:07:21.075
that I made in the Algebraic Mind.

00:07:21.175 --> 00:07:24.575
I stand by the others, and I would say there are other things too.

00:07:24.675 --> 00:07:27.935
I come from a language perspective, but if you were doing vision,

00:07:27.995 --> 00:07:29.235
you could have your own wish list.

00:07:29.335 --> 00:07:34.155
But those would start my wish list. And I would say that type-token is one,

00:07:34.275 --> 00:07:38.215
the distinction between like this bottle of water and bottles of water in general

00:07:38.215 --> 00:07:42.555
is something that you need at some level in vision and that we know surprisingly

00:07:42.555 --> 00:07:44.055
little about how the brain does that.

00:07:44.195 --> 00:07:47.695
But again, I think it's non-negotiable, at least if you have human cognition,

00:07:47.855 --> 00:07:52.035
that's part of what you traffic in is the difference between a kind and a particular instance of a kind.

00:07:52.035 --> 00:07:54.695
Right. So the example that you

00:07:54.695 --> 00:07:57.735
gave of our difficulty in processing these sentences with

00:07:57.735 --> 00:08:00.575
embedded clauses is one of the ones that

00:08:00.575 --> 00:08:05.075
Jeffrey Elman, I think, used to motivate his simple recurrent network, which you

00:08:05.075 --> 00:08:09.515
were quite critical of in your talk. And he said, look, I have a network which

00:08:09.515 --> 00:08:14.215
can generate some things which look a bit like grammar, and it has difficulties

00:08:14.215 --> 00:08:18.815
with particular types of constructions that are like the constructions that

00:08:18.815 --> 00:08:20.375
human minds have difficulty with.

00:08:21.115 --> 00:08:24.895
Now, you felt that the simple recurrent network wasn't powerful enough and that

00:08:24.895 --> 00:08:26.355
there needs to be something more.

00:08:26.555 --> 00:08:31.575
But in your talk, you weren't specific on how we go from the more computer-like

00:08:31.575 --> 00:08:37.555
FPGA-type mechanisms onto something which is neural circuitry.

00:08:37.695 --> 00:08:42.795
So have you got some examples in mind when you think of how that maps onto circuit designs?

00:08:43.315 --> 00:08:46.955
Well, I think we don't know enough to understand, say, a parser as a whole.

00:08:47.095 --> 00:08:50.815
So what Elman was trying to do in that model was to capture a lot about

00:08:50.815 --> 00:08:53.395
both syntax and semantics in one model.

00:08:53.735 --> 00:08:58.115
And I think too much, in fact. So he had a model where you could predict the

00:08:58.115 --> 00:09:01.655
next word in a sentence based on prior context. Yeah.

00:09:02.425 --> 00:09:06.165
Syntax and semantics weren't even explicitly represented. All you had were individual

00:09:06.165 --> 00:09:07.745
words that followed one another.

00:09:08.125 --> 00:09:13.225
And it's true that some of what it did had some superficial similarity to what

00:09:13.225 --> 00:09:16.105
people did, but I think it got it right for the wrong reasons.

00:09:16.245 --> 00:09:19.845
And I think it's possible to break down systems in many ways and think that

00:09:19.845 --> 00:09:21.845
you understood the system.

00:09:21.905 --> 00:09:24.805
But if you look at whether the system as a whole works, well,

00:09:24.805 --> 00:09:28.145
his system as a whole didn't work. There are lots of things about language it didn't really capture.

00:09:28.565 --> 00:09:32.445
And so I think people made more of it than they probably should.

00:09:32.645 --> 00:09:35.565
That's one part of your question. The other part of the question is,

00:09:35.665 --> 00:09:37.345
you know, where do we go from here?

00:09:37.445 --> 00:09:41.145
How do we handle these things? Part of what I'd say is we're not in a position

00:09:41.145 --> 00:09:45.465
to understand a complete parser now or a complete scene segmenter or anything like that.

00:09:45.805 --> 00:09:50.005
What I'm arguing is that we need intermediate units before we can even hope

00:09:50.005 --> 00:09:51.065
to take on that question.

00:09:51.185 --> 00:09:55.905
So I think parsers, for example, have to do some type-token kind of stuff.

00:09:56.025 --> 00:09:58.185
They certainly have to do a lot of variable binding.

00:09:58.825 --> 00:10:01.985
They're constantly doing variable binding. Even if the variable binding system

00:10:01.985 --> 00:10:04.105
itself is not perfect, they're doing a lot of it.

00:10:04.565 --> 00:10:09.405
They're binding syntax and semantics together and particular elements of the

00:10:09.405 --> 00:10:11.405
syntax trying to figure out the roles of things.

00:10:11.625 --> 00:10:15.745
And I don't think we're really going to be able to unravel that circuitry until

00:10:15.745 --> 00:10:19.945
we can at least say recognize the circuitry that underlies a binding and say,

00:10:20.125 --> 00:10:21.705
okay, it's invoked here in this way.

00:10:21.905 --> 00:10:25.105
Otherwise, it's just too unconstrained. I mean, you could think from a computer

00:10:25.105 --> 00:10:30.725
science perspective, having Lisp enables you to do variable binding and to represent tree structures.

00:10:31.025 --> 00:10:34.185
But once you have Lisp, then you can build lots of different parsers,

00:10:34.245 --> 00:10:37.745
starting from this corner or that corner, this kind of search or that kind of search.

00:10:38.005 --> 00:10:42.485
But you couldn't even understand the differences between those if you didn't

00:10:42.485 --> 00:10:45.585
first understand the basic elements, the atoms of the language.

00:10:45.905 --> 00:10:49.345
But before we get to that, Tony, because I think we made a bit of a jump now

00:10:49.345 --> 00:10:53.485
into the modeling exercise that I think we have a lot to discuss there,

00:10:53.485 --> 00:10:58.505
Because, Gary, you took a very specific line of attack, if you want,

00:10:58.685 --> 00:11:03.525
to substantiate this point that we also need to give more room to,

00:11:03.545 --> 00:11:06.365
let's say, a functional level of consideration that could guide,

00:11:06.485 --> 00:11:10.065
let's say, this more neuroscience-oriented exercise, to which I'm very sympathetic.

00:11:11.617 --> 00:11:15.457
Then your approach was to say, well, you know, we ended up with this intuition,

00:11:15.497 --> 00:11:20.277
if you want, in neuroscience of a canonical circuit, in this case, of neocortex.

00:11:20.517 --> 00:11:25.297
But you can also say the same claim would hold for other main structures in the brain.

00:11:25.657 --> 00:11:30.977
And what you wanted to say is, look, this is a misleading construct.

00:11:31.317 --> 00:11:35.117
It's a misleading line of thought in trying to understand the brain.

00:11:36.897 --> 00:11:42.897
So first, how would you define a canonical microcircuit? So what do you consider

00:11:42.897 --> 00:11:44.277
to be a canonical microcircuit?

00:11:44.817 --> 00:11:48.617
Why do you think it had such an impact in the field that you see it now as a

00:11:48.617 --> 00:11:50.957
dominant, let's say, paradigm, if you want?

00:11:51.277 --> 00:11:55.017
And then, of course, question three, what then is wrong with it?

00:11:56.057 --> 00:11:59.437
Well, I'm not going to try to define it. It's not my term, and I don't think

00:11:59.437 --> 00:12:01.517
it's a satisfactory approach.

00:12:01.677 --> 00:12:05.537
But I think the notion is supposed to be that there's one kind of circuit that's

00:12:05.537 --> 00:12:09.677
repeated throughout the cortex, that you'll see many instantiations of it,

00:12:09.697 --> 00:12:13.517
and that experience will tune different instantiations in different ways,

00:12:13.597 --> 00:12:15.017
but that they're all at some level identical.

00:12:15.457 --> 00:12:20.577
I think this comes from a superficial reading of anatomy and also maybe from

00:12:20.577 --> 00:12:25.257
a kind of particular scientific aesthetic, let's say.

00:12:25.377 --> 00:12:31.437
So the superficial reading of anatomy is the cortex is relatively uniform throughout its extent.

00:12:31.657 --> 00:12:35.277
Of course, if you say that in a room full of anatomists, they'll all get exercised

00:12:35.277 --> 00:12:36.537
and say, well, that's not really true.

00:12:36.697 --> 00:12:40.037
I mean, I look at this area and that area and they're very different to me.

00:12:40.437 --> 00:12:44.377
But if you just looked under a magnifying glass and you weren't an expert,

00:12:44.437 --> 00:12:49.437
you might reasonably say Broca's area and occipital cortex, they don't really look that different.

00:12:49.577 --> 00:12:51.597
They don't look as different as you might have a priori expected,

00:12:51.937 --> 00:12:54.837
given how different vision and language turn out to be.

00:12:55.697 --> 00:12:59.677
And so I think it starts from that. Then I think there's an aesthetic.

00:12:59.837 --> 00:13:04.377
I think people are looking for kind of simple principles to explain the brain.

00:13:04.497 --> 00:13:07.977
I don't see any reason to think that we're actually going to get a few simple

00:13:07.977 --> 00:13:10.497
principles to explain the brain, that that's going to be sufficient. But I think

00:13:10.996 --> 00:13:14.536
a lot of people are attracted to those kinds of theories. And this is not something

00:13:14.536 --> 00:13:15.436
unique to neuroscience.

00:13:15.596 --> 00:13:18.696
So people are looking for that in physics. There are actually arguments in physics

00:13:18.696 --> 00:13:19.796
about whether it's plausible.

00:13:20.096 --> 00:13:23.896
In linguistics, Chomsky lately, of all people, has been looking for a kind of

00:13:23.896 --> 00:13:26.236
explanation for linguistics in a few principles.

00:13:26.396 --> 00:13:29.416
I don't think he's gotten a lot of mileage out of it. He's gotten a lot of followers,

00:13:29.556 --> 00:13:32.696
but I don't think that he's made a lot of convincing progress doing that.

00:13:32.876 --> 00:13:35.636
But I see it over and over again in lots of different fields.

00:13:36.076 --> 00:13:39.916
In economics, people start with this model of rational man. They think this

00:13:39.916 --> 00:13:42.276
one principle is going to explain things that it doesn't really.

00:13:42.616 --> 00:13:47.336
Yeah, but wait, there's an issue now, right? I don't think you're really denying the anatomy as such.

00:13:47.496 --> 00:13:51.116
I mean, in that sense, if you do inject in the thalamus, you will see that most

00:13:51.116 --> 00:13:56.876
projections end up in layer four of a six-layered cortex in our case, right?

00:13:57.176 --> 00:14:01.316
Well, I won't deny that there's a common background, let's say, of similarity.

00:14:01.796 --> 00:14:04.956
But a common background of similarity is not the same thing as identity.

00:14:05.736 --> 00:14:08.316
Exactly right. So I think it hinges on that issue of identity,

00:14:08.596 --> 00:14:10.956
right? It does. So you can think like the hand and the foot,

00:14:11.016 --> 00:14:12.336
they're clearly genetically related.

00:14:12.516 --> 00:14:17.136
I used to know the number, like 95% of the genes, maybe this is in another organism,

00:14:17.236 --> 00:14:21.436
but there's a very heavy overlap between the hand and the foot in terms of their genes.

00:14:21.556 --> 00:14:24.876
But there's also some fine grain tuning that makes them do different things.

00:14:25.716 --> 00:14:29.116
And you might expect the same thing for cortical circuits. In some ways,

00:14:29.136 --> 00:14:31.396
my talk is a meditation on that thought.

00:14:31.396 --> 00:14:36.196
I mean, I could have arranged it differently, but if you assume that duplication

00:14:36.196 --> 00:14:38.656
and divergence is kind of the dominant paradigm of evolution,

00:14:38.956 --> 00:14:42.396
you've got some set of genes, and there are copies, and that copy allows you

00:14:42.396 --> 00:14:46.076
to build something new, whether it's a new photoreceptor with a new wavelength

00:14:46.076 --> 00:14:52.116
distribution, or it's a new vertebra, whatever it is, there's lots of duplication and divergence.

00:14:52.116 --> 00:14:56.796
And if you saw that in the cortex, then what you might see is from a distance,

00:14:56.916 --> 00:14:58.116
all this stuff looks the same.

00:14:58.236 --> 00:15:03.336
But if you zeroed in, you might see the kind of tweakings that make a hand different

00:15:03.336 --> 00:15:06.336
from a foot. So what you're saying to conclude this bit then is to say,

00:15:06.456 --> 00:15:09.436
look, we might have a common template.

00:15:09.976 --> 00:15:13.496
But in its ultimate expression, there's a lot of variability.

00:15:13.856 --> 00:15:16.776
And if we want to understand function, we must actually focus on that variability

00:15:16.776 --> 00:15:19.036
as opposed to these common templates.

00:15:19.396 --> 00:15:21.336
Well, or the conjoined function of the two.

00:15:22.116 --> 00:15:25.236
I don't want to ignore the template. I mean, a good example here,

00:15:25.316 --> 00:15:28.956
though, might be like if I gave you a breadboard with transistors and lights

00:15:28.956 --> 00:15:32.056
and so forth, you could build an AND gate or an OR gate.

00:15:32.216 --> 00:15:34.836
Or if I had enough of them, you could build all kinds of gates and you could

00:15:34.836 --> 00:15:36.516
say, well, it's just one breadboard.

00:15:36.676 --> 00:15:42.196
But the logical or functional consequences of a breadboard where you have slightly

00:15:42.196 --> 00:15:43.416
different wires are immense.

00:15:43.656 --> 00:15:47.456
I don't know if you probably are old enough to have played with these old RadioShack

00:15:47.456 --> 00:15:51.336
kits with the springs and the wires and stuff like that. So in some sense,

00:15:51.396 --> 00:15:54.316
you had a template; they would call them, like, the 150-in-one kit.

00:15:54.376 --> 00:15:56.916
You could build a radio or an

00:15:56.916 --> 00:16:01.716
alarm or whatever by just instantiating that template in different ways.

00:16:01.796 --> 00:16:06.276
And all the world's difference in the structural ways led to the functional

00:16:06.276 --> 00:16:07.136
difference. Right, exactly.

00:16:08.043 --> 00:16:12.063
One of the themes of this week has been, because of the complexity that we see

00:16:12.063 --> 00:16:13.423
in an area like neocortex,

00:16:13.663 --> 00:16:19.423
let's look across species and see how other animals' cortex is organized,

00:16:19.643 --> 00:16:25.003
and to see if there are common principles and perhaps extrapolate from the variety

00:16:25.003 --> 00:16:28.763
that we see to what might have been some ancestral form of cortex,

00:16:29.083 --> 00:16:33.763
which I think we all will expect will be simpler and will have less variety.

00:16:33.763 --> 00:16:38.503
And, you know, Jon Kaas was talking about maybe five areas as opposed to,

00:16:38.523 --> 00:16:41.403
you know, dozens of areas in primates.

00:16:41.683 --> 00:16:45.663
So one strategy to address the problem you're describing would be to say,

00:16:45.723 --> 00:16:49.363
let's not try and build a canonical microcircuit based on mouse brain,

00:16:49.503 --> 00:16:54.483
but let's think about what the original first mammal microcircuit might have been.

00:16:54.483 --> 00:16:58.183
And perhaps there were only a small number of different microcircuits in cortex.

00:16:58.563 --> 00:17:02.983
And once we've solved that problem, then we could use it to address problems

00:17:02.983 --> 00:17:04.643
that mammals commonly have.

00:17:04.823 --> 00:17:09.703
Then we can think about how you modify that circuit to get towards human cognition.

00:17:10.003 --> 00:17:11.223
Is that a viable strategy?

00:17:11.623 --> 00:17:15.303
I think so. And you could look at, for example, central pattern generators.

00:17:15.483 --> 00:17:18.603
And a couple of people have made suggestions, like Sten Grillner we were mentioning a minute ago.

00:17:19.323 --> 00:17:23.063
You could look at central pattern generators and ask, just there you could say,

00:17:23.063 --> 00:17:25.303
Has there been duplication and divergence in that?

00:17:25.423 --> 00:17:28.583
Are there different variations on central pattern generators that you could

00:17:28.583 --> 00:17:30.343
use to do different kinds of computations?

00:17:30.883 --> 00:17:33.543
I don't know that anybody's pursued that in any great detail,

00:17:33.663 --> 00:17:36.223
but that's kind of a version of the strategy that you're talking about.

00:17:36.303 --> 00:17:38.863
And you could do that with coupled oscillators.

00:17:39.963 --> 00:17:44.563
There are lots of domains in which you could say, here is an ancestral circuit.

00:17:44.683 --> 00:17:46.323
What's happened to that ancestral circuit?

00:17:46.563 --> 00:17:49.103
I mean, in some sense, people are trying to map out vision that way.

00:17:49.163 --> 00:17:51.723
So we know that there's Pax6 at the top of this cascade.

00:17:52.407 --> 00:17:56.047
And in some creatures, it goes on to guide the construction of a compound eye.

00:17:56.187 --> 00:17:59.867
And in some, it constructs a mammalian eye. And there are these beautiful experiments

00:17:59.867 --> 00:18:03.507
that Walter Gehring did where you take Pax6 from a mouse, you express it in

00:18:03.507 --> 00:18:06.107
a fly, and you get an eye in the fly.

00:18:06.127 --> 00:18:09.287
You put it in the fly's antenna, but it's not a mouse eye, it's a fly's eye.

00:18:09.287 --> 00:18:13.547
And so that says there's a very ancient code here that specifies something like

00:18:13.547 --> 00:18:15.027
build your light sensitive organ here.

00:18:15.167 --> 00:18:19.227
And then, you know, the cascades diverge and people, because there are more accessible

00:18:19.227 --> 00:18:23.547
tools, are starting to really work out, you know, when did this gene change?

00:18:23.727 --> 00:18:27.687
When did these downstream genes change? And I would like to see us do the same in human.

00:18:27.767 --> 00:18:32.847
And the big problem, which Asif mentioned in his talk, is that we're mostly

00:18:32.847 --> 00:18:36.087
limited to kind of comparative methods with people.

00:18:36.207 --> 00:18:39.967
And it turns out that the nearest neighbor species, which is the one you'd think

00:18:39.967 --> 00:18:42.147
you want to study, doesn't have language.

00:18:42.187 --> 00:18:46.067
And language is obviously pivotal in our experience. So it's a bit unfortunate

00:18:46.067 --> 00:18:51.767
from our perspective as scientists that we can't look at chimps and have sort

00:18:51.767 --> 00:18:55.507
of 80% of language there and ask what happened with their 80% of language.

00:18:55.587 --> 00:18:59.047
At some level, it's hard, but at some level, I think that's exactly right,

00:18:59.127 --> 00:19:02.127
that it should be one of the prongs of the research strategy is to try to figure

00:19:02.127 --> 00:19:05.967
out basically a phylogeny of computational or cortical circuit types.

00:19:06.047 --> 00:19:07.167
I think that that's essential.

00:19:07.427 --> 00:19:11.127
In fact, maybe the best we'll be able to do with language is to figure out these

00:19:11.127 --> 00:19:14.387
are the circuits that we share with mammals, or maybe these are the ones we

00:19:14.387 --> 00:19:16.267
share with vertebrates, these are the ones we share with mammals,

00:19:16.387 --> 00:19:19.387
these are the ones with primates, and there are these one or two that are different.

00:19:19.507 --> 00:19:23.127
And how do those one or two that we don't see attested, and we may not even

00:19:23.127 --> 00:19:27.187
be able to see them until we really get the EM under our belts in the right

00:19:27.187 --> 00:19:29.687
way, but eventually we might find one or two that are new.

00:19:30.027 --> 00:19:32.947
And it's really not going to be that language is those one or two,

00:19:33.007 --> 00:19:35.607
but how they work together with the rest of that phylogeny, right?

00:19:35.647 --> 00:19:40.107
Language is gonna put together a whole bunch, in my view, of different cortical

00:19:40.107 --> 00:19:43.947
circuit types that are all maybe going back to one or two common ancestors,

00:19:44.067 --> 00:19:46.607
but are really different versions of those, and that's what we need.

00:19:46.727 --> 00:19:51.407
So now if you try to figure out the properties of this archetypical mammalian

00:19:51.407 --> 00:19:54.507
brain, which actually is the target of our own work.

00:19:55.180 --> 00:19:59.880
Then in some sense, you also showed in your talk that there are certain,

00:20:00.020 --> 00:20:04.600
let's say, theoretical approaches, like the recurrent networks of Elman you

00:20:04.600 --> 00:20:07.200
mentioned and so on, that are not going to get us there, right?

00:20:07.240 --> 00:20:10.740
So you were saying that you were making the point there that in the theoretical

00:20:10.740 --> 00:20:15.380
approaches, at least the ones you presented today, something fundamental is missing.

00:20:15.740 --> 00:20:19.980
So what's that? What's missing there? I mean, for me, the biggest thing that's

00:20:19.980 --> 00:20:21.780
missing is an understanding of variable binding.

00:20:21.780 --> 00:20:27.340
So I think we as a collective field of neuroscientists and computational neuroscientists

00:20:27.340 --> 00:20:31.320
and psychologists have a pretty good grip on hierarchical feature detection.

00:20:31.520 --> 00:20:34.560
We don't know everything that there is to know about it, but we have very good

00:20:34.560 --> 00:20:35.760
reason to think that it happens.

00:20:35.900 --> 00:20:38.660
We have some notion about some of the tricks you might need,

00:20:38.780 --> 00:20:44.040
like divisive normalization and so forth, to make it all work out. We have a grip on it.

00:20:44.060 --> 00:20:47.760
We have some idea of where it might live, how to build computer models of it.

00:20:47.920 --> 00:20:51.460
We don't have anything like that for variable binding. We have a few kind of

00:20:51.460 --> 00:20:54.140
stray voices speaking in the wilderness about it.

00:20:54.180 --> 00:20:57.940
But I think we need it. I think variable binding is as important to higher-level

00:20:57.940 --> 00:21:00.600
cognition as hierarchical feature perception is to vision.

00:21:00.980 --> 00:21:05.780
So you would be saying these hierarchical feature detection approaches might

00:21:05.780 --> 00:21:09.020
be nice if you talk about perception, but if we start to talk about

00:21:09.020 --> 00:21:12.700
higher level cognition or feeding into language, it's not going to follow the

00:21:12.700 --> 00:21:14.000
same principles. Is that right?

00:21:14.180 --> 00:21:17.800
I don't want to overstate that. I think that hierarchical feature perception

00:21:17.800 --> 00:21:20.540
plays a role in, say, speech perception.

00:21:20.820 --> 00:21:24.220
So it's not that language doesn't use this stuff. I'm basically talking about

00:21:24.220 --> 00:21:25.760
like the Hubel and Wiesel ideas here.

00:21:26.080 --> 00:21:29.420
It's not that I don't think that that has any role, but I think there's something else too.

00:21:29.560 --> 00:21:33.880
So it'd be like, I'm not sure what analogy, it'd be like, I'm trying to do math

00:21:33.880 --> 00:21:37.380
and I've already got addition down and addition is great. I'm never going to

00:21:37.380 --> 00:21:40.580
get rid of addition, but I need some multiplication and maybe some square roots too.

00:21:40.700 --> 00:21:43.680
And we don't really know so much about how to do the square roots.

00:21:43.720 --> 00:21:44.780
We have a hint about the multiplication.

00:21:44.900 --> 00:21:47.500
We're really good at the addition, but we need a bit of multiplication.

00:21:48.212 --> 00:21:51.852
We need a bit of a better toolkit if we're going to grasp higher-level cognition,

00:21:52.052 --> 00:21:53.712
not because it's not going to use these other tools.

00:21:53.892 --> 00:21:56.292
I mean, probably every good idea

00:21:56.292 --> 00:22:00.812
in developmental biology toolkits gets exploited in language in some way.

00:22:00.952 --> 00:22:05.972
And one of the findings, I'm not a big fan of fMRI, but one of the big findings

00:22:05.972 --> 00:22:10.392
from fMRI, I think, is that language is really distributed way across the brain.

00:22:10.472 --> 00:22:13.072
It's not just Broca's area, something like you read in a textbook.

00:22:13.292 --> 00:22:15.552
I think language is exploiting all

00:22:15.552 --> 00:22:22.092
kinds of things from theory of mind to hierarchical perception in general.

00:22:22.232 --> 00:22:27.472
But the part of it that we least understand is how we concatenate all the symbols

00:22:27.472 --> 00:22:29.432
together and manipulate them and move them around.

00:22:29.712 --> 00:22:33.552
And that's pretty essential. It's actually a dirty secret to these hierarchical

00:22:33.552 --> 00:22:35.212
models when you talk about perception.

00:22:35.432 --> 00:22:42.312
That is that on the one hand, they scale really badly. Like if you talk about

00:22:42.312 --> 00:22:45.052
perception, you want to have different kinds of invariances,

00:22:45.172 --> 00:22:47.672
position, orientation, scale.

00:22:48.392 --> 00:22:52.112
But that means duplication of wires at a massive scale.

00:22:52.292 --> 00:22:57.792
So that means on anatomical grounds, it's very questionable whether even perceptual

00:22:57.792 --> 00:23:01.292
structures in the cortex can follow such a wiring strategy.

00:23:01.772 --> 00:23:05.572
And there's a second, I think, massive problem if you talk about variable binding,

00:23:05.792 --> 00:23:09.592
which is that ultimately it's a labeled line system, right?

00:23:09.592 --> 00:23:12.872
So I think if you talk about higher level cognition and you think about

00:23:12.872 --> 00:23:16.932
working memory and you think about variable binding, how to achieve that with

00:23:16.932 --> 00:23:21.392
labeled lines, I think it's going to be really a long shot. I think we have to rethink.

00:23:21.672 --> 00:23:26.492
I don't know exactly what you mean by labeled lines, but I would say that for

00:23:26.492 --> 00:23:30.912
me, that's like a moment where you pause and you say, maybe one of my assumptions is wrong.

00:23:30.912 --> 00:23:33.812
I don't know precisely what you mean by the labeled lines, but I would say in

00:23:33.812 --> 00:23:39.072
general that it would be very hard to get from the set of ideas that are floating

00:23:39.072 --> 00:23:43.212
around computational neuroscience to variable binding. So there are two options.

00:23:43.452 --> 00:23:46.892
One is to say variable binding doesn't exist. And a lot of people actually have

00:23:46.892 --> 00:23:47.872
tried to make that argument.

00:23:48.092 --> 00:23:51.212
The other is to say we're missing something and this is a big clue.

00:23:51.372 --> 00:23:52.852
And that's the line that I'm taking. Right.

00:23:53.372 --> 00:23:56.632
The labeled line you can think of, take the Hubel-Wiesel model,

00:23:56.812 --> 00:23:58.332
simple cells to complex cells.

00:23:58.332 --> 00:24:01.492
Every single synapse that defines now

00:24:01.492 --> 00:24:04.552
the complex cell has to be uniquely labeled with

00:24:04.552 --> 00:24:07.772
a certain feature, otherwise your complex cell cannot do this encoding, right?

00:24:07.772 --> 00:24:11.412
So that means that synapse cannot be used for anything else anymore. Now

00:24:11.412 --> 00:24:16.292
you're stuck with it, right? So I do think wires are cheap, and so I'm not completely

00:24:16.292 --> 00:24:19.812
convinced by the first part of your argument. But I think the labeled line stuff

00:24:19.812 --> 00:24:23.712
actually relates to the issue that I'm worried about. So labeled lines are fine

00:24:23.712 --> 00:24:27.652
if you're limited to encountering things that you've seen before.

00:24:28.660 --> 00:24:33.460
You can grow a grandmother node literally for your grandmother through a bunch

00:24:33.460 --> 00:24:36.440
of experience and use that to recognize your grandmother.

00:24:36.520 --> 00:24:39.180
At least, you know, there's some data that suggests you might be able to do that.

00:24:39.280 --> 00:24:42.740
But what about the things that are unfamiliar to us, like the sentences that

00:24:42.740 --> 00:24:43.640
we haven't heard before?

00:24:44.180 --> 00:24:48.220
There, it starts to, just on exponential grounds, it becomes implausible,

00:24:48.240 --> 00:24:50.880
like that you've got a node for my last sentence, right?

00:24:50.940 --> 00:24:54.200
You have to construct something on the fly. You don't have a pre-labeled node

00:24:54.200 --> 00:24:58.600
for my last sentence or probably any of the sentences we said during the course of the podcast.

00:24:58.880 --> 00:25:01.780
You might have labeled nodes for some idioms like kick the bucket.

00:25:01.920 --> 00:25:06.000
You might really have a node in your brain that recognizes that, because it's an idiom.

00:25:06.000 --> 00:25:07.000
It's not compositional.

00:25:07.140 --> 00:25:10.640
You can't figure out that kick the bucket means death from the words kick and

00:25:10.640 --> 00:25:14.160
bucket. And so you may have some of those, but at the same time,

00:25:14.180 --> 00:25:15.140
language is generative.

00:25:15.320 --> 00:25:18.720
And there are some things that you can understand. In fact, a lot of them for

00:25:18.720 --> 00:25:23.740
which a solution that relies on a pre-labeled node is not going to cut it.

00:25:23.780 --> 00:25:27.240
There's got to be a way of constructing a novel representation on the fly.

00:25:27.400 --> 00:25:29.780
And I think we lack understanding of how that works. Exactly.

00:25:31.840 --> 00:25:37.300
The talk very much focused on the cortical microcircuit, but to get the functionality

00:25:37.300 --> 00:25:40.440
that you're looking for, then we might look outside cortex,

00:25:40.740 --> 00:25:45.040
and you might want to combine microcircuit types, assuming that the concept

00:25:45.040 --> 00:25:47.260
has some validity in basal ganglia,

00:25:47.820 --> 00:25:51.540
cerebellum, who knows, amygdala, and together between these different structures,

00:25:51.580 --> 00:25:54.840
which everyone agrees have very different internal architecture,

00:25:55.060 --> 00:25:58.140
we might approach the kind of complexity that you're looking for.

00:25:58.300 --> 00:26:02.180
So yes, there's variability within the cortex, but it's variability on a theme,

00:26:02.280 --> 00:26:05.940
and you get the extra power that you need for something like language by having

00:26:05.940 --> 00:26:09.720
a systems theory of how the brain does something like linguistic cognition.

00:26:10.160 --> 00:26:12.780
Well, first of all, I totally agree we want a systems theory.

00:26:12.900 --> 00:26:17.160
And I think you're right to point out that in the talk I didn't give enough

00:26:18.190 --> 00:26:21.970
attention to these other systems. So I think you're exactly right.

00:26:22.050 --> 00:26:24.210
The cortex doesn't work on its own, right?

00:26:24.310 --> 00:26:26.630
I mean, if you get rid of the subcortical things, for example,

00:26:26.670 --> 00:26:28.790
the cortex is not going to be able to do its usual work.

00:26:29.050 --> 00:26:34.030
It's clear that the way that the cortex works is in interaction with the rest of the brain.

00:26:34.170 --> 00:26:39.070
And I think you're also right that some of what I'm calling differences in circuit

00:26:39.070 --> 00:26:42.670
types may have to do with basically how those other resources are accessed.

00:26:42.730 --> 00:26:46.170
So maybe two things that I want to call different circuits,

00:26:46.290 --> 00:26:50.130
what they really chiefly vary in is in how much they're calling these other

00:26:50.130 --> 00:26:52.830
systems and what they're doing with those other systems.

00:26:52.930 --> 00:26:56.410
So I'm very sympathetic with that. And I think I just did a poor job of discussing it.

00:26:56.450 --> 00:26:59.650
There's a written but not published version where we did a little bit better

00:26:59.650 --> 00:27:02.190
job of at least pointing to that possibility.

00:27:02.410 --> 00:27:07.030
So I'm completely sympathetic to it. But I think probably if there's any difference

00:27:07.030 --> 00:27:11.150
between me and you, it's that I think some of it's going to be at the level

00:27:11.150 --> 00:27:14.650
of like the difference between an AND and an OR gate that's actually local.

00:27:15.570 --> 00:27:18.830
I totally agree that it's not going to be all local. It's going to be really

00:27:18.830 --> 00:27:22.970
important stuff that isn't local that has to do with long range projections and so forth.

00:27:23.050 --> 00:27:27.370
And I agree that I didn't emphasize that enough in the lecture.

00:27:27.410 --> 00:27:30.390
But I also think that there are probably going to be some important differences

00:27:30.390 --> 00:27:35.690
within the immediate circuits as well. So one of the questions is how those

00:27:35.690 --> 00:27:36.610
differences come about.

00:27:36.770 --> 00:27:40.950
And the other theme this week has been not just comparing species,

00:27:41.110 --> 00:27:43.770
but also thinking about development within a species.

00:27:44.030 --> 00:27:48.550
And that was an aspect of your talk, because I think that you would agree that

00:27:48.550 --> 00:27:52.130
at some point in development, there is something relatively homogenous,

00:27:52.150 --> 00:27:56.750
and then we get heterogeneity increasing across cortex.

00:27:56.830 --> 00:28:02.730
And at some point, there's no going back. This sort of equipotentiality of cortex ceases to happen.

00:28:03.410 --> 00:28:08.810
And I think one slide that I left out, by the way, is just stuff from the Allen

00:28:08.810 --> 00:28:13.610
Institute showing that gene expression differs from front to back and that the

00:28:13.610 --> 00:28:16.710
gene expression is more similar the closer you are in cortex.

00:28:16.830 --> 00:28:19.370
And that's one clue. And there are many other clues we need to put together.

00:28:19.650 --> 00:28:23.070
But that's one clue that says there is some differentiation here that's important.

00:28:23.070 --> 00:28:26.270
And it's important to realize you don't need a lot of genes to be different

00:28:26.270 --> 00:28:28.250
in order to build something really different.

00:28:28.390 --> 00:28:31.490
So you can get sickle cell anemia from one nucleotide change,

00:28:31.750 --> 00:28:34.550
you could make the difference between an AND gate and an OR gate with,

00:28:34.590 --> 00:28:36.570
you know, probably a single gene or less.

00:28:36.710 --> 00:28:40.310
I'm not sure we literally have those things, but I think that relatively small

00:28:40.310 --> 00:28:45.270
numbers of genes can differ against a background of many shared genes and lead

00:28:45.270 --> 00:28:47.110
to significant differences.

00:28:47.610 --> 00:28:51.310
Yeah, I think, I mean, we previously had Terrence Deacon as a speaker

00:28:51.310 --> 00:28:55.470
for BCBT, and he gave a talk which would have been very appropriate this week

00:28:55.470 --> 00:28:57.490
on this idea of relaxed selection,

00:28:57.810 --> 00:29:01.850
that there could be sort of junk DNA there which is floating around and it could

00:29:01.850 --> 00:29:03.650
be recruited for something like language

00:29:05.137 --> 00:29:09.977
in order to create new functionality very rapidly. And it might even be recruited

00:29:09.977 --> 00:29:15.097
without necessarily having a mutation. It could be recruited by some epigenetic

00:29:15.097 --> 00:29:16.877
mechanism, as we also heard this week.

00:29:16.997 --> 00:29:21.257
So what you could maybe clarify

00:29:21.297 --> 00:29:25.957
a bit more is how much you think that these developmental processes that

00:29:25.957 --> 00:29:28.937
are creating these qualitatively different circuits,

00:29:29.097 --> 00:29:31.037
how much of that is

00:29:31.277 --> 00:29:34.237
not requiring any external input, and how much

00:29:35.237 --> 00:29:38.757
would that be experience-dependent, given what we know about the importance

00:29:38.757 --> 00:29:42.657
of culture and motherese and all these things for something like language?

00:29:42.657 --> 00:29:43.897
I think it's hard to put a number on it.

00:29:43.977 --> 00:29:47.437
I think that the right way to think about it is that the molecular constraints

00:29:47.437 --> 00:29:50.557
and the activity dependence are both critically important and they play together.

00:29:51.357 --> 00:29:55.557
So you can't really say, I mean, like people give these heritability numbers

00:29:55.557 --> 00:29:58.577
or something. They say IQ is, you know, 70% heritable.

00:29:58.697 --> 00:30:02.217
All that really means is we can correlate this much of the variation with the

00:30:02.217 --> 00:30:06.417
genes. The molecular processes themselves don't really work that way.

00:30:06.477 --> 00:30:11.577
They're not separable in a way that you can kind of partial out variance in a meaningful way.

00:30:12.057 --> 00:30:15.357
I think the molecular processes are really important and must be understood,

00:30:15.457 --> 00:30:17.917
and the activity processes are really important and must be understood.

00:30:18.217 --> 00:30:22.837
I mentioned some alternative splicing mechanisms that might actually integrate

00:30:22.837 --> 00:30:24.957
these things, and I think we should be looking for that too.

00:30:25.577 --> 00:30:29.497
I, in my talks, tend to emphasize the molecular at the expense of the activity

00:30:29.497 --> 00:30:31.357
almost as a political thing.

00:30:31.477 --> 00:30:36.397
So I think that most of the field pays attention to the activity-driven part

00:30:36.397 --> 00:30:40.517
of the equation and doesn't pay as much attention to the molecular side of things.

00:30:40.637 --> 00:30:45.297
So in computational neuroscience, except in a couple of small areas like topographic maps

00:30:45.756 --> 00:30:49.256
where people actually know what the molecules are, people kind of ignore the

00:30:49.256 --> 00:30:52.976
molecular contributions, and they focus on the activity.

00:30:53.176 --> 00:30:56.536
And I don't doubt that the activity is really, really important. I mean, that's obvious.

00:30:57.176 --> 00:31:01.816
But I do doubt that we can get a complete model without having some grasp on

00:31:01.816 --> 00:31:04.636
the developmental molecular mechanisms and how they might shape,

00:31:04.696 --> 00:31:08.696
say, two patches of cortical tissue to be subtly but importantly different,

00:31:08.796 --> 00:31:11.576
such that they respond differently, ultimately, to that

00:31:11.576 --> 00:31:15.196
activity. I think part of the reason people are enthusiastic

00:31:15.196 --> 00:31:19.176
about activity-dependent generation of

00:31:19.176 --> 00:31:22.276
structure is that you can perhaps more likely automate

00:31:22.276 --> 00:31:25.996
it. So if you get the right powerful learning algorithm, and this of course was what was

00:31:25.996 --> 00:31:30.856
so seductive and exciting about connectionist models and, I think, more recently

00:31:30.856 --> 00:31:35.036
deep learning, the idea is that if the learning algorithm is right, the system

00:31:35.036 --> 00:31:39.276
will just build itself given the activity. And the alternative,

00:31:39.336 --> 00:31:40.036
which you're suggesting,

00:31:40.196 --> 00:31:44.476
does imply for the people who want to model computationally what's going on,

00:31:44.516 --> 00:31:48.496
that we're going to have to understand and build more of the infrastructure

00:31:48.496 --> 00:31:52.836
before those kinds of activity-dependent learning systems can take off.

00:31:53.076 --> 00:31:57.416
I think you're exactly right at multiple levels. So one is, I think you're right about the appeal.

00:31:57.596 --> 00:32:00.856
I mean, it would be great if we could have one algorithm that ruled them all.

00:32:00.936 --> 00:32:06.496
That would make everybody's life easier. It'd be much easier to build much better AI, for example.

00:32:07.516 --> 00:32:10.336
But at the same time, just because it's easier doesn't mean that it's right.

00:32:11.034 --> 00:32:14.094
I think that if you look at what machine learning, for example,

00:32:14.094 --> 00:32:16.794
has been able to do well and what it hasn't been able to do well,

00:32:16.974 --> 00:32:21.054
there are domains where bottom-up learning just doesn't seem to do the trick.

00:32:21.234 --> 00:32:24.894
So we are still struggling with common sense reasoning. We are still struggling

00:32:24.894 --> 00:32:26.734
with natural language understanding.

00:32:26.934 --> 00:32:31.014
We've got machines that can read license plates very well. It's kind of a variation

00:32:31.014 --> 00:32:33.014
on the Hubel and Wiesel kind of stuff,

00:32:33.114 --> 00:32:35.914
but we don't have machines that can understand discourse.

00:32:36.034 --> 00:32:40.254
Something else I'm involved in now is trying to have a successor to the Turing test.

00:32:40.474 --> 00:32:45.014
And one of the proposals that I made is that we have a comprehension test where

00:32:45.014 --> 00:32:48.234
you ask a machine to watch a video and then answer questions like,

00:32:48.274 --> 00:32:51.334
why did Walter White take out a hit on Jesse or something like that.

00:32:51.394 --> 00:32:53.714
Stuff that would be easy for any 14-year-old.

00:32:53.814 --> 00:32:58.114
But so far, computers don't know how to do that, to watch some general kind

00:32:58.114 --> 00:33:00.794
of scene and understand what's going on. They could label things.

00:33:00.894 --> 00:33:03.894
They could say that's a person and that's a person, but they couldn't necessarily

00:33:03.894 --> 00:33:05.654
tell you much about the scene.

00:33:05.814 --> 00:33:10.734
And that's because I think people haven't been investing the time to get the

00:33:10.734 --> 00:33:12.934
basis structure from which you can do the learning.

00:33:12.934 --> 00:33:17.154
Another way I think about this, and I see you want to jump in,

00:33:17.234 --> 00:33:20.774
but the way I think about it is that a lot of work in developmental robotics

00:33:20.774 --> 00:33:22.554
basically starts with a blank slate.

00:33:22.754 --> 00:33:24.894
And I don't think that work has gotten us that far.

00:33:25.174 --> 00:33:29.554
The intuition that you'd like a robot that's embodied, I think, is a very good one.

00:33:29.814 --> 00:33:33.854
But I think that people shy away from what is fairly hard work,

00:33:33.974 --> 00:33:37.574
I think, to build the right basis set so that you can then go out and learn.

00:33:37.774 --> 00:33:41.534
But now, we shouldn't sell computational neuroscience that short,

00:33:41.634 --> 00:33:45.354
right? Because yes, even with the interest in activity-dependent processes,

00:33:45.594 --> 00:33:49.854
these are often studied on the basis of some defined circuit.

00:33:50.314 --> 00:33:55.474
And implicitly, that means that's the contribution of these Evo-Devo processes

00:33:55.474 --> 00:33:59.434
that give you this template, if you want, on which an activity-dependent process

00:33:59.434 --> 00:34:02.494
sculpts, if you want, the ultimate functional circuit.

00:34:03.774 --> 00:34:05.494
Well, yes, but,

00:34:06.623 --> 00:34:09.503
like, the fundamental intuition, I think, or

00:34:09.503 --> 00:34:12.723
result from Evo-Devo is about conservation and

00:34:12.723 --> 00:34:15.543
duplication and divergence, that you have families of

00:34:15.543 --> 00:34:19.463
mechanisms that are variations on themes. Like if I had to say there's one

00:34:19.463 --> 00:34:23.323
thing that Evo-Devo has really shown us, it's, you know, Hox genes, for example,

00:34:23.323 --> 00:34:26.623
getting used over and over again in lots of different kinds of contexts. And

00:34:26.623 --> 00:34:31.063
I don't see any reflex of that intuition in computational neuroscience, really

00:34:31.063 --> 00:34:33.483
anywhere, except maybe the topographic map.

00:34:34.123 --> 00:34:39.883
But now, so, okay, so we started with this notion of canonical microcircuits

00:34:39.883 --> 00:34:44.563
in your opinion not really delivering on understanding higher level cognition,

00:34:44.763 --> 00:34:48.703
in particular issues around variable binding that you need in language, okay?

00:34:49.083 --> 00:34:55.063
But then what's now your counterproposal, right? What should we be looking at?

00:34:55.623 --> 00:35:01.023
I think we should be trying to find a set of circuits rather than one that probably

00:35:01.023 --> 00:35:03.603
have a family resemblance structure to one another.

00:35:03.843 --> 00:35:09.083
So in a computer, that set would start with things like ANDs and ORs and NANDs and XORs.

00:35:09.203 --> 00:35:13.563
There's a family resemblance between those, but with very different computational repercussions.

00:35:13.943 --> 00:35:17.003
And ultimately, I think we're looking for something similar,

00:35:17.103 --> 00:35:19.163
maybe at that grain level, maybe a bit higher.

00:35:19.383 --> 00:35:24.403
I made some specific proposals, like we want mechanisms for understanding sequencing, for example.

00:35:24.603 --> 00:35:28.623
And that's a simplification, right? The sequencing itself is probably going

00:35:28.623 --> 00:35:32.643
to require that we understand things about working memory and copying things

00:35:32.643 --> 00:35:33.883
from buffers and so forth.

00:35:34.003 --> 00:35:39.103
So the top-down suggestions that I gave in my talk are really collapsing a number of levels.

00:35:39.283 --> 00:35:44.383
And I think really we need to iteratively do this process of finding intermediate computational levels.

00:35:46.098 --> 00:35:50.018
What can you do with a set of neurons or what does biology do with a set of neurons?

00:35:50.318 --> 00:35:54.158
What do you do with sets of sets of neurons? What do you do with sets of sets of sets and so forth?

00:35:54.358 --> 00:35:58.578
And so I think of a parser, for example, as being made up of very structured

00:35:58.578 --> 00:36:01.158
combinations of all of these kinds of processes.

00:36:01.738 --> 00:36:05.478
Yeah, but Gary, in some sense you make a dual proposal, right?

00:36:05.498 --> 00:36:09.698
Because on the one hand you're saying, well, to make progress, we need some functional guidance.

00:36:09.878 --> 00:36:13.378
And I can give you a list of that. That's what you were now sort of elaborating.

00:36:13.918 --> 00:36:17.198
On the other hand, you're telling, you also told us, look, we should rethink

00:36:17.198 --> 00:36:24.978
cortex in terms of a more, let's say, configurable and variable substrate of a set of computations.

00:36:25.158 --> 00:36:29.238
I think these are two complementary proposals that you're making, or am I wrong here?

00:36:29.518 --> 00:36:33.318
Well, I see a connection between them, but I mean, it's an open empirical question.

00:36:33.438 --> 00:36:39.198
But the notion is that we're going to find motifs structurally in the cortex.

00:36:39.658 --> 00:36:43.818
The motifs may be variations on a theme, but there will be motifs.

00:36:44.478 --> 00:36:47.758
And I'm talking at the neuron level. I know there's interesting work at the

00:36:47.758 --> 00:36:52.518
sort of voxel level, but I think at the neuron level, we're going to find motifs,

00:36:52.578 --> 00:36:54.498
and those motifs are going to map onto functions.

00:36:54.698 --> 00:36:59.218
And we want to say, you know, there are 10 or 20 or 40 different motifs that

00:36:59.218 --> 00:37:02.298
come in these different flavors. These are the kinds of computations that they do.

00:37:02.758 --> 00:37:06.538
And once we have that level, then we can say, well, when you put those together,

00:37:06.638 --> 00:37:08.598
what kind of computations do you get out of there?

00:37:08.898 --> 00:37:14.278
But now, in some sense, you're advocating, I think, a position that actually

00:37:14.278 --> 00:37:18.498
is also coming out of the community that's pushing canonical microcircuits.

00:37:18.638 --> 00:37:22.438
Like for instance, Rodney Douglas, who has been together with Kevan Martin doing

00:37:22.438 --> 00:37:27.058
a lot of the anatomy on this notion of cortical canonical microcircuits,

00:37:27.058 --> 00:37:32.338
is now suggesting that cortical circuits can be seen as finite state machines.

00:37:32.338 --> 00:37:38.058
So that means you have computational configurability given a standard hardware template.

00:37:38.897 --> 00:37:41.237
And that sounds very compatible with what you have in mind.

00:37:41.377 --> 00:37:42.117
Or is there a difference?

00:37:42.497 --> 00:37:46.777
I mean, I think it's in the, in some sense, I think they're in the same family

00:37:46.777 --> 00:37:48.257
of hypotheses and in some sense not.

00:37:48.397 --> 00:37:52.977
So one hypothesis is you look around the brain and there are these configurable

00:37:52.977 --> 00:37:55.777
finite state automata in different places or something like that.

00:37:56.997 --> 00:38:00.717
Another is that the grain level of these circuits is smaller and they're not

00:38:00.717 --> 00:38:03.457
all finite state machines. They're doing other kinds of things.

00:38:03.597 --> 00:38:06.877
So for example, finite state machines don't have memory and memory is actually

00:38:06.877 --> 00:38:10.137
one of the components that I think is critical. And so if all you had was a

00:38:10.137 --> 00:38:12.737
bunch of finite state machines, it would be complicated.

00:38:12.897 --> 00:38:15.797
You might be able to pull the system out of it. I mean, you can think about

00:38:15.797 --> 00:38:18.717
Turing machines and so forth as a possible argument.

00:38:18.757 --> 00:38:21.457
Of course, this is the direction they would like to go then.

00:38:22.797 --> 00:38:26.717
My intuition is that's not the right way to go. I can't say for sure that it's wrong.

00:38:26.917 --> 00:38:30.657
At some level what I'm saying is that the empirical research direction needs

00:38:30.657 --> 00:38:34.737
to go towards itemizing these things or enumerating these things.

00:38:34.737 --> 00:38:39.917
So it is possible, but I think it is unlikely, for reasons I've been explaining,

00:38:40.137 --> 00:38:42.917
that you really will have this one circuit. You have many copies.

00:38:43.137 --> 00:38:46.557
This circuit might be adaptable to do different things. I think the part of

00:38:46.557 --> 00:38:51.117
their view that I'm most sympathetic to is this idea that you could essentially

00:38:51.117 --> 00:38:53.897
reconfigure that circuit to do different kinds of computations.

00:38:53.897 --> 00:38:59.037
Another possibility is there are more variations on themes, the way I'm describing,

00:38:59.037 --> 00:39:02.617
where maybe particular genes tell you to build this kind of gate versus that

00:39:02.617 --> 00:39:07.097
kind of gate, or things like that. They're certainly in the same school. Let me

00:39:07.097 --> 00:39:09.117
just say one other thing:

00:39:09.897 --> 00:39:13.437
thinking about that is different from saying, well, let's just build a

00:39:13.437 --> 00:39:16.437
map of the entire brain, right, and run the simulation and

00:39:16.437 --> 00:39:19.557
see what happens. But if you do talk about these primitive

00:39:19.557 --> 00:39:22.337
computational functions, how big

00:39:22.337 --> 00:39:25.257
is that set, in your opinion, as a

00:39:25.257 --> 00:39:27.977
prediction, right? I don't know. I would say the

00:39:27.977 --> 00:39:30.757
lower bound is like the 10 or 20

00:39:30.757 --> 00:39:35.957
that I put up on the screen, and the upper bound is, you know, sort of unknowable.

00:39:35.957 --> 00:39:42.417
We have to do the empirical work. My guess from looking at other parts

00:39:42.417 --> 00:39:48.157
of biology is that there are, you know, hundreds or maybe thousands, but not

00:39:48.277 --> 00:39:53.397
hundreds of thousands, that the ones that are there, most of them get used a lot.

00:39:53.537 --> 00:39:56.277
You know, there might be a kind of Zipf's Law thing where, you know,

00:39:56.297 --> 00:39:59.957
a few of them get used only very rarely, and maybe those are actually critical for language.

00:40:00.177 --> 00:40:03.557
But a lot of motifs get used over and over again. I mean, that's sort of how

00:40:03.557 --> 00:40:07.697
biology tends to work, and that's maybe the best guidance we've got.

00:40:08.177 --> 00:40:10.397
Another way to think about it, I guess, would be coming from psychology.

00:40:10.897 --> 00:40:14.557
You could say, well, working memory is something you need in every computation,

00:40:14.757 --> 00:40:16.017
or practically every computation.

00:40:16.017 --> 00:40:20.337
Sequencing is something you need in a lot of computations. Normalization is something

00:40:20.337 --> 00:40:24.737
you need over and over again. Tree structures you might only need in planning

00:40:24.737 --> 00:40:29.017
and language. And so, you know, you could try to get a handle on it that way. Mm-hmm.

00:40:30.588 --> 00:40:34.928
I understand where you're coming from, but I think that where we go from there,

00:40:35.008 --> 00:40:37.088
it becomes problematic in a way.

00:40:37.508 --> 00:40:41.548
Partly what you're saying is we don't know enough about the richness of the

00:40:41.548 --> 00:40:43.668
cortex to really understand how it operates.

00:40:43.868 --> 00:40:50.408
And that is feeding into a kind of frenzy of let's get more data on the brain.

00:40:50.588 --> 00:40:54.208
But I think at the same time you're saying, look, hang on, we don't understand

00:40:54.208 --> 00:40:57.408
enough of the data that we already have to do this effectively.

00:40:57.408 --> 00:41:00.188
You know, so I think we're in

00:41:00.188 --> 00:41:03.268
a position now in our field

00:41:03.268 --> 00:41:06.068
where, if you like, the people

00:41:06.068 --> 00:41:09.148
that want to get more data on the brain are in a bit of an ascendancy,

00:41:09.148 --> 00:41:14.028
and a lot of the money that's coming into the field is moving towards data gathering.

00:41:14.028 --> 00:41:19.508
And there's a risk that, for the people who in the past have been thinking

00:41:19.508 --> 00:41:25.208
about building functional models of this, that approach is not being

00:41:25.208 --> 00:41:26.368
supported to the extent it was.

00:41:26.428 --> 00:41:29.708
Perhaps because we haven't succeeded, because we've been working at different

00:41:29.708 --> 00:41:33.808
levels of description and fighting battles between symbolism and connectionism,

00:41:33.808 --> 00:41:37.248
for instance, which really weren't helping the overall cause.

00:41:38.608 --> 00:41:44.688
So how do you see where we go from here? Because my concern, possibly yours, is that,

00:41:45.342 --> 00:41:48.582
alongside all this data collection, we need to build theories.

00:41:48.882 --> 00:41:51.282
We need to build theories at multiple levels of description.

00:41:51.642 --> 00:41:54.182
And I think at some point, we need a principle of parsimony,

00:41:54.302 --> 00:41:57.722
which says, look, we can't possibly include all the data.

00:41:57.862 --> 00:42:02.822
We have to leave some things out and see how far we can get with a subset of data.

00:42:03.182 --> 00:42:06.002
I mean, I'm basically very sympathetic to what you just said.

00:42:06.542 --> 00:42:12.462
The part that I'm not 100% sympathetic to is, I don't think we have all the data that we need now.

00:42:12.582 --> 00:42:14.742
I mean, I do think that we need to do more data collection. We will never have

00:42:14.742 --> 00:42:17.642
all the data. We'll probably never have all the data. I think there are some specific

00:42:17.642 --> 00:42:19.882
places where I would like to see more data.

00:42:20.762 --> 00:42:24.882
In particular, I would like to see more comparisons between different cortical areas.

00:42:25.282 --> 00:42:29.942
So I would like to see not a whole brain map, which I don't think we'd know

00:42:29.942 --> 00:42:34.842
what to do with, but, like, focused comparisons between some prefrontal areas and

00:42:34.842 --> 00:42:39.002
some motor areas and some occipital areas. What do they have in common?

00:42:39.522 --> 00:42:43.102
What's different about them? And I think you need to map that at multiple levels.

00:42:43.102 --> 00:42:46.662
So I think it has to be at the neuron level.

00:42:46.882 --> 00:42:50.442
I think you have to have activity. I think you probably need to know a lot about

00:42:50.442 --> 00:42:53.782
protein expression, you know, all the way down to the synapse level.

00:42:54.022 --> 00:42:57.842
But I think the key there is that you want to look at different bits of cortex,

00:42:57.962 --> 00:43:01.422
a small number, and really try to understand what is uniform about them?

00:43:01.482 --> 00:43:03.862
What is different about them? How do the computations vary?

00:43:04.102 --> 00:43:07.862
And I think that should be the starting point where it's very focused on trying

00:43:07.862 --> 00:43:11.982
to ultimately give accounts of what computation is done there.

00:43:11.982 --> 00:43:15.622
So using like optogenetic techniques to probe these kinds of circuits and say,

00:43:15.722 --> 00:43:18.502
you know, if I alter the input, what happens?

00:43:18.582 --> 00:43:21.902
Does the same thing happen in occipital cortex as it happens in prefrontal cortex?

00:43:22.162 --> 00:43:25.862
There's lots of problems. This is not trivial to do, but in outline,

00:43:26.022 --> 00:43:29.902
that's what I would like to see done on the empirical side. And then I'm completely

00:43:29.902 --> 00:43:32.002
sympathetic on the theoretical side.

00:43:32.122 --> 00:43:35.422
I think that theory just doesn't have enough prestige in neuroscience,

00:43:35.662 --> 00:43:37.182
doesn't have enough money behind it.

00:43:37.242 --> 00:43:40.662
I don't think there are enough institutions in place to support theorists.

00:43:40.702 --> 00:43:45.202
I think theorists sort of are generally routed towards modeling very

00:43:45.901 --> 00:43:49.141
kind of narrow, straitjacketed pieces of empirical data. They're not given

00:43:49.141 --> 00:43:52.881
enough room to think broadly and not given enough prestige.

00:43:53.081 --> 00:43:57.961
And I think we need to build institutions to strengthen the theory side.

00:43:58.121 --> 00:44:01.821
I don't think that there's nearly enough for that relative to the tool building itself.

00:44:02.261 --> 00:44:07.061
But maybe the thing that's missing there as well for theory is that we always

00:44:07.061 --> 00:44:12.361
think about computation as opposed to behavior, because what really matters is behavior.

00:44:13.161 --> 00:44:16.781
I agree with that in the long run, but maybe not entirely in the short run.

00:44:16.901 --> 00:44:22.281
So my concern is that I don't think we can go from wiring diagrams to behavior

00:44:22.281 --> 00:44:24.981
without some intermediate theories of computation.

00:44:25.541 --> 00:44:32.941
So I really do think that the behavior is crucial. I worry too that in these

00:44:32.941 --> 00:44:36.901
brain initiatives, behavior is not paid attention to enough.

00:44:37.041 --> 00:44:40.921
But I think that we need to understand the primitives before we're going to

00:44:40.921 --> 00:44:42.621
have hope of understanding the behavior.

00:44:42.881 --> 00:44:47.761
So you couldn't understand how Microsoft Word works unless you had some theory

00:44:47.761 --> 00:44:51.701
of computation underlying it. You want to know about registers and subroutines

00:44:51.701 --> 00:44:56.601
and object-oriented programming languages and things like that.

00:44:56.681 --> 00:45:01.061
You need some intermediate things before you can understand some complex cognitive artifact.

00:45:01.741 --> 00:45:05.421
I take Microsoft Word to be a kind of cognitive artifact in the sense that it

00:45:05.421 --> 00:45:07.961
responds to different commands in different ways and so forth.

00:45:08.221 --> 00:45:10.621
It's sort of at the right grain level.

00:45:12.021 --> 00:45:15.801
And we just don't have that intermediate connectivity. I'm not sure if I agree with that.

00:45:15.801 --> 00:45:21.801
I mean, in terms of behaviors such as seen in classical conditioning or in operant

00:45:21.801 --> 00:45:22.861
conditioning, foraging,

00:45:23.101 --> 00:45:28.801
right there, I think, I'm not saying it's a closed case, but links to behavior

00:45:28.801 --> 00:45:33.201
are also established on theoretical grounds, and they're pretty coherent stories.

00:45:34.321 --> 00:45:38.701
Well, there's two things to say there. One is, I didn't have in mind retracting

00:45:38.701 --> 00:45:43.601
your gill when you're doing classical conditioning. I think we have most of

00:45:43.601 --> 00:45:47.101
the tools that we already need, but I don't think it's the kind of behavior that I had in mind.

00:45:47.201 --> 00:45:49.881
I was thinking about behavior like understanding a sentence,

00:45:49.961 --> 00:45:56.441
or foraging might be a good example, where you need a rich set of internal representations.

00:45:57.681 --> 00:46:01.401
For those, I think we need this firm computational grounding.

00:46:01.401 --> 00:46:04.681
The other thing I would say is I don't see theory and computation as at all exclusive.

00:46:04.881 --> 00:46:09.681
I see it as the theory that we're trying to develop is a theory that links the

00:46:09.681 --> 00:46:14.001
neurophysiology and so forth, neuroanatomy, with the computation.

00:46:14.261 --> 00:46:19.121
So the theory that I want to see us develop is really one that goes from the

00:46:19.121 --> 00:46:22.981
neural instantiation to the computation, ultimately to the behavior.

00:46:22.981 --> 00:46:27.401
It's just that I think we can't immediately go from the neural instantiation

00:46:27.401 --> 00:46:31.081
to the behavior in any meaningful way, because it's just too complicated without

00:46:31.081 --> 00:46:32.481
this intervening layer of explanation.

00:46:33.921 --> 00:46:37.581
And do you think we should be paying more attention to embodiment and society

00:46:37.581 --> 00:46:42.121
and trying to understand these systems, or do you think the focus on the brain is appropriate?

00:46:43.894 --> 00:46:47.854
I think those things are important. I'm not sure we're to the point yet where

00:46:47.854 --> 00:46:51.154
for the kinds of questions that I'm asking about, they're going to make a difference.

00:46:51.314 --> 00:46:55.274
I mean, ultimately, in the grand scheme of society, we want to understand how

00:46:55.274 --> 00:46:58.134
embodied people participate in society and so forth.

00:46:58.214 --> 00:47:00.894
But understanding social structure

00:47:00.894 --> 00:47:04.674
is not something where I think neuroscience has that much to say yet.

00:47:04.794 --> 00:47:08.974
So there's a field of neuroeconomics that tries to derive economic principles

00:47:08.974 --> 00:47:13.654
from neural wiring and so forth. I don't think we're really in a position to do that well yet.

00:47:13.894 --> 00:47:20.194
And language, the way that parent-child interactions scaffold children's language acquisition.

00:47:20.694 --> 00:47:25.274
I mean, how critical is that to our understanding of how humans gain language?

00:47:26.034 --> 00:47:30.374
For me, it's something that would come later. And I guess that's partly having

00:47:30.374 --> 00:47:32.974
to do with my background in language acquisition.

00:47:33.074 --> 00:47:36.894
I would say that all kids acquire language, even in a very broad range of social

00:47:36.894 --> 00:47:41.614
circumstances, ranging from ones where parents are kind of, the term I've heard

00:47:41.614 --> 00:47:44.294
is helicopter parents, where they're hovering around their kid,

00:47:44.434 --> 00:47:46.874
every utterance, they're, I'm a helicopter,

00:47:47.154 --> 00:47:49.854
I will disclose.

00:47:50.614 --> 00:47:53.294
And then there are parents that don't really interact with their kids,

00:47:53.354 --> 00:47:57.054
and the kids learn from their siblings, or mostly by observation and so forth.

00:47:57.194 --> 00:48:01.554
And the system is relatively robust to a very wide range of inputs.

00:48:01.554 --> 00:48:03.954
There's an interesting question. It's not fully robust.

00:48:04.134 --> 00:48:08.034
So kids that have parents that talk more have bigger vocabularies. That might

00:48:08.034 --> 00:48:12.034
be partly genetic, which most people worry about, and it's probably partly experiential.

00:48:12.034 --> 00:48:13.494
And I think those are interesting questions.

00:48:13.534 --> 00:48:17.734
But I don't think that we know enough about the basics of how the universal

00:48:17.734 --> 00:48:22.054
part of the system is put together to really be able to make sense yet of those

00:48:22.054 --> 00:48:25.834
kinds of things at a mechanistic level. So I still want to know how we represent

00:48:25.834 --> 00:48:27.454
one sentence in the brain.

00:48:27.634 --> 00:48:30.594
And once you can tell me that, then I'll move on to like, you know,

00:48:30.594 --> 00:48:32.534
why you learned this one a little bit faster than the other.

00:48:32.674 --> 00:48:36.514
But now, so to come back to the issue of behavior versus computation,

00:48:37.034 --> 00:48:42.694
from a methodological perspective, the three sources of information we have for

00:48:42.694 --> 00:48:45.574
understanding mind and brain are anatomy, physiology, and behavior.

00:48:46.314 --> 00:48:53.014
And actually, what I believe is the correct methodology is to have a convergent

00:48:53.014 --> 00:48:56.534
validation of these sources of information on our model so we can identify what

00:48:56.534 --> 00:48:58.234
the computation is they perform, okay?

00:48:58.314 --> 00:49:01.914
As opposed to first identifying computation and then going to behavior.

00:49:03.254 --> 00:49:07.274
Well, I'm not sure that's a substantive difference. So,

00:49:08.443 --> 00:49:12.263
I agree we can't access the computation directly, and we're trying to converge

00:49:12.263 --> 00:49:14.283
on what the computation is.

00:49:14.403 --> 00:49:17.303
But I'm not sure what the alternative is that you think that I'm endorsing.

00:49:17.403 --> 00:49:22.023
And what I'm saying is that characterizing the computation

00:49:22.023 --> 00:49:25.263
is absolutely central to putting the system together.

00:49:25.443 --> 00:49:28.943
And I can't tell you how many neuroscience conferences I've been to lately where

00:49:28.943 --> 00:49:32.443
the word computation scarcely is even mentioned. And I think in some sense,

00:49:32.563 --> 00:49:35.963
that's what I'm railing against here is the idea that, you know,

00:49:35.983 --> 00:49:36.883
once we have the circuit, the

00:49:36.883 --> 00:49:40.283
computation comes for free and that we don't have hard work to do there.

00:49:41.443 --> 00:49:45.383
You go to neuroscience talks and people explain their channels and never invoke

00:49:45.383 --> 00:49:48.523
the word computation and you don't know what computation they're even thinking about.

00:49:48.663 --> 00:49:51.803
I think that's problematic. But that means you would use the word computation

00:49:51.803 --> 00:49:57.383
in a broad sense, like what are the transformations or operations that these circuits perform?

00:49:57.383 --> 00:50:00.223
It is not necessarily like in a

00:50:00.223 --> 00:50:03.243
Turing machine sense of computation. Well, I'm interested in both. I think

00:50:03.243 --> 00:50:06.243
that the right full account of

00:50:06.243 --> 00:50:09.263
things has to involve both the fine

00:50:09.263 --> 00:50:13.523
level... I mean, if you're talking about a computer again, in order

00:50:13.523 --> 00:50:18.523
to understand how Word works, you need to understand both the fine grain of, like,

00:50:18.523 --> 00:50:21.483
transistors and how they make gates, and you need to understand something

00:50:21.483 --> 00:50:26.463
about the API of an operating system, if you really want to understand how it works.

00:50:26.603 --> 00:50:30.103
And not everybody does, but to the extent that we're trying to reverse engineer

00:50:30.103 --> 00:50:33.723
the mind, it's sort of comparable to someone who would try to reverse engineer

00:50:33.723 --> 00:50:36.023
Microsoft Word and build their own.

00:50:36.123 --> 00:50:39.323
Well, in order to build your own copy of Microsoft Word, you'd at least need

00:50:39.323 --> 00:50:41.863
to know what an API is and what a programming language is.

00:50:42.003 --> 00:50:45.663
And the people who built the programming languages would have to know what assembly

00:50:45.663 --> 00:50:49.503
code is. Maybe you could survive without it because the different levels become insulated.

00:50:49.643 --> 00:50:53.743
So maybe when we study the brain, not everybody who is connecting to behavior

00:50:53.743 --> 00:50:56.723
needs to understand every intermediate level, but somebody's got to be able

00:50:56.723 --> 00:51:00.423
to make the mappings between each of these levels that we're talking about.

00:51:00.603 --> 00:51:05.363
So when we talk about a computer, somebody can map between the transistors and the microprocessors.

00:51:05.523 --> 00:51:09.223
And even if I can't personally, I know there's an ordered mapping that explains

00:51:09.223 --> 00:51:11.983
it. I know roughly how it works, and we need that level of it.

00:51:12.383 --> 00:51:16.583
But I think with your Microsoft Word analogy, we don't want to start by reverse

00:51:16.583 --> 00:51:18.483
engineering a system of that complexity.

00:51:18.803 --> 00:51:22.143
If you're an alien, and you wanted to know how that program worked,

00:51:22.463 --> 00:51:26.743
you'd probably want to get a hold of a much simpler text editor and try and figure that out first.

00:51:26.943 --> 00:51:31.163
And we can do the same, obviously, in neuroscience, so the comparative approach.

00:51:31.463 --> 00:51:33.803
And I think the other thing I want to push on here a bit is

00:51:34.739 --> 00:51:37.959
development, because, you know, children don't start

00:51:37.959 --> 00:51:41.239
talking in multi-word sentences until

00:51:41.239 --> 00:51:45.239
they're several years old, and before that there are various earlier grammars,

00:51:45.239 --> 00:51:50.499
which are... My son's 20 months; he does a fair number of multiple... Yeah, but come

00:51:50.499 --> 00:51:56.499
on, he's your son, Gary, please. He's my primary source. So yeah, so there's

00:51:56.499 --> 00:51:59.459
this phase when they're doing a lot of two-word utterances.

00:51:59.979 --> 00:52:05.199
And, you know, and surely there's some kind of substrate for that,

00:52:05.259 --> 00:52:09.959
which will be interesting to understand as a precursor for the substrate for adult language.

00:52:10.139 --> 00:52:14.179
And to get to adult language, perhaps we should understand the substrate for

00:52:14.179 --> 00:52:17.859
that and build that, and then think about the mechanisms that we'll construct from there.

00:52:18.139 --> 00:52:22.519
I totally agree that we want smaller systems. I mean, when I say a parser is too big, I mean it.

00:52:22.579 --> 00:52:25.979
I think that on the behavioral side, we

00:52:25.979 --> 00:52:29.499
want to understand things like how can you repeat a word, you know, very

00:52:29.499 --> 00:52:32.259
small things. Doing a whole language system is

00:52:32.259 --> 00:52:35.399
just outside the scope of what we can do now. It's like trying to do Microsoft Word

00:52:35.399 --> 00:52:38.919
when we don't know what a text editor is, we don't know what it means to draw,

00:52:38.919 --> 00:52:43.479
you know, characters on the display. We're really at a primitive level

00:52:43.479 --> 00:52:47.059
of understanding these things, and I totally agree we want to find simpler

00:52:47.059 --> 00:52:52.139
pieces. So how do you imitate a word? How do you recognize the difference

00:52:52.139 --> 00:52:54.699
between blue car and car blue?

00:52:55.417 --> 00:52:59.697
You know, these kinds of very basic things we need to work out before we understand

00:52:59.897 --> 00:53:04.417
how you comprehend language in the context of a discourse and, you know, all

00:53:04.417 --> 00:53:06.877
the things that are going on around you, and how you integrate that.

00:53:06.937 --> 00:53:12.137
We don't have the basic tools out of which such systems are built yet.

00:53:12.477 --> 00:53:16.317
But in the case of language, for instance, and this question of how we come

00:53:16.317 --> 00:53:21.597
to be able to use variables when we think, but perhaps that's something we develop

00:53:21.597 --> 00:53:22.857
as we practice language,

00:53:22.997 --> 00:53:28.257
you become able to use more and more abstract tokens and to realize that tokens

00:53:28.257 --> 00:53:29.497
are interchangeable and so on.

00:53:29.597 --> 00:53:34.777
Some of my own work gives a piece of an argument that the variable binding itself might be innate.

00:53:34.977 --> 00:53:39.417
I did some work showing that seven-month-olds could do a kind of variable binding.

00:53:39.497 --> 00:53:42.737
It was actually on the right side of one of my slides.

00:53:42.817 --> 00:53:47.457
So I showed that kids could learn ABA structures or ABB structures and then

00:53:47.457 --> 00:53:48.677
generalize them to new words.

00:53:48.697 --> 00:53:50.997
So they don't seem to be just using transitional probabilities.

00:53:50.997 --> 00:53:54.697
And then, as is typical in developmental psychology,

00:53:54.997 --> 00:53:56.737
someone said, well, I can do that even younger.

00:53:56.877 --> 00:54:00.797
And so now we know that the paradigm that I invented, minus a control that I'd

00:54:00.797 --> 00:54:03.777
like to see run, can be done in newborns.

00:54:03.777 --> 00:54:07.617
So even newborns, and I know that's not a perfect argument for nativism,

00:54:07.677 --> 00:54:12.757
but it's at least evidential that newborns apparently can do a computation that

00:54:12.757 --> 00:54:15.237
I believe requires variable binding.

00:54:15.237 --> 00:54:18.497
So on the particulars of variable binding, I actually think that that's part

00:54:18.497 --> 00:54:20.717
of our innate armamentarium.

00:54:21.097 --> 00:54:23.617
As to language as a whole, I think there's a lot of learning.

00:54:23.777 --> 00:54:28.337
So you might plausibly think that kids are born with the ability to represent

00:54:28.337 --> 00:54:30.837
arbitrary relationships, which you need for words.

00:54:31.077 --> 00:54:34.957
They might be born with the ability to concatenate symbols, even if they don't

00:54:34.957 --> 00:54:35.997
know what those symbols are.

00:54:36.999 --> 00:54:39.159
They have to learn lots of things that are language particular.

00:54:39.659 --> 00:54:43.359
They may have to learn that you map syntax to semantics, or maybe that part

00:54:43.359 --> 00:54:46.859
is known, but a lot of the detail about how you do that might have to be learned.

00:54:47.299 --> 00:54:51.779
So, I mean, the most extreme nativist theories are like some of the ones that

00:54:51.779 --> 00:54:55.999
Chomsky was pushing in the 1980s and that I was trained on in graduate school,

00:54:56.059 --> 00:54:59.659
where there are a lot of very specific principles, like of what is a legal tree

00:54:59.659 --> 00:55:01.539
structure and what is an illegal tree structure.

00:55:01.699 --> 00:55:04.999
So if these two items are not in this geometric relation to one another,

00:55:05.079 --> 00:55:07.759
the sentence is ruled out. Stuff like that might not really be innate,

00:55:07.879 --> 00:55:09.799
even though Chomsky argued that it is.

00:55:09.879 --> 00:55:13.799
The ability to represent something like a tree structure, I'm guessing is,

00:55:13.899 --> 00:55:16.879
it's probably not exactly a tree structure for reasons that I mentioned before.

00:55:17.359 --> 00:55:22.079
But my guess is actually that either chimps don't have that structure at all,

00:55:22.099 --> 00:55:23.899
or they don't know how to use it for new things.

00:55:23.999 --> 00:55:27.219
Maybe they can use it for motor planning, but they don't have the ability to

00:55:27.219 --> 00:55:30.499
say, hey, this is a useful mental representation that I can do,

00:55:30.599 --> 00:55:34.379
a representational format that I can do other useful work with. Yeah.

00:55:34.479 --> 00:55:40.759
So getting to the finish line, there's two issues I would like to clarify with

00:55:40.759 --> 00:55:42.279
respect to your proposal, right?

00:55:42.319 --> 00:55:47.799
So after criticizing the canonical microcircuit as being too restricted in thinking

00:55:47.799 --> 00:55:49.959
about the kinds of cognitive function we want to get.

00:55:50.559 --> 00:55:54.659
I was rather surprised that you were proposing field programmable gate arrays

00:55:54.659 --> 00:56:00.539
as, let's say, an example of the configurable kind of computation you want.

00:56:00.539 --> 00:56:05.539
Because FPGAs, which are widely used for, let's say, real-time processing because

00:56:05.539 --> 00:56:08.979
of their parallel operation, are actually

00:56:10.370 --> 00:56:18.470
a very paradigmatic example, if you want, of a canonical microcircuit repeated many times in silicon.

00:56:18.850 --> 00:56:23.450
But a configurable one, crucially. Right. So, I mean, I chose that deliberately.

00:56:23.670 --> 00:56:27.970
So there is this at least superficial similarity, and yet it needs to be resolved

00:56:27.970 --> 00:56:28.890
with the functional diversity.

00:56:29.250 --> 00:56:33.270
And I think that's what the FPGA gives you is superficially and initially,

00:56:33.450 --> 00:56:36.810
in fact, not just superficially, it is literally identical across its extent.

00:56:36.810 --> 00:56:42.010
Then the configuration comes from instructions that say, you know,

00:56:42.070 --> 00:56:43.730
I want this to behave in this way.

00:56:43.950 --> 00:56:47.130
And the real difference ultimately comes down to, I think that that configuration

00:56:47.130 --> 00:56:50.690
can be partly done molecularly just as in any other part of the body.

00:56:50.850 --> 00:56:54.070
And that indeed illustrates your point to say, look, there might be an initial

00:56:54.070 --> 00:56:58.570
infrastructure as in the FPGA, but that then gets configured and that gives

00:56:58.570 --> 00:57:02.310
you variability across the microcircuits. That's right. This is roughly the idea.

00:57:02.550 --> 00:57:06.570
Yes. So the second thing is, to me, your proposal sounds very reminiscent of

00:57:06.570 --> 00:57:08.910
Jerry Edelman's idea of neural Darwinism, where you would say,

00:57:08.970 --> 00:57:14.030
look, developmental factors give rise to what he then called a primary repertoire,

00:57:14.350 --> 00:57:17.430
highly redundant, with lots of possible mappings.

00:57:18.630 --> 00:57:23.090
Upon that, selection takes place due to the engagement with the real world.

00:57:23.150 --> 00:57:24.750
So now you have your secondary repertoire.

00:57:25.570 --> 00:57:29.850
Secondary repertoire can perform complex functions through what he calls reentry,

00:57:30.010 --> 00:57:34.330
including recursion, and then you would rely on what he calls value-based learning

00:57:34.330 --> 00:57:36.950
to accept rules of engagement with the world.

00:57:37.170 --> 00:57:42.510
So would it be fair to say that you're now a neo-Edelmanian?

00:57:43.670 --> 00:57:46.690
I'm more sympathetic to that view than I might have been before.

00:57:47.730 --> 00:57:51.230
I'm not sure that I would put as much weight on selection.

00:57:52.370 --> 00:57:57.390
I do think there's a basic stock of elements. I don't remember his exact way

00:57:57.390 --> 00:58:00.170
of thinking about that. We may share that.

00:58:01.990 --> 00:58:06.010
I think that the basic elements can be fairly sophisticated computation.

00:58:06.450 --> 00:58:11.750
So let me rephrase that. I think some of the basic elements.

00:58:14.062 --> 00:58:17.382
I need another word here. So I've been talking about building blocks all along.

00:58:17.482 --> 00:58:21.422
Some of the basic assemblies of building blocks can do fairly sophisticated

00:58:21.422 --> 00:58:23.082
things, probably without learning.

00:58:23.242 --> 00:58:28.602
So my paradigm example of this would be imprinting, where an organism sees a

00:58:28.602 --> 00:58:32.202
stimulus that falls in a certain class, makes basically a one trial decision

00:58:32.202 --> 00:58:37.622
about that. I don't think that the Edelman notion gives you a good handle on that.

00:58:37.682 --> 00:58:40.922
I don't think it's necessarily incompatible. I think ultimately it's a broad

00:58:40.922 --> 00:58:43.042
umbrella and you could work it out in different kinds of ways.

00:58:43.222 --> 00:58:46.282
But for me, I want to know why there are circuits like that.

00:58:46.342 --> 00:58:50.442
I think there are going to be some circuits at that level in language where

00:58:50.442 --> 00:58:54.962
you're really looking for specific stimuli and doing specific things with those stimuli.

00:58:55.582 --> 00:58:59.902
And, you know, in that work, we know that, you know, Konrad Lorenz is not,

00:58:59.922 --> 00:59:01.022
in fact, as good an example.

00:59:01.082 --> 00:59:05.282
You won't imprint on Lorenz if you have a proper duck to imprint on.

00:59:05.382 --> 00:59:08.262
So, you know, there's some specificity there to how those work.

00:59:08.942 --> 00:59:13.442
And that's got to be part of the picture. And I see how to shoehorn that into

00:59:13.442 --> 00:59:16.342
his theory, but I don't see it as sort of following from it. Right. Okay.

00:59:17.282 --> 00:59:21.462
Unfortunately, we cannot ask Jerry anymore because he died in May this year.

00:59:23.562 --> 00:59:28.802
So you're now in this business of understanding the brain, also certainly from

00:59:28.802 --> 00:59:29.762
the perspective of language.

00:59:30.802 --> 00:59:36.242
Also, I think you made a really good case for linking functional considerations

00:59:36.242 --> 00:59:38.622
with structural considerations, right?

00:59:38.662 --> 00:59:43.622
And not to decouple the two and say, well, let's just worry about structure, then it will all happen.

00:59:43.942 --> 00:59:46.922
So given all that experience, and also given your objectives and perspectives

00:59:46.922 --> 00:59:50.162
in terms of resetting neuroscience, what's

00:59:50.162 --> 00:59:53.102
Gary's law we should follow in

00:59:53.102 --> 00:59:56.442
studying the brain and mind? Pay attention

00:59:56.442 --> 01:00:00.082
to the bridges. I mean, you just said it. I mean, I'm not sure I have one

01:00:00.082 --> 01:00:04.262
law, but I think what you said is right, that you can't study these things

01:00:04.262 --> 01:00:08.182
in isolation. Maybe that's the phrase: you can't study these

01:00:08.182 --> 01:00:11.682
things in isolation. You can't study the structure or the function and expect

01:00:11.682 --> 01:00:15.182
to really understand the cognitive neurosciences. You have to think about the bridges.

01:00:15.762 --> 01:00:19.662
And then, so four years from now, we're going to come visit you in New York

01:00:19.662 --> 01:00:21.562
or wherever you're going to be four years from now.

01:00:21.862 --> 01:00:26.182
And we're going to challenge you on a prediction you're going to make today.

01:00:26.522 --> 01:00:31.802
So what's the one specific prediction you're willing to make that you will find

01:00:31.802 --> 01:00:34.662
confirmed four years from now when we visit you?

01:00:35.662 --> 01:00:39.722
Four years from now? Yeah, four years. It might be three. It depends how fast

01:00:39.722 --> 01:00:40.602
you're going to be. Okay.

01:00:43.157 --> 01:00:47.417
I mean, these things are so subject to, the kinds of things that I'm talking

01:00:47.417 --> 01:00:51.777
about are research programs that aren't done by one person, they're done by societies.

01:00:52.177 --> 01:00:56.737
And so there's a lot that's dependent on how society allocates its resources

01:00:56.737 --> 01:00:59.437
for how far along we get in these problems.

01:00:59.437 --> 01:01:03.117
Maybe the first thing that I think will be confirmed is

01:01:03.117 --> 01:01:06.337
that there'll be important differences in recurrent

01:01:06.337 --> 01:01:09.737
motifs at the neuron level. So we'll

01:01:09.737 --> 01:01:12.657
find sets of neural motifs. We won't initially know what

01:01:12.657 --> 01:01:16.217
they do computationally, but we'll say, hey, that's another one of these

01:01:16.217 --> 01:01:22.297
number 17s. Now that we can drill down to the multicellular, I mean, to

01:01:22.297 --> 01:01:27.317
the circuits containing multiple neurons, we keep seeing this kind of connectivity

01:01:27.317 --> 01:01:30.337
and this kind of connectivity over and over again.

01:01:30.457 --> 01:01:33.557
We don't know what it means yet, but I think three or four years from now,

01:01:33.677 --> 01:01:36.617
there's a good chance with all the money that's being poured into EM,

01:01:36.737 --> 01:01:40.817
for example, with good analytic techniques, people will be able to pick out

01:01:40.817 --> 01:01:43.937
those motifs and say, hey, these are interesting.

01:01:44.397 --> 01:01:47.657
And with luck, we'll have this not just in, say, visual cortex,

01:01:47.717 --> 01:01:52.437
and be able to say the distribution of these motifs is different in prefrontal

01:01:52.437 --> 01:01:53.837
cortex than visual cortex.

01:01:54.157 --> 01:01:57.417
That's got to be telling us something. We won't know four years from now

01:01:57.417 --> 01:02:01.277
what it's telling us, but I hope in four years we'll at least be able to say

01:02:01.277 --> 01:02:04.977
that much. We'll be able to say, I see the stock of motifs is different in these

01:02:04.977 --> 01:02:07.357
two areas. And that's something that we can try to leverage now.

01:02:07.517 --> 01:02:10.337
Exactly. Okay. Gary Marcus, thank you very much for this conversation.

01:02:10.577 --> 01:02:12.897
Thank you very much. That was great.

01:02:14.517 --> 01:02:19.917
The CSN Podcast was produced by the Convergent Science Network of Biomimetics

01:02:19.917 --> 01:02:26.397
and Biohybrid Systems, a project funded by the European Seventh Research Framework Programme.

01:02:27.917 --> 01:02:33.177
For more interviews, recorded lectures, or upcoming conferences in the field

01:02:33.177 --> 01:02:39.277
of biomimetics and biohybrid systems, go to csnnetwork.com.

01:02:39.440 --> 01:02:47.600
Music.