WEBVTT

00:00:03.517 --> 00:00:10.517
This is the Convergent Science Network podcast. Leading researchers in the domain

00:00:10.517 --> 00:00:16.797
of neuroscience, brain theory and technology are interviewed by Paul Verschure and Tony Prescott.

00:00:18.717 --> 00:00:23.737
So this is Paul Verschure with the Convergent Science Network podcast.

00:00:24.577 --> 00:00:27.377
And in this episode that we recorded as part

00:00:27.377 --> 00:00:30.397
of the CSN Barcelona Cognition, Brain and

00:00:30.397 --> 00:00:33.217
Technology Summer School series, I'm talking

00:00:33.217 --> 00:00:36.797
to Dmitri Chklovskii. And Dmitri,

00:00:36.797 --> 00:00:41.877
you come out of engineering, no, of theoretical physics, and from theoretical

00:00:41.877 --> 00:00:46.297
physics you went into neuroscience, and also in your presentation this

00:00:46.297 --> 00:00:51.917
morning you also show how let's say this interest in theory is translating towards

00:00:51.917 --> 00:00:53.797
how you understand brains.

00:00:54.157 --> 00:01:02.597
So how do you see that link exactly between theory and the practical sides of neuroscience?

00:01:05.097 --> 00:01:14.117
So historically, neuroscience has been rich on a number of experimental observations

00:01:14.117 --> 00:01:19.597
and facts that have been assembled through a large number of techniques

00:01:19.837 --> 00:01:23.557
on a variety of different levels and animals.

00:01:24.057 --> 00:01:29.557
But what has been lacking is a theoretical framework that would allow to put

00:01:29.557 --> 00:01:35.357
these facts into a unified perspective and to make future predictions and eventually

00:01:35.357 --> 00:01:38.057
to understand how the brain works.

00:01:41.657 --> 00:01:50.197
Neuroscientists always are interested in finding an appropriate framework to put the facts in.

00:01:51.097 --> 00:01:59.637
In our case, we had a lot of success with doing it by borrowing the ideas from

00:01:59.637 --> 00:02:05.197
electrical engineering, specifically from the field of adaptive signal processing. Right.

00:02:05.437 --> 00:02:13.477
So what you emphasized a lot there was how very specific ideas about signal

00:02:13.477 --> 00:02:18.117
processing, as developed within engineering for a long time by now,

00:02:18.317 --> 00:02:21.757
might help us to get some leverage in how we can understand the brain.

00:02:21.757 --> 00:02:28.117
And so what do you see as some promising starting points there when you talk

00:02:28.117 --> 00:02:29.817
about adaptive filtering?

00:02:29.997 --> 00:02:33.097
Because in some sense, there are quite a number of people who go around saying,

00:02:33.137 --> 00:02:34.497
well, the brain is an adaptive filter.

00:02:36.317 --> 00:02:40.057
Kalman filters have been very successful in engineering, so the brain also must

00:02:40.057 --> 00:02:41.217
operate like one, et cetera.

00:02:41.297 --> 00:02:46.237
But to make that now more specific, where do you really see leverage in these

00:02:46.237 --> 00:02:49.417
kinds of more normative approaches towards the brain? Yeah.

00:02:50.190 --> 00:02:59.010
Well, I think to make such ideas successful and of practical value is to make

00:02:59.010 --> 00:03:03.170
a connection between theory and experiment on a very specific level,

00:03:03.270 --> 00:03:05.470
so that we could predict, for example,

00:03:05.610 --> 00:03:11.990
response properties of individual neurons given a certain stimulus presentation,

00:03:12.310 --> 00:03:15.950
which could be compared with electrophysiological recordings, for example.

00:03:15.950 --> 00:03:23.770
And I think one of the places where this could be done easiest are sensory systems,

00:03:23.930 --> 00:03:26.250
like visual, for example,

00:03:26.470 --> 00:03:32.710
where you have complete control over the stimulus and you also have access to

00:03:32.710 --> 00:03:38.010
the neurons by recording in the retina or further down in the vertebrate pathway in the LGN.

00:03:38.010 --> 00:03:44.370
And the reason I think that the engineering principles should apply there is

00:03:44.370 --> 00:03:50.390
because both systems have to deal with the same kind of limitations that are

00:03:50.390 --> 00:03:52.750
presented by the physical world,

00:03:52.910 --> 00:04:00.070
such as limitations on the dynamic range and the bandwidth of the communication

00:04:00.070 --> 00:04:04.410
from the retina, say, to the LGN and from the LGN to the cortex.

00:04:05.090 --> 00:04:09.410
But now, can you give me an example? Could you talk me through an example of

00:04:09.410 --> 00:04:15.110
how such a filter would help us to understand what happens in the retina and LGN?

00:04:15.110 --> 00:04:18.350
So one example which goes

00:04:18.350 --> 00:04:28.310
back in time is the receptive fields of the retinal or LGN neurons that have

00:04:28.310 --> 00:04:35.890
been known to be biphasic in time and center-surround in space.

00:04:36.912 --> 00:04:43.912
Both of those observations can be explained based on the predictive coding framework

00:04:43.912 --> 00:04:45.732
that originates in engineering,

00:04:46.112 --> 00:04:54.632
where you say that the system is trying to compress the incoming signal by subtracting

00:04:54.632 --> 00:04:57.532
a prediction from the actual signal value.

00:04:57.532 --> 00:05:02.672
So, in terms of biphasic temporal response, for example, you would use the previous

00:05:02.672 --> 00:05:08.112
values of the signal to predict the current one, and that's how you get a biphasic shape.

00:05:08.292 --> 00:05:13.072
In the case of the spatial response shape, the center-surround shape,

00:05:13.312 --> 00:05:19.472
you use the surrounding values of the signal, like the surrounding values of

00:05:19.472 --> 00:05:24.932
image pixels, to predict the value of the pixel, a central pixel in the image,

00:05:25.052 --> 00:05:27.032
and subtracting that prediction.

00:05:27.292 --> 00:05:32.692
And that's how you can explain both the biphasic and the center-surround shape of the response.

00:05:33.952 --> 00:05:40.112
But now, as a start, the intuition to look at encoding by neurons,

00:05:40.352 --> 00:05:45.272
certainly in sensory systems, from this perspective, goes back quite a long way.

00:05:45.372 --> 00:05:48.112
I mean, you mentioned yourself, Barlow, for instance.

00:05:48.452 --> 00:05:54.792
That's right. When he tapped into this. So what does progress mean if we compare

00:05:54.792 --> 00:05:56.892
it to the intuitions of Barlow and where we are today?

00:06:02.652 --> 00:06:10.532
So Barlow and Attneave particularly borrowed the ideas from engineering.

00:06:10.972 --> 00:06:17.012
And I think Barlow's point of view is usually summarized by the maximum redundancy reduction,

00:06:17.332 --> 00:06:24.572
where the idea is that you just transmit the part of the signal that is non-redundant

00:06:24.572 --> 00:06:26.932
or new or surprising or couldn't have been predicted.

00:06:27.392 --> 00:06:32.872
And then it has been followed by the introduction of predictive coding framework,

00:06:33.272 --> 00:06:39.712
which is a concrete quantitative framework that allows one to generate the

00:06:39.712 --> 00:06:43.932
predictions for the receptive fields like biphasic and center-surround responses that I talked about.

00:06:43.932 --> 00:06:48.012
And that was done by Srinivasan, Laughlin and Dubs in 1982.

00:06:49.172 --> 00:06:56.332
And then this line of work was continued by several people, most notably Atick and van Hateren.

00:06:57.276 --> 00:07:04.876
And what we did recently is just try to take this work to its logical conclusion

00:07:04.876 --> 00:07:11.216
and derive a normative theory in the most direct and transparent way.

00:07:11.756 --> 00:07:18.156
And in that way, we could compare the predictions to the actual measurements

00:07:18.156 --> 00:07:25.796
of spatio-temporal receptive fields that were done in cat LGN and insect visual system.

00:07:25.796 --> 00:07:30.436
Okay, so what would now be the key parameters of this model?

00:07:30.936 --> 00:07:35.636
So if I would like to take the model as a filter to look again at the brain

00:07:35.636 --> 00:07:39.036
itself, what are the key parameters I should be sensitive to?

00:07:39.776 --> 00:07:49.016
Right, so the model actually has no free parameters, in the sense that once you

00:07:49.016 --> 00:07:51.396
define a natural stimulus ensemble,

00:07:51.396 --> 00:07:54.616
so that's the statistics of the

00:07:54.616 --> 00:07:57.616
input, and you specify

00:07:57.616 --> 00:08:00.636
the signal-to-noise ratio, then there

00:08:00.636 --> 00:08:03.896
is a unique prediction for the filter shape, and that

00:08:03.896 --> 00:08:10.076
can be compared with the electrophysiologically measured receptive fields

00:08:10.076 --> 00:08:16.536
by, say, a reverse correlation method. But now, in some sense, if you take something

00:08:16.536 --> 00:08:20.936
like the signal-to-noise ratio of these neurons, this might not be a constant necessarily, right?

00:08:20.996 --> 00:08:25.296
This could vary depending on, let's say, the presence of neuromodulators or not.

00:08:25.396 --> 00:08:30.476
So how well would it generalize beyond this fixing of these kinds of parameters?

00:08:31.592 --> 00:08:37.472
Right. I was actually referring to the signal-to-noise ratio in the input that

00:08:37.472 --> 00:08:43.072
would have to do with the absolute intensity level.

00:08:44.152 --> 00:08:47.272
But there is also internal noise, of course,

00:08:47.452 --> 00:08:57.352
and the impact of that noise depends on the specific circuit implementation of the filter,

00:08:57.432 --> 00:09:04.612
which we have one proposal for that I discussed earlier this morning,

00:09:04.732 --> 00:09:06.912
which is based on the lattice filter idea.

00:09:07.272 --> 00:09:11.972
Before we go to the lattice filter, let's look at the simpler case first.

00:09:12.192 --> 00:09:18.972
Because in some sense, the key signature that you see as confirming the physiology

00:09:18.972 --> 00:09:23.992
is biphasic response, which essentially means I would have, let's say,

00:09:23.992 --> 00:09:25.452
some onset-driven response,

00:09:25.752 --> 00:09:31.592
and then I would have, let's say, an orthogonal or an opposite response in turn,

00:09:31.652 --> 00:09:35.012
like I might have, let's say, a depolarizing response to something,

00:09:35.112 --> 00:09:36.712
and then I'm hyperpolarizing, right?

00:09:37.452 --> 00:09:42.392
This would be a quick characterization of, let's say, the simplest form of such a filter.

00:09:43.192 --> 00:09:46.852
And so what I'm curious about is how

00:09:46.852 --> 00:09:49.632
specific can you make these responses, right? So

00:09:49.632 --> 00:09:53.852
also, the question I posed to you in the morning: one way

00:09:53.852 --> 00:09:56.892
to look at the negative side of

00:09:56.892 --> 00:10:01.992
the response, the negative tail, could be to say, like, well, afterhyperpolarization

00:10:01.992 --> 00:10:05.812
is a standard feature of all neurons, whether they're encoding anything or not.

00:10:05.812 --> 00:10:10.912
So this is just a non-specific component. And now you're saying, no, no, it's a

00:10:10.912 --> 00:10:14.432
specific component because that's exactly what I need for my predictive filter.

00:10:14.712 --> 00:10:18.632
So how could we make this experimentally testable? Or do you think the data

00:10:18.632 --> 00:10:22.352
is already out there to allow us to make a decision on this?

00:10:23.320 --> 00:10:29.780
I think the answer is partially yes. And the reason, I mean,

00:10:29.820 --> 00:10:34.640
of course, there is a physiological mechanism like afterhyperpolarization that

00:10:34.640 --> 00:10:36.420
has to underlie this response.

00:10:36.660 --> 00:10:44.060
But I think the evidence to say that this response is there to implement predictive

00:10:44.060 --> 00:10:50.640
filtering comes from the change in the filter shape in response to the change

00:10:50.640 --> 00:10:51.700
in the stimulus statistics.

00:10:51.700 --> 00:10:56.160
So, for example, at high contrast, at high signal-to-noise ratio,

00:10:56.500 --> 00:11:02.140
you get very strong biphasic response with a strong and sharp negative component.

00:11:02.580 --> 00:11:07.080
When you go to low contrast, to low signal-to-noise ratio,

00:11:07.420 --> 00:11:14.420
the filter changes and it carries most of the weight in the first peak,

00:11:14.620 --> 00:11:21.840
which gets wider, and the negative rebound actually gets much weaker.

00:11:22.140 --> 00:11:27.740
And that would be consistent with the filter doing more of low-pass filtering

00:11:27.740 --> 00:11:30.580
rather than high-pass filtering.

00:11:31.560 --> 00:11:36.580
Such change would be expected from the predictive coding framework,

00:11:36.820 --> 00:11:40.580
but would require a different physiological implementation.

00:11:41.380 --> 00:11:48.460
And so the fact that the neuron response follows this prediction suggests that

00:11:48.460 --> 00:11:51.920
the predictive coding framework has value.

00:11:52.540 --> 00:11:55.400
Yeah, but that would also mean that for the predictive coding framework,

00:11:56.120 --> 00:12:01.140
you should see a stimulus-specific modulation of both the positive and the negative

00:12:01.140 --> 00:12:02.820
phase, right? That's correct.

00:12:04.260 --> 00:12:08.100
And is there sufficient evidence for that? If we go to just,

00:12:08.180 --> 00:12:13.180
let's say, standard physiology of the visual system, and we start to look at,

00:12:13.240 --> 00:12:16.440
let's say, the encoding of more or less complex scenes as an example,

00:12:16.640 --> 00:12:18.780
do you see examples of this?

00:12:21.300 --> 00:12:27.800
Only very few. As I mentioned, there is a change in contrast that people looked

00:12:27.800 --> 00:12:33.720
at, and lately they've been adding more and more noise to the images and looking at the response.

00:12:34.060 --> 00:12:44.240
But I think that this line of work actually should become a bigger project right now,

00:12:44.240 --> 00:12:50.640
and we're trying to make connection with experimentalists to actually test this

00:12:50.640 --> 00:12:57.260
idea more exhaustively by playing with the different statistical ensembles of

00:12:57.260 --> 00:13:00.580
natural scenes and seeing if the filter would change accordingly.

00:13:00.860 --> 00:13:04.360
So it is in some sense work in progress. Right, okay, understood.

00:13:04.840 --> 00:13:12.780
Now my other question is also, so before we go to the complex version of the

00:13:12.780 --> 00:13:14.780
model, which is this lattice filter system.

00:13:17.036 --> 00:13:20.596
Something funny happens in the argument about adaptive filters, right?

00:13:20.676 --> 00:13:25.776
Because, as we also discussed earlier, originally the intuition is like, well,

00:13:25.896 --> 00:13:30.696
adaptive filters were actually a near optimal way or an optimal way,

00:13:30.776 --> 00:13:36.156
if you want, to show how you can transduce information through some channel, okay?

00:13:36.696 --> 00:13:42.116
And this is also developed as a technique, given the limitations and the possibilities

00:13:42.116 --> 00:13:45.696
of the engineering that we do. And there, indeed, bandwidth is always an issue.

00:13:47.296 --> 00:13:55.116
But now for brains themselves, for the brain, maybe bandwidth is not an issue in the same way.

00:13:55.216 --> 00:13:59.836
Like, for instance, you could argue that actually the whole principle of the

00:13:59.836 --> 00:14:04.696
brain is to do massive I/O, with all these connections that you have available

00:14:04.696 --> 00:14:08.496
to you with actually minimal local computation,

00:14:08.796 --> 00:14:10.516
because biophysically that's more complex.

00:14:10.676 --> 00:14:14.536
So the whole design might actually be exactly orthogonal to what the engineers

00:14:14.536 --> 00:14:17.136
were thinking of when they designed these adaptive filters.

00:14:17.696 --> 00:14:22.516
So maybe then it's the wrong metaphor to apply to this system.

00:14:22.896 --> 00:14:28.016
Yeah, it's actually a very astute observation, Paul, because when we started

00:14:28.016 --> 00:14:34.556
this work, our main motivation was a so-called communication bottleneck,

00:14:34.556 --> 00:14:37.836
as I think Barlow referred to it, where he said, well,

00:14:38.016 --> 00:14:44.996
you know, there are many more photoreceptors in the vertebrate retina than there are ganglion cells.

00:14:45.376 --> 00:14:50.156
And so there is a bottleneck for transmitting information to the rest of the

00:14:50.156 --> 00:14:54.776
brain, and therefore there must be compression, which could be done by redundancy reduction.

00:14:55.236 --> 00:14:58.556
And that's the philosophy that we started with.

00:14:59.136 --> 00:15:06.076
But I think as the work progressed, especially after looking at other systems, such as, for...

00:15:10.546 --> 00:15:18.326
...as explicit as it is in the vertebrate pathway, we started questioning this

00:15:18.326 --> 00:15:22.046
assumption of the need for compression.

00:15:22.486 --> 00:15:27.826
And at the same time, what started to become clear is that there could be computational

00:15:27.826 --> 00:15:36.326
advantages to implementing this predictive coding framework and decorrelating the incoming signal,

00:15:36.506 --> 00:15:41.346
especially when it's done in a stage-wise fashion like it's done in the lattice filter.

00:15:42.066 --> 00:15:47.346
Because when I looked up in signal processing textbooks,

00:15:47.726 --> 00:15:53.566
it turns out that lattice filters, in addition to performing the decorrelation

00:15:53.566 --> 00:16:01.306
of the original signal stream, they also are used for feature construction,

00:16:01.706 --> 00:16:09.086
because if you output signal from each stage of the lattice filter,

00:16:09.306 --> 00:16:11.586
you get a set of orthogonal features,

00:16:11.906 --> 00:16:21.306
which are very convenient to be used as a set for predicting or training or

00:16:21.306 --> 00:16:24.906
learning a correct response to another input,

00:16:24.906 --> 00:16:29.106
which presumably is what the brain is trying to achieve in associative

00:16:29.106 --> 00:16:30.226
learning or something like that.

00:16:31.626 --> 00:16:36.866
So now we move to this lattice filter. What makes the lattice filter interesting

00:16:36.866 --> 00:16:39.686
and how is it different from the filter we just talked about?

00:16:40.506 --> 00:16:49.666
Right, so lattice filter is a specific circuit implementation of a predictive coding filter,

00:16:49.966 --> 00:16:57.886
and the defining characteristic is that decorrelation is done in stages,

00:16:58.762 --> 00:17:05.822
where each consecutive stage decorrelates the signal on a different time scale,

00:17:06.882 --> 00:17:10.702
which seems to correspond to electrophysiological

00:17:10.702 --> 00:17:15.202
observations of receptive fields in the retina and LGN.

00:17:15.422 --> 00:17:20.602
Specifically, it has been measured that the temporal receptive fields in LGN

00:17:20.602 --> 00:17:23.422
are longer than the ones in the retina.

00:17:23.422 --> 00:17:26.202
And that's why it

00:17:26.202 --> 00:17:29.122
seems that the lattice filter may be a good

00:17:29.122 --> 00:17:32.382
model for the system. So the key

00:17:32.382 --> 00:17:35.482
thing is, a lattice filter is hierarchical; at every

00:17:35.482 --> 00:17:39.882
stage it performs the same operation, roughly, which essentially is to just decorrelate

00:17:39.882 --> 00:17:43.322
the image. Is that the reasonable way to interpret it? That's right, on a different

00:17:43.322 --> 00:17:48.362
time scale. Okay, so that means you decorrelate at varying time scales as

00:17:48.362 --> 00:17:51.902
you go through the filter, as you go through this cascade of filters, essentially.

00:17:51.902 --> 00:17:54.102
That's right. But with the local operations, it would be the same.

00:17:54.582 --> 00:17:59.442
Similar. In the simplest model, they're exactly the same, but the lattice filter

00:17:59.442 --> 00:18:03.602
can be modified to have slightly different operations.

00:18:04.182 --> 00:18:10.582
So what does the lattice filter now solve that your previous linear filter did not solve?

00:18:12.122 --> 00:18:20.482
So I think a better way to put it is that the general linear filter is a mathematical

00:18:20.482 --> 00:18:28.042
concept that performs optimal prediction and therefore whitens the incoming stimulus.

00:18:29.122 --> 00:18:33.362
The lattice filter is a specific circuit implementation of that filter.

00:18:34.022 --> 00:18:39.962
And in particular, it's the one which allows you to do decorrelation in stages

00:18:39.962 --> 00:18:48.782
by using biologically realistic elements such as neurons that have relatively short time constants,

00:18:48.782 --> 00:18:55.302
but giving you the ability to decorrelate the signal over a longer

00:18:55.302 --> 00:18:59.782
time scale due to the cascade hierarchical structure that you mentioned. Okay.

00:18:59.962 --> 00:19:05.122
But now, so if it's getting close to implementation, it also means that you

00:19:05.122 --> 00:19:08.942
must be able to make more specific predictions about how a biological system

00:19:08.942 --> 00:19:10.902
could implement such a filter.

00:19:11.042 --> 00:19:14.082
So what would be these specific predictions that would come out of that?

00:19:14.502 --> 00:19:19.882
That's exactly right. That's the reason to go for this specific circuit implementation,

00:19:20.462 --> 00:19:26.582
and I think the strength of the lattice filter model is that you can map the

00:19:26.582 --> 00:19:32.702
specific units in the circuit to the specific neurons in the brain.

00:19:33.362 --> 00:19:39.562
And in particular, then, we could predict the responses of the retinal neurons

00:19:39.562 --> 00:19:46.002
versus the LGN neurons, according to what the lattice filter tells you, but also we

00:19:46.960 --> 00:19:54.180
can predict that there should be at least two different types of responses in LGN, for example,

00:19:54.380 --> 00:20:02.080
which correspond to the so-called forward and backward prediction error filters in the lattice filter.

00:20:02.400 --> 00:20:07.640
And those responses actually have been discovered electrophysiologically,

00:20:07.880 --> 00:20:13.920
and they could be identified with the classes of cells that are called lagged

00:20:13.920 --> 00:20:16.580
and non-lagged cells in the LGN.

00:20:16.700 --> 00:20:21.760
And the lagged cells were discovered by Mastronardi more than 20 years ago,

00:20:22.560 --> 00:20:29.140
and they have a distinct property that, although they also have a biphasic response,

00:20:29.420 --> 00:20:34.440
but the second phase is greater in amplitude than the first,

00:20:34.660 --> 00:20:36.680
which is rather unusual.

00:20:36.920 --> 00:20:40.220
Is that still a negative phase, or that can be a positive phase, or it doesn't matter?

00:20:40.500 --> 00:20:44.560
Right. So the important thing is that the two phases have different signs.

00:20:44.560 --> 00:20:48.640
It doesn't matter whether the first one is plus and the second is minus,

00:20:48.700 --> 00:20:50.780
or the first is minus and the second is plus.

00:20:52.080 --> 00:20:57.820
In this way, the classification into lagged and non-lagged cells is separate

00:20:57.820 --> 00:21:00.820
from the classifications of cells into on and off.

00:21:01.740 --> 00:21:07.280
So both lagged and non-lagged cells can come in on and off varieties.

00:21:08.060 --> 00:21:10.600
What's the duration of this lagged response?

00:21:15.020 --> 00:21:23.920
So what lagged cells do is in response to a step stimulus, they respond with

00:21:23.920 --> 00:21:28.240
an initial delay which could be a few tens of milliseconds and that's why they're

00:21:28.240 --> 00:21:29.760
called lagged. Right, okay.

00:21:29.920 --> 00:21:36.120
But then how would your filter cascade account for that kind of lag given that

00:21:36.120 --> 00:21:41.240
you would only have, let's say, three or four synaptic steps to get there, right?

00:21:41.280 --> 00:21:45.480
If we go from the photoreceptor to our ganglion cells and then to your lagged cells,

00:21:46.200 --> 00:21:49.420
actually would, well, let's say two to three synaptic steps.

00:21:50.480 --> 00:21:55.040
So how would you now relate that to this cascade of filters,

00:21:55.160 --> 00:21:58.320
which are positive and your negative prediction components?

00:22:00.086 --> 00:22:04.566
So, it is a little bit hard to explain without showing any diagrams,

00:22:04.806 --> 00:22:09.826
but I can say... But for that, we recommend people to look at the video lecture. Exactly.

00:22:10.826 --> 00:22:18.206
But the basic idea is that the photoreceptors at the very front of the cascade

00:22:18.206 --> 00:22:20.346
perform low-pass filtering,

00:22:20.686 --> 00:22:26.306
thus introducing the delay of various frequency components.

00:22:26.306 --> 00:22:34.166
And then each stage of the pathway invokes a so-called all-pass filter,

00:22:34.486 --> 00:22:42.286
which transmits all the frequency components equally, but introduces differential

00:22:42.286 --> 00:22:47.466
phase delays depending on the frequency, which could be thought of as delays,

00:22:47.866 --> 00:22:49.586
as just pure delays.

00:22:49.586 --> 00:22:57.506
And as those delays accumulate from stage to stage, they lead to this electrophysiologically

00:22:59.486 --> 00:23:01.246
notable lagged responses.

00:23:01.726 --> 00:23:05.646
But it would mean in your case, the prediction would be that somewhere in this

00:23:05.646 --> 00:23:09.926
network of ganglion cells or so, this buildup of delays would then happen.

00:23:10.406 --> 00:23:14.246
Is that the logical consequence? That's right.

00:23:14.426 --> 00:23:23.506
I think those interactions could happen already on the bipolar cell level.

00:23:23.626 --> 00:23:32.646
For example, it is known that although off-bipolar cells have fast response

00:23:32.646 --> 00:23:40.766
and they use ionotropic ion channels,

00:23:41.866 --> 00:23:50.086
the on-bipolar cells have a delayed response because they use metabotropic ion channels.

00:23:50.446 --> 00:23:55.166
And that delayed response, I think, is about 20 ms or so.

00:23:55.366 --> 00:24:00.186
So that could be the original source of the delay.

00:24:00.466 --> 00:24:07.146
But then there is further processing, of course, in the bipolar to ganglion

00:24:07.146 --> 00:24:13.446
cell synapses and in the amacrine cell network interacting with bipolar cells.

00:24:13.926 --> 00:24:20.166
So what I liked also in this model that you presented is that you actually purposefully

00:24:20.166 --> 00:24:23.666
want to apply it both to, let's say, vertebrates and invertebrate systems, right?

00:24:23.686 --> 00:24:30.266
Because you do believe, or wish at least, that the same principles will hold.

00:24:30.426 --> 00:24:31.606
Right. This is correct.

00:24:32.966 --> 00:24:39.506
So as an example, you talked about a specific system in the fly brain, right?

00:24:39.566 --> 00:24:44.106
So how well did the model map onto that system? How well did that work out?

00:24:44.946 --> 00:24:50.346
Yeah, I think what you said is really important, that, you know, a really good,

00:24:51.858 --> 00:24:56.898
powerful and correct theory has to apply across species, has to be general enough.

00:24:57.358 --> 00:25:03.698
And so we're particularly pleased that this theory works for invertebrates as well, like in flies.

00:25:03.958 --> 00:25:13.958
And in particular, in flies, the photoreceptors synapse on the cells called

00:25:14.438 --> 00:25:25.318
large monopolar cells, which have two biggest classes that are called L1 and L2,

00:25:26.258 --> 00:25:35.298
and they're very similar in their initial response and the part of their anatomical

00:25:35.298 --> 00:25:40.358
features, which led us to think about the dual pathway communication,

00:25:40.918 --> 00:25:44.378
which is a hallmark of the lattice filter itself.

00:25:44.818 --> 00:25:50.478
And that's how we got to the idea of the lattice filter, thinking about L1 and

00:25:50.478 --> 00:25:55.838
L2 as being those two pathways, the forward and the backward prediction error filters.

00:25:56.058 --> 00:26:01.318
So what evidence did you find that they could indeed exchange information in

00:26:01.318 --> 00:26:02.978
a way that would be consistent with that model?

00:26:04.638 --> 00:26:15.198
So that evidence is still somewhat sketchy because it is very hard to do electrophysiology in flies.

00:26:15.518 --> 00:26:20.338
And what electrophysiology has been done was based mostly on recording from

00:26:20.338 --> 00:26:24.958
cell bodies, although there was some in the axons.

00:26:26.158 --> 00:26:32.678
But those measurements initially did not show difference in response properties between L1 and L2.

00:26:32.838 --> 00:26:36.918
However, more recent measurements of the calcium dynamics

00:26:37.318 --> 00:26:44.078
in the axon terminals of L1, L2, which is the output of those cells,

00:26:44.258 --> 00:26:48.078
showed a different response between L1 and L2.

00:26:48.278 --> 00:26:53.298
And in fact, the kind of response that has been reported by the Clandinin lab

00:26:53.298 --> 00:27:05.058
shows features indicative of the forward and backward pathways of the lattice filter. Right.

00:27:06.538 --> 00:27:09.798
So, now you're in a very unique position, right?

00:27:09.858 --> 00:27:15.418
Because you work at Janelia Farm, and you also have been very much involved

00:27:15.418 --> 00:27:20.138
in a very detailed reconstruction of the brain of these flies, right?

00:27:21.798 --> 00:27:25.818
Now, the data set that you have for playing there, and I hope that you can explain

00:27:25.818 --> 00:27:29.738
to me a little bit what you guys have been doing there, this is also giving

00:27:29.738 --> 00:27:33.338
you now a grounding again to look at this more theoretical model, right?

00:27:33.418 --> 00:27:38.838
So what's the data you have in your hands now on that fly brain at the anatomical

00:27:38.838 --> 00:27:42.598
level that would help you understand this kind of filter model?

00:27:43.658 --> 00:27:49.718
So what you're referring to is another direction in my group,

00:27:49.778 --> 00:27:57.238
which is a high-throughput reconstruction of the connectome or the wiring diagram

00:27:57.238 --> 00:27:59.138
of the brain on the synapse level.

00:27:59.658 --> 00:28:08.438
And we've been doing that in the fly visual system, and this project is now bearing fruit.

00:28:08.438 --> 00:28:16.658
And in particular, we were able to assemble the visual pathway in fly through

00:28:16.658 --> 00:28:19.198
the first two neuropils, lamina and medulla.

00:28:19.398 --> 00:28:30.858
And medulla was done for the first time. And what we have now is a kind of idealized processing column,

00:28:31.038 --> 00:28:37.878
which repeats itself in the visual system in parallel.

00:28:38.238 --> 00:28:45.758
And that column contains about 50 neurons, and we have attempted to map out

00:28:45.758 --> 00:28:49.118
all the synaptic connections between those 50 neurons.

00:28:49.438 --> 00:28:51.298
How many synaptic connections do they have?

00:28:51.978 --> 00:28:55.418
So it's of order of 10,000. Okay.

00:28:56.298 --> 00:28:58.938
Is that including the gap junctions, or is that only the synaptic connections?

00:28:59.338 --> 00:29:05.658
The current imaging techniques that we're using based on electron microscopy

00:29:05.658 --> 00:29:11.718
don't allow us to see gap junctions clearly in our data set.

00:29:11.858 --> 00:29:15.618
So what we're reporting is just the chemical synapses. Right, okay.

00:29:15.778 --> 00:29:23.058
So now the wiring diagram that you have extracted from that column, how does

00:29:23.058 --> 00:29:26.498
it map onto this model of an adaptive filter?

00:29:27.553 --> 00:29:31.453
So parts of that wiring diagram are consistent with the lattice filter,

00:29:31.753 --> 00:29:37.273
but what we are also seeing is that the circuit is more complex,

00:29:37.453 --> 00:29:42.693
and in particular, it seems that it isn't just focused on the decorrelation,

00:29:42.813 --> 00:29:47.713
as one would expect from the compression and redundancy reduction point of view,

00:29:47.813 --> 00:29:51.413
but also on the feature extraction that we mentioned previously,

00:29:52.433 --> 00:29:57.413
using, of course, the correlation, but on feature extraction that can be used

00:29:57.413 --> 00:30:05.333
to build other features for more specific purposes, such as,

00:30:05.393 --> 00:30:06.993
for example, motion detection.

00:30:07.433 --> 00:30:11.533
But how would I see that at an anatomical level? So I have my photoreceptor

00:30:11.533 --> 00:30:18.273
projections, and I have my lobula, and I have my lamina and medulla.

00:30:18.273 --> 00:30:23.673
So what are the specific wiring templates, if you want, that you can now extract from this?

00:30:23.793 --> 00:30:28.093
Okay, this wiring template is more the predictive filter, and that wiring template

00:30:28.093 --> 00:30:29.793
is, let's say, feature extraction.

00:30:31.213 --> 00:30:42.233
Yeah, so, of course, just from anatomy, it might be hard to know that conclusively.

00:30:42.233 --> 00:30:45.353
So what we

00:30:45.353 --> 00:30:48.153
know already is that, you know, L1 and

00:30:48.153 --> 00:30:51.533
L2 are both postsynaptic

00:30:51.533 --> 00:30:54.253
to photoreceptors, which is what you would

00:30:54.253 --> 00:30:58.193
expect in the first stage of the lattice filter. We know that they interact by

00:30:58.193 --> 00:31:02.473
means of gap junctions, which have been reported by the Borst lab, so they could

00:31:02.473 --> 00:31:06.813
potentially be the two pathways, the forward and backward pathways, of the lattice

00:31:06.813 --> 00:31:13.013
filter, and their output, as measured by calcium imaging, seems to support that interpretation.

00:31:13.753 --> 00:31:17.453
What happens afterwards in the medulla is

00:31:19.091 --> 00:31:26.211
rather unclear and still a work in progress, but it is already known

00:31:26.211 --> 00:31:31.391
that the L1 pathway and the L2 pathway are involved in motion detection.

00:31:31.911 --> 00:31:35.231
And so what we are doing,

00:31:35.251 --> 00:31:42.751
we're tracing through the cells postsynaptic to L1 and L2 to see if they

00:31:42.751 --> 00:31:50.811
would be consistent with the interpretation of feature construction or decorrelation.

00:31:51.051 --> 00:31:58.071
Okay, but now if you have 50 neurons in a column, you have about 10,000 connections,

00:31:58.071 --> 00:32:01.931
so I could argue that in that setup

00:32:01.931 --> 00:32:05.251
you can find basically any type of connection pattern you would like.

00:32:05.491 --> 00:32:10.691
So in some sense the question is more about which obvious connection patterns are absent.

00:32:11.091 --> 00:32:16.811
So which pattern of connectivity is absent that would support your hypothesis?

00:32:18.351 --> 00:32:21.891
So, let me say it this way.

00:32:22.051 --> 00:32:28.251
So, it is true that there is a zoo of connections, and we have to orient ourselves in it.

00:32:28.891 --> 00:32:35.471
But there are a few things that help us. The first one is that each connection

00:32:35.471 --> 00:32:40.651
has multiple synapses in parallel.

00:32:40.651 --> 00:32:46.571
So if neurons A and B have a synaptic connection, there are usually tens or sometimes

00:32:46.571 --> 00:32:49.591
more than a hundred synaptic contacts in parallel.

00:32:49.991 --> 00:32:55.831
And so we can order connections in terms of their strength by using the number

00:32:55.831 --> 00:32:58.111
of contacts as a proxy for connection weight.

00:32:58.291 --> 00:33:01.471
And of course, initially we focus on the strongest connections.

00:33:01.791 --> 00:33:06.311
So in some sense, we first look at the scaffolding of that network.

00:33:07.191 --> 00:33:10.771
That's the first thing that we use.

00:33:10.911 --> 00:33:18.631
The second is how that circuit is divided in between columns.

00:33:19.551 --> 00:33:24.571
So the fly visual system, if you think about looking at a fly eye,

00:33:25.271 --> 00:33:32.891
starts with about 800 so-called ommatidia, which correspond to basically pixels

00:33:32.891 --> 00:33:37.751
of the image that the fly sees.

00:33:38.151 --> 00:33:44.131
And then each of those pixels is initially processed independently from the other.

00:33:44.891 --> 00:33:49.931
And that forms the basis of the so-called column that has about 50 neurons.

00:33:50.071 --> 00:33:50.991
That's the one I described.

00:33:51.391 --> 00:33:54.771
And the processing structure is rather periodic.

00:33:55.051 --> 00:33:59.771
So it's almost a crystalline structure of 800 units. Okay.

00:34:00.211 --> 00:34:04.891
And we focused initially on the connection within the unit.

00:34:06.434 --> 00:34:13.894
But if the structure were to perform motion detection, it has to correlate signals

00:34:13.894 --> 00:34:16.334
from adjacent pixels, at least.

00:34:17.354 --> 00:34:20.094
And therefore, there must be connections between the columns.

00:34:20.654 --> 00:34:26.074
And so we know then that the connections that are necessary for motion detection

00:34:26.074 --> 00:34:28.114
should span multiple columns.

00:34:28.114 --> 00:34:37.354
And that's how we can determine which connections would have to be involved in motion detection.

00:34:37.634 --> 00:34:41.114
So if we did not see any connections between columns at this stage,

00:34:41.354 --> 00:34:45.774
we would be very surprised because then the system couldn't do motion detection.

00:34:46.074 --> 00:34:52.094
Fortunately, we found such connections, and they are a natural substrate for motion detection.

00:34:52.094 --> 00:34:57.474
Did you find any evidence for the famous Reichardt detector that sort of tries

00:34:57.474 --> 00:35:01.614
to extract motion by correlating different input signals?

00:35:01.894 --> 00:35:04.954
So I think this is a million-dollar question, of course,

00:35:05.014 --> 00:35:10.854
and I think philosophically the Reichardt detector is correct in the sense

00:35:10.854 --> 00:35:18.434
that the system is comparing a signal from one pixel with a delayed signal from an adjacent pixel.

00:35:18.434 --> 00:35:27.354
But we seem to favor a slightly different form of such comparison,

00:35:27.594 --> 00:35:31.474
which actually does not involve multiplication,

00:35:31.954 --> 00:35:38.774
but contains another non-linearity that is necessary for forming a motion signal.

00:35:38.914 --> 00:35:42.114
Which is what, the threshold? Rectification. Okay.

00:35:43.274 --> 00:35:48.334
So this is pretty amazing, right? So in some sense you guys have it all now because

00:35:48.334 --> 00:35:54.114
you have access to an exquisite data set of this brain. You have a theory.

00:35:55.190 --> 00:35:59.390
Now you're trying to match. But in some sense, I could say, but maybe you're

00:35:59.390 --> 00:36:00.970
barking up the wrong tree, right?

00:36:01.030 --> 00:36:07.090
Because maybe the fly brain or any brain did not evolve to optimize signal transduction.

00:36:07.150 --> 00:36:09.190
It just got optimized to generate behavior.

00:36:09.890 --> 00:36:15.730
And all you're telling me now is how in this complex set of connections in the

00:36:15.730 --> 00:36:17.990
fly brain, I can optimize a signal.

00:36:18.390 --> 00:36:21.870
But at some point, this fly just has to go left or right or up and down, or

00:36:21.870 --> 00:36:25.850
land or whatever, or, you know, pursue the sugar.

00:36:26.950 --> 00:36:32.790
So where does this mapping take place? How do I get behavior out of this and

00:36:32.790 --> 00:36:34.870
also functionally relevant responses?

00:36:36.410 --> 00:36:41.450
Right. So this is, of course, the holy grail of neuroscience. How do you get from

00:36:41.450 --> 00:36:42.610
sensory inputs to behavior?

00:36:44.130 --> 00:36:48.190
And we think that we're moving in the right direction.

00:36:48.610 --> 00:36:53.230
And there are two arguments that I can make.

00:36:53.350 --> 00:36:58.230
Well, the first one is the idea behind those redundancy reduction and predictive

00:36:58.230 --> 00:37:04.730
coding approaches is to come up with some theoretical framework which is not

00:37:04.730 --> 00:37:11.030
based on the specific task that the animal has to perform, right?

00:37:11.110 --> 00:37:15.190
So you say, well, I want to communicate information to the rest of the brain

00:37:15.190 --> 00:37:18.930
domain as fully and as quickly as possible.

00:37:19.430 --> 00:37:26.410
And then whatever task needs to be done will be possible to do if we preserve all the information.

00:37:27.090 --> 00:37:31.510
So initially, of course, this requirement was viewed as the strength of a theory,

00:37:31.590 --> 00:37:36.190
because you could come up with predictions that were task-independent,

00:37:36.270 --> 00:37:40.130
which I think is completely appropriate for the front end of the visual system.

00:37:40.310 --> 00:37:45.030
As we are moving further into the visual system, of course, this is not good

00:37:45.030 --> 00:37:49.570
anymore and we have to come up with task-specific computations.

00:37:49.810 --> 00:37:52.810
And I think motion detection is the first step in that direction.

00:37:53.990 --> 00:38:05.190
The second part of my answer is, you know, I would put it in the following way,

00:38:05.350 --> 00:38:14.390
which I learned from talking with Sydney Brenner, who spearheaded the reconstruction of the C.

00:38:14.490 --> 00:38:17.650
elegans connectome about 30 years ago.

00:38:18.290 --> 00:38:24.810
The explanation is the following. Whenever you come up with a model, there is no way to prove

00:38:26.491 --> 00:38:31.631
fully and rigorously that the model is correct. You can only disprove models

00:38:31.631 --> 00:38:37.511
by saying that they don't fit experimental facts, and if the model is not disproven,

00:38:37.551 --> 00:38:38.531
it's still in the running.

00:38:40.071 --> 00:38:45.811
But you always get questions, you know, how do you know that your model is the right one?

00:38:47.631 --> 00:38:54.751
How come there couldn't be some other wire that sends information directly from

00:38:54.751 --> 00:38:59.211
the photoreceptors to the avoidance response neurons in flies?

00:39:00.471 --> 00:39:03.891
And that's exactly why we are doing a complete connectome reconstruction.

00:39:04.391 --> 00:39:07.931
Because once we reconstruct all the connections in the fly brain,

00:39:08.111 --> 00:39:11.651
we can answer that there is no other wire.

00:39:13.271 --> 00:39:19.151
And so I think that if we combine the theoretical approaches with electrophysiology

00:39:19.151 --> 00:39:22.871
and with behavioral tests with the connectome,

00:39:22.871 --> 00:39:26.271
that's when we can say, well, no, we

00:39:26.271 --> 00:39:30.111
are not barking up the wrong tree. This is

00:39:30.111 --> 00:39:32.891
a necessary condition: the model would have to

00:39:32.891 --> 00:39:35.811
use this particular pathway. But now,

00:39:35.811 --> 00:39:39.631
there's one aspect that I'm missing, because you could

00:39:39.631 --> 00:39:42.731
also, certainly if you look at the insect case, we know

00:39:42.731 --> 00:39:46.071
that after the medulla we start

00:39:46.071 --> 00:39:49.831
to hit these wide-field neurons that we know physiologically show

00:39:49.831 --> 00:39:52.951
responses that are highly adapted to the behavior of

00:39:52.951 --> 00:39:56.191
the animal. You might have specific optic flow patterns, you

00:39:56.191 --> 00:40:02.191
might have approaching obstacles, like time-to-contact type responses. So in that

00:40:02.191 --> 00:40:06.991
sense you are, in your model and in your reconstruction, just one

00:40:06.991 --> 00:40:12.051
synapse away from that response. So isn't another test of your model to show

00:40:12.051 --> 00:40:16.051
that, with one single synaptic step,

00:40:16.251 --> 00:40:21.291
you can then generate these kinds of established and also behaviorally relevant

00:40:21.291 --> 00:40:24.271
physiological responses of your wide-field neurons.

00:40:24.971 --> 00:40:27.291
So how would that work in your model?

00:40:28.922 --> 00:40:36.062
Yeah, so I think that I cannot give you a complete answer now because we haven't

00:40:36.062 --> 00:40:37.542
done that part of the work,

00:40:37.722 --> 00:40:41.982
but this is something that people have thought about in the motion detection

00:40:41.982 --> 00:40:46.842
context because once you have elementary motion detectors that detect motion

00:40:46.842 --> 00:40:50.022
in a local part of the visual field,

00:40:50.882 --> 00:40:58.202
then the inputs of those local detectors can be combined to produce a global

00:40:58.202 --> 00:41:03.722
motion response in a given neuron.

00:41:04.562 --> 00:41:10.822
And one example of those neurons is the neurons that are supposed

00:41:10.822 --> 00:41:16.022
to reflect rotations around different axes,

00:41:16.982 --> 00:41:26.142
in the flight of a fly, which require a very particular map of the motion response direction,

00:41:26.542 --> 00:41:35.842
and they can be, of course, built by using motion detectors in different parts

00:41:35.842 --> 00:41:39.542
of the visual field with different directional selectivity.

00:41:39.882 --> 00:41:46.242
Right. But then, in your case, to extract those features, you would have to

00:41:46.242 --> 00:41:49.082
tap into the filter cascade at multiple levels.

00:41:50.420 --> 00:41:53.860
You cannot just read it out at a single level, if I understood it right.

00:41:54.640 --> 00:42:05.680
It depends on how complicated a temporal sequence you need to predict the response.

00:42:06.040 --> 00:42:12.640
I think for the motion detection, you need to just compare two points in time,

00:42:13.780 --> 00:42:18.960
at least if I take the correlation-based framework of Reichardt and Hassenstein seriously.

00:42:20.420 --> 00:42:24.800
So I think that you should be able to do that relatively simply.

00:42:24.960 --> 00:42:29.600
But of course, for more complex predictions, then you would have to tap into

00:42:29.600 --> 00:42:31.340
the lattice filter on many stages.

00:42:31.500 --> 00:42:36.640
Right. Okay. So that might be another testable prediction that would come from this framework.

00:42:37.000 --> 00:42:42.360
That's correct. Okay. So that means if in your reconstruction of this fly brain,

00:42:42.640 --> 00:42:48.160
you do not find a very wide multi-scale readout from these wide-field neurons

00:42:48.160 --> 00:42:52.740
in this whole column or this set of columns,

00:42:53.480 --> 00:42:56.500
then the cascade filter might not be the way the problem is solved.

00:42:56.940 --> 00:43:02.760
That is true, of course. But I have to say that our observation is that,

00:43:03.680 --> 00:43:06.980
unlike engineering applications of lattice filters, for example,

00:43:06.980 --> 00:43:12.940
in speech processing, where it's not uncommon to have a hundred or more stages of the lattice filter.

00:43:13.520 --> 00:43:16.900
We think in the brain, there aren't as many stages.

00:43:17.820 --> 00:43:25.780
And just having a few stages can accomplish a lot because the processing in

00:43:25.780 --> 00:43:29.740
the brain is applied not on a single channel level,

00:43:29.800 --> 00:43:39.580
because those columns actually interact with each other the further you go into the system.

00:43:40.120 --> 00:43:44.320
Then the filter can accomplish a lot, actually, with just a few stages.

00:43:45.200 --> 00:43:48.580
Okay, but that's the kind of compression of such a filter that the engineers

00:43:48.580 --> 00:43:49.580
haven't really tried yet.

00:43:50.663 --> 00:43:57.603
Well, I cannot say that they haven't tried, but it's certainly not just a textbook version of a filter.

00:43:58.643 --> 00:44:02.863
I don't know, it may exist in the literature on some level.

00:44:02.863 --> 00:44:11.543
I think one thing that it seems that the brain is using all the time that is

00:44:11.543 --> 00:44:15.163
rare in engineering is the use of nonlinearities.

00:44:16.663 --> 00:44:22.563
And I think that's one thing we can learn from brains. But I guess engineers

00:44:22.563 --> 00:44:24.243
have good reasons not to use them.

00:44:25.023 --> 00:44:31.183
Because it complicates their lives. Exactly. It's difficult to analyze and understand. Exactly. Right.

00:44:31.523 --> 00:44:34.003
And nature doesn't have these kinds of scruples.

00:44:34.983 --> 00:44:41.943
So we also touched upon this whole issue of, let's say, optimal encoding frameworks

00:44:41.943 --> 00:44:46.623
versus finite capacity models, right?

00:44:46.663 --> 00:44:49.603
So the adaptive filter is more finite capacity, because you start to squeeze

00:44:49.603 --> 00:44:53.623
as much information as you can through a channel that's this bottleneck.

00:44:54.603 --> 00:44:58.763
But in optimal coding frameworks, you're not too worried

00:44:58.763 --> 00:45:03.163
about that problem, right? You just worry about how do I sort of compress my information

00:45:03.163 --> 00:45:06.323
in an optimal way, in sort of information-theoretical terms.

00:45:07.023 --> 00:45:12.183
So do you see these as contradictory approaches, or do you see them as complementary?

00:45:12.183 --> 00:45:17.463
Do you see this column of 50 cells in the fly brain maybe doing both, or is

00:45:17.463 --> 00:45:20.923
it really more of an exclusive choice that we have to make here?

00:45:22.755 --> 00:45:28.715
I'm not sure I got the exact... Well, in the engineering literature,

00:45:29.175 --> 00:45:32.555
these would be seen as different approaches, right?

00:45:32.635 --> 00:45:36.615
It's possibly also contradictory, whether you deal with optimal coding or with

00:45:36.615 --> 00:45:38.175
finite capacity channels.

00:45:39.595 --> 00:45:48.535
Oh, yeah. Yeah, so I think that at this point, the experimental evidence is

00:45:48.535 --> 00:45:53.575
just maybe barely sufficient to make such a fine distinction.

00:45:55.335 --> 00:46:00.835
But, you know, we're investigating that currently to see if

00:46:00.835 --> 00:46:04.315
we need additional experiments to make that distinction clearly. Right.

00:46:05.075 --> 00:46:07.855
So the other thing is, if we now go back to the vertebrate case,

00:46:08.035 --> 00:46:12.615
we have the retina, now you have the LGN, you give us a model of how we can think about this

00:46:12.635 --> 00:46:16.575
as, let's say, optimally decorrelating these inputs.

00:46:17.195 --> 00:46:21.295
And now we hit the cortex. And then you could argue, well, now the cortex has

00:46:21.295 --> 00:46:23.235
this perfectly massaged signal.

00:46:23.955 --> 00:46:27.975
So that means from a signal processing perspective, the game for cortex should

00:46:27.975 --> 00:46:29.895
become a different game, right?

00:46:29.955 --> 00:46:32.535
Because now it's optimally decorrelated. It's a perfect signal.

00:46:33.075 --> 00:46:37.835
So from the perspective of an adaptive filter, your cortex, so this higher level

00:46:37.835 --> 00:46:39.955
processing story, will be playing a different game.

00:46:40.495 --> 00:46:43.435
What would that game be if you would have to guess today?

00:46:44.855 --> 00:46:49.055
Well, I guess I would have to speculate at this point.

00:46:49.055 --> 00:46:59.035
But I think the way I would put it is that if the goal of predictive coding

00:46:59.035 --> 00:47:04.895
is to de-correlate the signal as much as possible,

00:47:05.055 --> 00:47:11.975
then the stages in the retina and the LGN take you as far as possible with linear filters.

00:47:14.115 --> 00:47:17.895
Yes, there are non-linearities in neurons along the way, but there is also evidence

00:47:17.895 --> 00:47:23.095
they can conspire in a way to generate a linear response.

00:47:24.575 --> 00:47:31.115
Now, when you get to the cortex, the responses of the cells are decidedly non-linear,

00:47:31.235 --> 00:47:35.355
as for example exhibited by the complex cells in V1.

00:47:35.355 --> 00:47:43.275
And so what I think is happening is that the cortex may be continuing the job

00:47:43.275 --> 00:47:48.655
of decorrelating the input stimulus and the feature construction,

00:47:49.055 --> 00:47:53.535
which the lattice filter stages were doing before,

00:47:53.535 --> 00:48:01.935
but invoking additional non-linearities to make even better predictions and

00:48:01.935 --> 00:48:07.375
make the outgoing signals even more independent than is possible with linear filters.

00:48:07.715 --> 00:48:14.255
Okay. Or possibly actually start to correlate again, to group features together in meaningful ways.

00:48:14.595 --> 00:48:20.755
To predict function. Yeah. Predict behavioral output, yes. Yeah.

00:48:22.195 --> 00:48:27.415
Okay, that's great. So we made progress here in understanding sensory systems.

00:48:28.155 --> 00:48:34.135
But now, Dmitri, to finish up, we always have two questions,

00:48:34.255 --> 00:48:36.255
right? So you come from theoretical physics.

00:48:36.895 --> 00:48:42.195
You have been working very hard on, let's say, the anatomy of these brains,

00:48:42.295 --> 00:48:43.775
so you know really how hard that is.

00:48:44.275 --> 00:48:48.035
And then you try to combine that now with also theoretical work,

00:48:48.195 --> 00:48:49.975
which I think is actually the only way forward, right?

00:48:49.995 --> 00:48:52.835
Look at all this detail within anatomy.

00:48:53.535 --> 00:48:57.355
So on the basis of experience, what would be Dmitri's law that we should all

00:48:57.355 --> 00:49:02.095
follow in our aims to understand the brain and behavior?

00:49:09.955 --> 00:49:11.195
That's a tough question.

00:49:18.035 --> 00:49:31.275
So, I come from a tradition in theoretical physics which favored starting with a simple

00:49:31.815 --> 00:49:37.635
and intuitive model of even the most complex phenomena that one could be studying.

00:49:37.635 --> 00:49:46.955
And the reason for doing this is, I think, because our brains are better suited

00:49:46.955 --> 00:49:49.975
to analyzing simple models.

00:49:50.355 --> 00:49:55.755
And by simple, I mean models involving very few relevant parameters.

00:49:56.895 --> 00:50:03.675
And if you build such a model, it kind of prevents you from overfitting experimental

00:50:03.675 --> 00:50:08.115
data. And also, it makes all the assumptions very transparent.

00:50:08.955 --> 00:50:12.115
And I think that my

00:50:15.366 --> 00:50:22.386
style of work in theoretical neuroscience is strongly based on that tradition

00:50:22.386 --> 00:50:23.726
in theoretical physics,

00:50:23.986 --> 00:50:33.746
where the simplicity and the clarity of the model is the main driving force.

00:50:33.746 --> 00:50:41.646
And so, even when studying such complex things like the brain and how it computes,

00:50:42.026 --> 00:50:50.326
I would like to start with situations that can be broken down to a very simple level involving

00:50:51.186 --> 00:50:57.086
very few variables and few or no adjustable parameters.

00:50:57.086 --> 00:51:01.826
And that's why we focused on the sensory system.

00:51:02.246 --> 00:51:09.586
And only once we get a foothold there, I think then we can move on further.

00:51:09.766 --> 00:51:17.226
But again, chipping away a small and digestible part of the problem that we

00:51:17.226 --> 00:51:21.686
can solve in a clear and intuitive way, and then move on.

00:51:22.146 --> 00:51:25.406
Okay, so Dmitri's Law is keep it simple. Pretty much.

00:51:25.566 --> 00:51:29.146
Okay, very good. Good. And the second one, so if I'm going to get you back here

00:51:29.146 --> 00:51:33.946
five years from now, since I'm such an unpleasant person, I would like to remind

00:51:33.946 --> 00:51:35.566
you of the predictions you have been making.

00:51:36.726 --> 00:51:41.726
So what's the one prediction that you feel most strongly about today?

00:51:41.826 --> 00:51:45.846
That I should remind you of five years from now to ask you, did that really come out?

00:51:47.526 --> 00:51:50.866
What's that one prediction I should remind you of five years from now?

00:51:55.408 --> 00:52:05.568
So I think that our strongest predictions are the shapes of receptive fields of the LGN neurons,

00:52:05.908 --> 00:52:16.428
and although some of them have been seen already, I think that there are more

00:52:16.428 --> 00:52:20.408
details there that the lattice filter model contains,

00:52:20.768 --> 00:52:26.688
but haven't been completely verified experimentally, and in particular on the

00:52:26.688 --> 00:52:33.308
circuit level, I think we have a model for computation,

00:52:33.648 --> 00:52:39.408
but not for how it is implemented in terms of individual neuron and synaptic properties.

00:52:40.688 --> 00:52:44.768
For example, in the LGN, there is a structure called the triadic synapse, and so on.

00:52:44.768 --> 00:52:53.648
How that structure builds up the receptive field that the lattice filter predicts is not clear.

00:52:53.868 --> 00:52:55.748
But the prediction would be

00:52:55.748 --> 00:53:01.268
that it is exactly the computation that is proposed by the lattice filter,

00:53:02.288 --> 00:53:09.968
which is an all-pass filter, which has a frequency-dependent phase delay.

00:53:10.248 --> 00:53:14.608
Right. Very good. Good. So, Dmitri Chklovskii, thank you very much for this conversation.

00:53:14.948 --> 00:53:18.188
Thank you very much, Paul, for having me here. Great.

00:53:20.328 --> 00:53:26.008
The CSN podcast was produced by the Convergent Science Network of Biomimetics

00:53:26.008 --> 00:53:32.388
and Biohybrid Systems, a project funded by the European 7th Research Framework Program.

00:53:32.880 --> 00:54:00.698
Music.