WEBVTT

00:00:03.597 --> 00:00:10.517
This is the Convergent Science Network podcast. Leading researchers in the domain

00:00:10.517 --> 00:00:16.797
of neuroscience, brain theory and technology are interviewed by Paul Verschure and Tony Prescott.

00:00:20.037 --> 00:00:23.377
This is Paul Verschure with the Convergent Science Network podcast.

00:00:24.357 --> 00:00:29.257
And in this episode, I'm speaking with Giovanni Pezzulo, who is a speaker at

00:00:29.257 --> 00:00:33.697
our summer school, the Barcelona Cognition, Brain and Technology summer school.

00:00:34.197 --> 00:00:42.217
And Giovanni, you started out with investigating this whole issue of the predictive brain.

00:00:42.697 --> 00:00:45.437
Yes. So what does the predictive brain mean for you, exactly?

00:00:46.537 --> 00:00:51.157
Well, in a sense, the predictive brain is just a big idea.

00:00:51.317 --> 00:00:54.917
Well, the idea is that the brain is not a passive organ.

00:00:54.917 --> 00:00:58.277
So rather than just sitting down and waiting for the next stimulus,

00:00:58.457 --> 00:01:02.717
in a sense, it really always tries to anticipate what comes next,

00:01:02.837 --> 00:01:07.477
to set up some internal goals, and so to have some big internal processing.

00:01:07.637 --> 00:01:12.397
So rather than being stimulus-based, it's really trying to anticipate the necessities

00:01:12.397 --> 00:01:14.997
or maybe the opportunities for action.

00:01:14.997 --> 00:01:19.357
So that's really the big principle that drives all the brain processing because

00:01:19.357 --> 00:01:23.897
now, because the brain has some predictions, some goals, it can really organize

00:01:23.897 --> 00:01:26.317
its sensory processing, its attention processes,

00:01:26.477 --> 00:01:33.417
and its motor preparation processes for quickly responding and doing some adaptive actions.

00:01:33.617 --> 00:01:36.837
So to do that, it has to be predictive and proactive. Mm-hmm.

00:01:37.902 --> 00:01:42.202
But then where does that really start? What are the good examples of this kind of prediction?

00:01:43.322 --> 00:01:49.762
Well, in the literature, there are many, many partially disconnected literatures on prediction.

00:01:50.002 --> 00:01:53.702
So predictions in sensory processing, predictions in the motor domain,

00:01:54.002 --> 00:01:56.842
predictions also in higher cognition, in that, for instance,

00:01:56.942 --> 00:02:00.142
I can try to anticipate what your next question will be.

00:02:00.242 --> 00:02:05.362
So all these literatures have until now proceeded a little bit disconnected.

00:02:05.362 --> 00:02:11.382
But for instance, in the sensory domain, you see a lot of predictive stuff going

00:02:11.382 --> 00:02:14.442
on, such as the anticipation of the next stimulus.

00:02:15.122 --> 00:02:21.002
In the motor control domain, you really have to predict what are the consequences

00:02:21.002 --> 00:02:23.862
of your action for many reasons.

00:02:23.862 --> 00:02:28.202
One big reason in the motor domain is that sensory perceptions are very ambiguous,

00:02:28.402 --> 00:02:32.802
so by also predicting what comes next, you better estimate the state of the world.

00:02:33.402 --> 00:02:38.222
Another big reason is for decision-making. So if you can really anticipate what

00:02:38.222 --> 00:02:39.522
the effects of your action are,

00:02:39.742 --> 00:02:43.162
then you can also select among different courses of action beforehand.

00:02:43.162 --> 00:02:47.362
Such as, for instance, if I anticipate that by going left,

00:02:47.602 --> 00:02:53.582
I will get some big reward, and by going right, I will get some big punishment,

00:02:53.682 --> 00:02:55.162
then that's the decision.

00:02:55.922 --> 00:03:00.862
But now, in some sense, let's say also the physiological study of learning that

00:03:00.862 --> 00:03:03.722
started with Pavlov, one of

00:03:03.722 --> 00:03:07.442
the first observations of Pavlov was that the brain is predicting, right?

00:03:07.502 --> 00:03:11.862
So that's also a key feature of classical conditioning. So in some sense,

00:03:11.982 --> 00:03:15.122
the concept of prediction has been with us for a long time.

00:03:15.242 --> 00:03:19.782
And as some speakers have said, like sometimes certain things get so old that

00:03:19.782 --> 00:03:20.842
they sound new again, right?

00:03:20.922 --> 00:03:27.862
So what is actually really new in this current movement of the brain as a predictor?

00:03:28.722 --> 00:03:32.902
Well, actually, I completely agree that also in classical conditioning,

00:03:33.002 --> 00:03:35.542
you have these predictive dynamics going on.

00:03:35.702 --> 00:03:41.582
But I think that we should try to really distinguish at least two kinds of predictions.

00:03:43.097 --> 00:03:45.737
It's not a sharp distinction, but it's probably a useful one to make.

00:03:45.917 --> 00:03:50.097
So one is an implicit kind of prediction in that, for instance,

00:03:50.257 --> 00:03:51.937
in the conditioning studies.

00:03:52.217 --> 00:03:57.377
So a stimulus, which is typically highly predictive of another stimulus,

00:03:57.597 --> 00:03:59.717
which is in turn highly predictive of reward.

00:04:00.157 --> 00:04:03.057
So the first stimulus, which is predictive of another stimulus,

00:04:03.257 --> 00:04:06.037
it becomes itself good to achieve.

00:04:06.037 --> 00:04:10.157
So in that case, the brain becomes implicitly

00:04:10.157 --> 00:04:13.477
predictive, in that it stores the relationships between

00:04:13.477 --> 00:04:16.737
the first and the second stimulus. But to do

00:04:16.737 --> 00:04:19.737
that, you don't really need to maintain in

00:04:19.737 --> 00:04:22.737
your memory an internal representation of the

00:04:22.737 --> 00:04:29.637
predictive relation between the two; you just attach some good label to

00:04:29.637 --> 00:04:34.837
the first stimulus. Whereas there is a second kind of prediction, which is a more

00:04:34.837 --> 00:04:40.437
explicit prediction, in which you keep in your brain some model of the environment.

00:04:40.877 --> 00:04:46.197
So the point is that in the first case you really have some implicit mechanisms

00:04:46.197 --> 00:04:51.177
that link good or bad to sequences of things, whereas in the second mechanism

00:04:51.177 --> 00:04:54.797
you also maintain a model of these sequences of things.

00:04:54.937 --> 00:05:02.457
For instance, in the motor domain, let's say you have to grasp a moving ball.

00:05:02.917 --> 00:05:07.337
There are two different ways to do that. So one way is to look at the moving

00:05:07.337 --> 00:05:10.457
ball and then jump to some future state.

00:05:11.544 --> 00:05:15.844
To some future state that the ball will reach at some point.

00:05:16.404 --> 00:05:21.304
This can be done in two ways. The first way is just learning the contingencies automatically.

00:05:21.804 --> 00:05:29.104
So I know that if I see the ball moving, then I have to go to some position,

00:05:29.424 --> 00:05:31.864
which is not the current position, but the future position.

00:05:32.604 --> 00:05:36.504
Without even anticipating that position explicitly, you can do that automatically.

00:05:36.504 --> 00:05:43.864
The second way to do that is really using what is called a forward model,

00:05:44.044 --> 00:05:47.704
so a predictive model that really tells you where the ball will be in the future.

00:05:48.024 --> 00:05:52.464
Use this forward model online, so this predictive mechanism online,

00:05:52.744 --> 00:05:56.704
and then move the hand to the predicted position.

00:05:58.364 --> 00:06:03.004
So to recapitulate this idea, there are implicit and explicit mechanisms,

00:06:03.124 --> 00:06:06.984
and the novelty of the predictive brain hypothesis, if you want,

00:06:07.104 --> 00:06:10.544
is the hypothesis that the brain systematically incorporates

00:06:12.024 --> 00:06:17.824
regularities of the external environment in a structured way so as to form strong

00:06:17.824 --> 00:06:22.124
models, statistical models, for instance, of the regularities of the external environment.

00:06:22.544 --> 00:06:28.324
And then it systematically uses these models to plan what to do next and also

00:06:28.324 --> 00:06:31.384
to drive perception, to drive attention, and so on.

00:06:31.384 --> 00:06:34.724
So, whereas in the first formulation of prediction,

00:06:35.024 --> 00:06:39.364
it was more an implicit prediction, in the predictive brain hypothesis,

00:06:39.584 --> 00:06:44.124
at least in some of its formulations, it's much more about building models of

00:06:44.124 --> 00:06:48.564
the world, on top of which you can run explicit predictions. Right, so then…

00:06:49.507 --> 00:06:53.127
If you would look at, let's say, the theoretical literature on the predictive

00:06:53.127 --> 00:06:58.947
brain, what in your mind are right now the outstanding examples of this approach?

00:06:59.567 --> 00:07:03.307
Well, there are many of them.

00:07:03.567 --> 00:07:11.987
So probably a very famous paper by Rao and Ballard, it was already in

00:07:11.987 --> 00:07:16.267
'99, if I remember well, it was about this predictive coding idea,

00:07:16.507 --> 00:07:18.287
predictive coding idea in perception.

00:07:18.287 --> 00:07:22.687
So the idea is that the brain continuously generates predictions about the stimulus,

00:07:22.907 --> 00:07:29.967
and it uses these predictions in a top-down manner, whereas it uses prediction errors,

00:07:30.247 --> 00:07:33.607
so the difference between the prediction and the sensory evidence,

00:07:33.867 --> 00:07:36.227
as a revision mechanism.

00:07:36.227 --> 00:07:39.747
So that was a milestone in a sense.

00:07:39.887 --> 00:07:44.207
But, well, there are many other papers now. There is a very big framework put

00:07:44.207 --> 00:07:48.787
forward by Karl Friston, which is called the Free Energy Framework or the

00:07:48.787 --> 00:07:50.047
Active Inference Framework.

00:07:50.127 --> 00:07:53.347
In that sense, Friston really tries to look at the whole brain,

00:07:53.447 --> 00:07:57.187
not just the perceptual system, as a big prediction machine.

00:07:59.087 --> 00:08:03.927
There are many other frameworks. One is put forward by Moshe Bar,

00:08:04.047 --> 00:08:05.387
again, in the perceptual domain,

00:08:05.627 --> 00:08:09.987
whereas in the motor domain, the leading view, one of the most authoritative

00:08:09.987 --> 00:08:15.847
views, put forward by people such as Danny Wolpert or Shadmehr or many other

00:08:15.847 --> 00:08:19.227
people, is that in reality,

00:08:19.407 --> 00:08:23.347
what you really predict is the sensory consequences of your actions.

00:08:24.447 --> 00:08:27.907
So that framework is more tied to motor predictions.

00:08:27.907 --> 00:08:34.027
Other people, such as Marc Jeannerod or others, they have tried to expand this

00:08:34.027 --> 00:08:39.167
framework from the prediction of the sensory consequences of actions to more

00:08:39.167 --> 00:08:40.707
complicated forms of cognition,

00:08:40.947 --> 00:08:45.127
such as reusing this prediction for understanding the actions of others,

00:08:45.367 --> 00:08:48.187
such as reusing this prediction for imagery,

00:08:48.467 --> 00:08:50.967
and so thinking about more abstract situations.

00:08:50.967 --> 00:08:56.787
So, I would say that there are many fields in which prediction has been studied,

00:08:56.947 --> 00:09:00.687
and also some converging frameworks.

00:09:01.027 --> 00:09:08.327
Right. So now, if we talk about your own experiments on the role of prediction in problem solving,

00:09:09.987 --> 00:09:13.587
you showed some examples of what you called embodied problem solving. Yes.

00:09:13.887 --> 00:09:20.447
Right? So how does that, which aspects of the predictive brain are these experiments exercising?

00:09:23.667 --> 00:09:28.367
Well, it's not so simple to explain the videos that I've shown,

00:09:28.547 --> 00:09:32.467
but just to recap, it's just a video of a climber.

00:09:32.587 --> 00:09:38.867
And maybe you know, but climbers prior to a competition, they have some time

00:09:38.867 --> 00:09:42.367
to look at the climbing wall that they will then climb during the competition.

00:09:43.167 --> 00:09:47.347
The nice point is that they see this climbing wall for the first time, and

00:09:47.347 --> 00:09:48.587
in these climbing walls,

00:09:48.707 --> 00:09:52.967
there are many climbing holds arranged all over the wall. And the

00:09:52.967 --> 00:09:58.047
climber has one minute or a few minutes for figuring out how best to climb

00:09:58.047 --> 00:10:04.747
this wall. And if you look at the video, or if you look at a climbing competition, you see

00:10:04.747 --> 00:10:11.007
climbers really moving the arms so as to really anticipate the next moves.

00:10:11.167 --> 00:10:13.807
You see them moving the arms in one direction, then the other direction.

00:10:13.947 --> 00:10:18.907
That's really a form of problem solving because they see the climbing wall for the first time.

00:10:18.947 --> 00:10:21.407
They have to figure out a good plan for action.

00:10:21.607 --> 00:10:25.007
I call it problem solving, not only planning, because it's really complex.

00:10:25.107 --> 00:10:27.147
There are many constraints. It depends on where you go.

00:10:27.767 --> 00:10:32.327
Then you can reach or cannot reach the other hold. So you can also find in your

00:10:32.327 --> 00:10:34.667
mind many, many solutions, compare them.

00:10:35.007 --> 00:10:43.887
And in this specific kind of setup, the hypothesis is that it is really the

00:10:43.887 --> 00:10:47.527
motor system that is governing this process.

00:10:48.327 --> 00:10:55.767
So expert climbers are really able to anticipate the force they have to put into this motor act.

00:10:55.907 --> 00:10:58.767
They can really anticipate a lot of proprioceptive information.

00:10:58.767 --> 00:11:04.027
They can really figure out if a hold is in their reach or out of reach,

00:11:04.167 --> 00:11:08.587
if this hold is too small to be really grasped, if it is too far away for them,

00:11:08.647 --> 00:11:09.927
then they have to take another route.

00:11:10.167 --> 00:11:13.947
So that's, I call it embodied problem solving for many reasons.

00:11:14.027 --> 00:11:18.227
So one reason is that you really use knowledge of your body and of your motor

00:11:18.227 --> 00:11:21.227
system to figure out how to better solve this problem.

00:11:21.227 --> 00:11:24.487
And another reason is that it's also

00:11:24.487 --> 00:11:27.187
overtly embodied, in that you really use your

00:11:27.187 --> 00:11:31.767
body, your body movements, as a scaffold, as a helping tool for solving

00:11:31.767 --> 00:11:35.907
the problem. That's a bit tricky to explain without the video, but in a sense,

00:11:35.907 --> 00:11:40.427
by simply moving yourself, you don't need to keep everything in your

00:11:40.427 --> 00:11:44.787
memory; you use your body as part of the problem-solving process.

00:11:45.747 --> 00:11:50.467
Okay, but now you could also see it as a form of, let's say, motor programming.

00:11:50.587 --> 00:11:55.707
That you say, look, I have to execute a movement sequence very rapidly. Okay.

00:11:56.115 --> 00:12:00.035
And I'm just going to now rehearse this movement sequence so that I can sort

00:12:00.035 --> 00:12:06.515
of automatically trigger one set of behavioral motor patterns after the

00:12:06.515 --> 00:12:08.675
other, and I can climb up the wall more rapidly.

00:12:09.975 --> 00:12:13.915
Yeah, that's part of the story. So the point is

00:12:13.915 --> 00:12:15.235
forming this motor program.

00:12:15.395 --> 00:12:21.335
The reason why I call it embodied problem solving is that the solution is not so trivial.

00:12:21.335 --> 00:12:26.875
So the point is that you have to try to form this motor program by assembling

00:12:26.875 --> 00:12:33.955
partial skills that you used in the past in novel ways.

00:12:34.075 --> 00:12:39.935
So there is a lot of flexibility in this new assemblage of motor or small motor programs.

00:12:40.535 --> 00:12:43.955
Yeah, but the problem solving would suggest that there is also,

00:12:44.035 --> 00:12:47.895
let's say, a goal state in the world that you want to achieve, and for that

00:12:47.895 --> 00:12:49.975
you also have to manipulate aspects of that world.

00:12:50.435 --> 00:12:55.575
Yeah. Well, in this case, it's essentially making sure your body goes through

00:12:55.575 --> 00:12:56.875
a certain sequence of motions.

00:12:57.815 --> 00:13:03.935
So how would your view on embodied problem solving then generalize to problem solving in general?

00:13:04.095 --> 00:13:07.835
Because I would assume that you want to identify some more, let's say,

00:13:07.855 --> 00:13:12.715
generic aspects of problem solving as opposed to specialized aspects of problem solving.

00:13:13.155 --> 00:13:17.775
Yeah, okay. So there are two parts of this story. So one part is that even in

00:13:17.775 --> 00:13:23.675
that case, even in this climbing example, you have a goal. So the goal is reaching the top.

00:13:24.175 --> 00:13:30.675
And then it is not simply running through a trajectory, but also finding out

00:13:30.675 --> 00:13:31.855
which is the good trajectory.

00:13:32.195 --> 00:13:38.735
So it's not so different from the Tower of London or other typical problem-solving setups.

00:13:38.735 --> 00:13:43.635
So you have a goal state, and you have many, many sequences of moves that you

00:13:43.635 --> 00:13:47.495
can do, and you explore this space of possibilities.

00:13:47.835 --> 00:13:53.235
But the trick is that you explore it in intelligent ways, so you don't simply try out all of them.

00:13:53.455 --> 00:13:57.655
You really use your expertise to find out how to better explore this.

00:13:58.015 --> 00:14:02.595
The second point is that you really use knowledge incorporated into your body,

00:14:02.615 --> 00:14:06.255
your motor programs, to figure out what are the constraints of the problem.

00:14:06.981 --> 00:14:10.381
The climbing problem has many constraints because, as I said before,

00:14:10.541 --> 00:14:16.141
if you are too far to the left of the climbing wall, you can no longer reach the holds on the right.

00:14:16.381 --> 00:14:24.581
Or vice versa, if you jump too high, you probably cannot make your body stiff

00:14:24.581 --> 00:14:27.361
enough to really hold the climbing hold.

00:14:27.461 --> 00:14:33.181
There are many constraints that you really discover while solving the problem

00:14:33.181 --> 00:14:36.021
by reenacting this motor knowledge. That's my point.

00:14:36.981 --> 00:14:41.261
And the second part of the answer was about how much this generalizes to more

00:14:41.261 --> 00:14:43.001
complex and abstract problem solving.

00:14:43.321 --> 00:14:47.821
So, well, first of all, this is just a nice video that I showed just for illustrating

00:14:47.821 --> 00:14:53.641
the possibility of solving problems by reusing motor knowledge or sensory motor knowledge.

00:14:53.801 --> 00:14:57.641
It could also be affective knowledge. So knowledge incorporated into the same

00:14:57.641 --> 00:15:00.501
models that you use for acting in the external world.

00:15:00.841 --> 00:15:06.201
So I think that as a general strategy, that's the way we should look at higher cognition.

00:15:06.981 --> 00:15:10.881
The challenge is looking at how you achieve higher cognitive skills,

00:15:10.901 --> 00:15:16.441
such as problem solving in abstract domains, by reusing these strategies that

00:15:16.441 --> 00:15:21.581
you acquired so as to efficiently deal with the external environment.

00:15:21.581 --> 00:15:28.461
So you have these sensory-motor strategies, and probably our ancient

00:15:28.461 --> 00:15:32.161
evolutionary ancestors, they

00:15:32.161 --> 00:15:36.321
had only these simple strategies to deal with their current situations.

00:15:37.101 --> 00:15:42.701
Whereas higher cognitive skills are mostly about very complex abstract situations,

00:15:43.121 --> 00:15:45.641
non-perceptually available events, distal goals.

00:15:45.641 --> 00:15:52.681
But the challenge is seeing how simple strategies that we used in the past to

00:15:52.681 --> 00:15:58.481
survive in the environment can be made more sophisticated and reused in these more complex cognitive domains.

00:15:58.781 --> 00:16:03.401
So that's a challenge. My example was just to illustrate the possibility that

00:16:03.401 --> 00:16:07.481
you can really solve problems by reusing systematically knowledge incorporated

00:16:07.481 --> 00:16:10.341
into your internal models in an intelligent way.

00:16:10.581 --> 00:16:14.561
And that allows also for a lot of flexibility, not only for planning

00:16:14.561 --> 00:16:17.661
the next action, but also for solving very complex problems.

00:16:18.501 --> 00:16:22.001
That's a bit the wish, right? That's the wish. Yes. Then the question is,

00:16:22.101 --> 00:16:25.861
how far are you in realizing that wish?

00:16:26.201 --> 00:16:32.541
So first in the climber case, do you see a big difference between expert and novice climbers?

00:16:33.241 --> 00:16:38.801
Yeah, absolutely. Yes. So part of the story, which is also one of the reasons

00:16:38.801 --> 00:16:44.441
why I'm interested in it, is that, in a sense, it is the expertise that not

00:16:44.441 --> 00:16:46.061
only modulates the climbing ability,

00:16:46.401 --> 00:16:48.701
but also the ability to think about problems.

00:16:49.940 --> 00:16:54.140
That's support for the embodied cognition view, for the view that it's really

00:16:54.140 --> 00:16:59.180
knowledge incorporated into your motor system that helps you not only in executing the actions,

00:16:59.380 --> 00:17:06.100
but also for instance in imagining the actions, in reusing the repertoire for solving problems.

00:17:06.460 --> 00:17:11.600
At the same time, we have evidence that knowledge incorporated in your motor

00:17:11.600 --> 00:17:16.060
system also helps you understand what other people are doing.

00:17:16.640 --> 00:17:19.420
Right. So that's the idea in a sense.

00:17:19.820 --> 00:17:22.920
Yeah, that's the step into the social cognition component.

00:17:23.060 --> 00:17:26.740
But in some sense what you're saying, look, the standard view would be,

00:17:26.820 --> 00:17:32.040
let's say we perceive the world, we have in the end some sort of high-fidelity

00:17:32.040 --> 00:17:35.400
interpretation of the world, on the basis of which we make decisions and perform actions.

00:17:35.620 --> 00:17:38.680
But now the consequence of what you're saying is, look, I'm

00:17:38.680 --> 00:17:41.300
actually acting in this world, and the way I act in the

00:17:41.300 --> 00:17:45.180
world is now modulating or directly filtering the

00:17:45.180 --> 00:17:48.340
way I'm going to perceive this world. Absolutely. So the processing

00:17:48.340 --> 00:17:51.280
goes backwards in that sense. So the expert

00:17:51.280 --> 00:17:56.140
climber is looking at this world very differently from a novice climber. Yes. And

00:17:56.140 --> 00:18:02.020
you have evidence for that? Well, yes, we have. Of course,

00:18:02.020 --> 00:18:06.720
we have only a little evidence up to now, but I think this evidence points in the

00:18:06.720 --> 00:18:09.800
right direction. So there is one memory study that we performed,

00:18:10.080 --> 00:18:12.900
comparing novice and expert climbers.

00:18:13.780 --> 00:18:20.740
So we simply asked these novice and expert climbers to look at three climbing routes,

00:18:21.020 --> 00:18:25.080
one simple, that both of them were able to execute, one difficult,

00:18:25.320 --> 00:18:28.920
that only the expert group was able to do, and one impossible,

00:18:29.020 --> 00:18:31.120
which was not climbable, actually.

00:18:31.280 --> 00:18:37.040
It was not really a climbing route. It was just a random placement of climbing holds.

00:18:37.967 --> 00:18:43.707
And so the task was simply remembering the climbing holds in the right sequence.

00:18:44.067 --> 00:18:49.967
What we found is that in the easy condition, which both novices and experts were

00:18:49.967 --> 00:18:55.407
able to climb, both were also able to remember quite well.

00:18:55.567 --> 00:19:01.047
So there was no evidence of better memory for experts.

00:19:01.207 --> 00:19:07.387
While for the difficult route, only the experts were really able to remember it well.

00:19:07.387 --> 00:19:12.507
And this, in our opinion, points to the fact that really the way experts

00:19:12.507 --> 00:19:18.867
structure the situation, also perceptually, can help in this memory task.

00:19:19.807 --> 00:19:24.967
And then we had this control condition, the impossible route;

00:19:24.967 --> 00:19:30.107
in that route, we did not find any advantage for the experts, because this ability,

00:19:30.307 --> 00:19:34.627
this increased memory ability is really tied to the climbability of the wall.

00:19:34.627 --> 00:19:38.867
It's not a generic ability to remember climbing holds without context.

00:19:40.127 --> 00:19:47.767
But to respond completely to your question, I would say that this sensory motor approach,

00:19:48.027 --> 00:19:52.687
so this sensory-motor influence on thinking, on cognition, to me,

00:19:52.687 --> 00:19:56.387
at least, it acts on two timescales, at least two timescales.

00:19:56.527 --> 00:19:58.927
One timescale is learning and development.

00:19:59.789 --> 00:20:03.889
So the point is that because I have some sensory motor skills,

00:20:04.189 --> 00:20:06.769
then I change the way I perceive the world.

00:20:06.849 --> 00:20:12.129
So I structure my memory and my perceptual abilities in such a way that then

00:20:12.129 --> 00:20:14.129
supports better cognitive abilities.

00:20:14.429 --> 00:20:17.929
So that acts on a longer time scale of learning and development.

00:20:18.389 --> 00:20:23.669
The second way the sensory motor system supports cognition is more in online cognition,

00:20:23.669 --> 00:20:27.509
because I can reenact

00:20:27.609 --> 00:20:32.249
my motor programs right now, so I can use them to support,

00:20:32.409 --> 00:20:35.509
for instance, imagery or action perception.

00:20:35.729 --> 00:20:40.529
So these are two sides of the same coin. So in a sense, this framework predicts

00:20:40.529 --> 00:20:43.909
that if you increase your sensory motor abilities in one domain,

00:20:44.109 --> 00:20:48.889
then you really shape your perception, your memory, and all your cognitive processing.

00:20:48.889 --> 00:20:56.549
And then you are also better able to run imaginary experiments based on this

00:20:56.549 --> 00:20:58.989
motor expertise that you can now reenact.

00:20:59.249 --> 00:21:05.069
Right. But then would the expert climbers recognize the unclimbable wall more

00:21:05.069 --> 00:21:07.629
rapidly than the novices, novice climbers?

00:21:10.069 --> 00:21:15.549
Well, probably. Well, if the task is exactly to recognize whether it is

00:21:15.549 --> 00:21:18.329
climbable or unclimbable, that's the task.

00:21:18.329 --> 00:21:22.989
If this is the task, well, of course, this is a bit of a tricky task because

00:21:22.989 --> 00:21:27.569
unclimbable is not such a clear concept.

00:21:27.649 --> 00:21:32.989
But anyway, yes, I would say that in that case, what will happen is just that

00:21:32.989 --> 00:21:35.469
the climbers try to climb it mentally.

00:21:35.469 --> 00:21:40.709
So they run this imaginary climbing simulation, and then, because the experts

00:21:40.709 --> 00:21:45.209
will anticipate some failure in climbing, because they have high confidence

00:21:45.209 --> 00:21:48.949
in their internal models, they will say, no, no, no, this cannot be climbed.

00:21:49.269 --> 00:21:54.229
Whereas the novices, maybe they will try to anticipate.

00:21:54.369 --> 00:22:01.169
They will also fail, but without trusting their simulations too much.

00:22:01.269 --> 00:22:04.369
So that will be my... Right.

00:22:04.609 --> 00:22:10.329
But now, still a problem for this point of view is that you could say,

00:22:10.369 --> 00:22:13.929
look, the climber is a highly specialized human being.

00:22:14.929 --> 00:22:19.349
So it's not very surprising that they have certain cognitive capabilities because

00:22:19.349 --> 00:22:24.329
they have been overtrained in dealing with certain kinds of environments like climbing walls.

00:22:24.329 --> 00:22:27.389
So how would you see this sort of...

00:22:28.669 --> 00:22:31.829
But you want to look at this as an example where

00:22:31.829 --> 00:22:37.249
you say, well, it's actually very basic sensory-motor patterns, or it's really

00:22:37.249 --> 00:22:42.789
action itself and the way we generate action, from which the rest of cognition

00:22:42.789 --> 00:22:49.509
is sort of bootstrapped, or which it depends upon, is constrained by. So how is it

00:22:49.509 --> 00:22:51.229
going to work exactly? How do you see that work?

00:22:52.128 --> 00:22:57.168
Well, I would say that now we don't have any perfect story yet,

00:22:57.308 --> 00:23:02.028
but the point is really that action and the goals implied in the action and

00:23:02.028 --> 00:23:06.068
the predictions implied in the action, they are really key to cognition and

00:23:06.068 --> 00:23:08.668
to develop increasingly more complex cognitive skills.

00:23:09.448 --> 00:23:13.108
So at the beginning, so that could be a simple story.

00:23:13.188 --> 00:23:17.168
So at the beginning, the brain is simply organized around these goal selection

00:23:17.168 --> 00:23:19.248
and specification and selection tasks.

00:23:19.248 --> 00:23:22.428
So they have to integrate information from the external world,

00:23:22.508 --> 00:23:28.728
from the memory, from affective states, to very, very quickly jump into a good action selection.

00:23:29.248 --> 00:23:35.348
And to do that, as I said before, you're mostly relying on your internally generated

00:23:35.348 --> 00:23:40.168
goal state, affective processes, attentional processes.

00:23:40.348 --> 00:23:44.108
The stimulus helps you, but then it's the brain, which is autonomous, that is doing the job.

00:23:45.068 --> 00:23:50.908
But then as the sensory motor abilities of this primitive architecture develop

00:23:50.908 --> 00:23:54.788
over time, they increase their ability to control

00:23:54.788 --> 00:23:56.388
the external world and to predict it.

00:23:57.388 --> 00:24:04.108
So they start incorporating more and more knowledge, more and more also structure

00:24:04.108 --> 00:24:09.208
from the external environment into their predictions, into their motor control actions.

00:24:09.208 --> 00:24:12.848
So how are you going to validate this prediction?

00:24:13.068 --> 00:24:16.568
Is it a pure experimental exercise or are there other ways to get this validated?

00:24:17.641 --> 00:24:24.281
The framework, you mean? Well, the framework, that's much more a lifelong research

00:24:24.281 --> 00:24:29.561
program, if you want, because we are trying to do many, many validations.

00:24:29.621 --> 00:24:32.661
Some empirical data, of course, already exist.

00:24:32.761 --> 00:24:37.741
If you think of the literature on how similar the brain networks for imagery and

00:24:37.741 --> 00:24:41.241
for execution or for motor preparation are.

00:24:41.241 --> 00:24:46.721
So you already see that some mental operations are really supported by the sensory

00:24:46.721 --> 00:24:49.321
motor system in the brain. So that's some evidence.

00:24:50.221 --> 00:24:55.541
There is evidence, well, there are also patient studies that say that in some

00:24:55.541 --> 00:25:02.441
conditions, the imaginary action cannot be really stopped from being overtly executed.

00:25:02.441 --> 00:25:05.321
So that's another part of the story, because

00:25:05.321 --> 00:25:08.761
in this story, it is the internalization of

00:25:08.761 --> 00:25:11.981
the predictive mechanism that really supports cognition. So you

00:25:11.981 --> 00:25:15.941
first predict things in the external world, then you internalize this

00:25:15.941 --> 00:25:20.861
capability of predicting, and this eventually lets you rehearse entire sequences

00:25:20.861 --> 00:25:25.601
of action without executing them. But then a prediction of this framework is

00:25:25.601 --> 00:25:31.341
that if you really do this covert mental imagery, if you want, or this covert predictive mechanism,

00:25:31.581 --> 00:25:37.781
and you cannot really separate these internal processes from the overt execution,

00:25:38.181 --> 00:25:41.161
then this implies that what you imagine, you immediately do.

00:25:42.161 --> 00:25:48.781
And there is evidence that this unfortunately happens in patients with bilateral parietal lesions.

00:25:49.561 --> 00:25:54.481
And other evidence of this close connection between what you think and what you do

00:25:55.018 --> 00:25:58.298
also exists for utilization behavior. For instance, utilization behavior

00:25:58.298 --> 00:26:01.518
patients are not really able to inhibit their

00:26:01.518 --> 00:26:04.618
motor programs for grasping objects, and are just

00:26:04.618 --> 00:26:09.738
reacting in a very quick way to what they see. So this tells a story

00:26:09.738 --> 00:26:16.538
on how this more abstract thinking part is tied to the sensory motor

00:26:16.538 --> 00:26:21.378
programs. Okay, but now, what's the role

00:26:21.378 --> 00:26:23.798
of the internal simulation in that case?

00:26:23.938 --> 00:26:27.018
Because on the one hand, you're saying, well, we run these internal simulations.

00:26:27.138 --> 00:26:28.698
Apparently, they would run in parallel.

00:26:29.738 --> 00:26:33.858
But on the other hand, if you look at the deficits that you mentioned in these

00:26:33.858 --> 00:26:40.998
patients, it's not that they are executing a whole sequence of actions in parallel, right?

00:26:41.078 --> 00:26:45.638
In utilization behavior, you would grasp a single object and the whole organism

00:26:45.638 --> 00:26:46.578
would be focused on that.

00:26:46.578 --> 00:26:53.118
So therefore, is this kind of evidence coming more from the clinic,

00:26:53.178 --> 00:26:57.418
from clinical studies, really supporting this hypothesis of internal simulation

00:26:57.418 --> 00:26:59.198
and the role of action in the structure of cognition?

00:26:59.658 --> 00:27:05.138
Well, for the first evidence I mentioned of the patient who was unable to inhibit

00:27:05.138 --> 00:27:11.878
the imagined action, that's quite clear, because then what you think is what you do.

00:27:11.878 --> 00:27:16.338
So if you imagine, let's say, pointing to one direction, then you point in that

00:27:16.338 --> 00:27:18.298
direction, you cannot really stop from doing that.

00:27:18.518 --> 00:27:23.238
So in this framework that I'm supporting, the idea is that you run this imaginary

00:27:23.238 --> 00:27:24.638
simulation of pointing.

00:27:25.998 --> 00:27:30.018
This is typically done by also inhibiting the overt motor execution.

00:27:30.158 --> 00:27:33.038
But if you have a deficit, you cannot inhibit that output.

00:27:33.198 --> 00:27:36.558
It rapidly turns out into an overt action.

00:27:36.558 --> 00:27:41.518
In the utilization behavior, the link is tiny in a sense.

00:27:41.518 --> 00:27:48.398
The idea is that one idea in the literature of affordances, also the canonical

00:27:48.398 --> 00:27:52.118
neurons literature, is that when you look at an object,

00:27:52.338 --> 00:27:58.458
then you automatically, let's say, pre-potentiate some motor programs that are

00:27:58.458 --> 00:28:00.558
good for interacting with that object.

00:28:00.558 --> 00:28:07.058
Okay, but typically this does not result in an overt execution. So in a

00:28:07.058 --> 00:28:12.398
sense, this is just a pre-potentiation, which can also be interpreted in terms

00:28:12.398 --> 00:28:15.358
of simulating grasping this cup in front of me.

00:28:16.030 --> 00:28:18.710
So in that case, I'm not reasoning about simulating the action.

00:28:18.790 --> 00:28:24.570
This is just a more automatic process of mentally rehearsing of the good action that they can do.

00:28:25.190 --> 00:28:31.530
Again, if you don't have this, let's say, this mechanism working well,

00:28:31.650 --> 00:28:33.990
you cannot really inhibit also the overt execution.

00:28:35.010 --> 00:28:38.830
I would say that while in my first example of the imaginary situation,

00:28:39.190 --> 00:28:42.590
the link is clearer, in this second example, it's less clear.

00:28:42.590 --> 00:28:47.070
But to me, it's another supporting example that your internal mental process

00:28:47.070 --> 00:28:50.770
is always oriented towards anticipating possibilities for action.

00:28:51.210 --> 00:28:56.850
In some cases, this is just a simple automatic process that tells you what is possible to do.

00:28:56.910 --> 00:29:01.230
In other cases, this is more intentional imagery.

00:29:01.510 --> 00:29:05.370
So you really run long-term simulation, imagining sequences of actions.

00:29:06.270 --> 00:29:09.110
But I see a continuity in this.

00:29:09.110 --> 00:29:13.930
So the good thing about this framework is that you do not have to postulate

00:29:13.930 --> 00:29:17.090
any different set of cognitive representation for different tasks,

00:29:17.210 --> 00:29:22.870
any module for thinking that is completely segregated from the action control and specification,

00:29:23.270 --> 00:29:28.090
but you have a continual reuse of the same abilities in more complex ways.

00:29:28.885 --> 00:29:33.845
Okay, but then would you see this as being, let's say, layered in some way,

00:29:33.985 --> 00:29:37.865
that you have, let's say, sensory motor capabilities of, let's say,

00:29:37.865 --> 00:29:41.845
varying levels of complexity with some sort of discrete steps between them,

00:29:41.925 --> 00:29:43.245
or is it really a continuum?

00:29:44.905 --> 00:29:49.005
Well, that's a hard question. One important

00:29:49.005 --> 00:29:51.905
thing in this framework that probably partially answers

00:29:51.905 --> 00:29:54.805
your question is that we have

00:29:54.805 --> 00:29:57.805
to probably focus, we have to talk about

00:29:57.805 --> 00:30:00.965
actions as the unit, and also we

00:30:00.965 --> 00:30:03.885
know that action can be specified at different levels of detail.

00:30:03.885 --> 00:30:06.825
So there are actions that are specified

00:30:06.825 --> 00:30:09.765
at the level of movement of a single finger, whereas there

00:30:09.765 --> 00:30:12.625
are other actions in which the action, but also

00:30:12.625 --> 00:30:15.905
the goal of the action, is specified at a more abstract

00:30:15.905 --> 00:30:18.985
level, that is, grasping this object,

00:30:18.985 --> 00:30:21.825
this cup, whereas there is still another level of

00:30:21.825 --> 00:30:27.545
action and intention associated, which is, let's say, drinking from this cup.

00:30:27.545 --> 00:30:35.585
So the actions, all these levels, in a sense, they

00:30:35.585 --> 00:30:39.325
support one another. So, of course,

00:30:39.325 --> 00:30:46.145
the more abstract actions have to be finally specified in more fine-grained

00:30:46.145 --> 00:30:48.025
terms of the movements of the fingers.

00:30:48.165 --> 00:30:53.765
But this is a structure, a cognitive structure that has different layers probably.

00:30:53.945 --> 00:30:59.405
Or we don't know exactly, but at least you have a big structure in which you

00:30:59.405 --> 00:31:03.925
can really specify actions at different levels of complexity,

00:31:04.025 --> 00:31:06.625
at different levels of abstraction, with more complex intentions.

00:31:06.625 --> 00:31:12.525
And the argument that we make is not that we always use the lowest level,

00:31:12.645 --> 00:31:13.905
the lowest possible levels.

00:31:13.945 --> 00:31:19.225
At some point, you can really think and do some imagery by using actions and

00:31:19.225 --> 00:31:23.245
their associated intentions at intermediate or even at abstract levels.

00:31:24.185 --> 00:31:28.965
So that probably answers it. Right. So now one way to test some of these ideas,

00:31:29.305 --> 00:31:33.005
you were performing and presenting experiments on joint action.

00:31:33.145 --> 00:31:43.125
Yes. So how does joint action now help you to understand this model of cognition and action in the end?

00:31:43.385 --> 00:31:50.565
Well, the reason why I went into joint action in my presentation is that, as I told before,

00:31:51.025 --> 00:31:55.725
I would like to come up with a convincing story of how more complex cognitive abilities

00:31:55.725 --> 00:31:59.345
develop based on simpler cognitive abilities.

00:31:59.345 --> 00:32:04.245
So here the steps that I presented are, okay, I start doing my action,

00:32:04.365 --> 00:32:12.185
but then I am in a social domain, and so maybe I need to do some action together with you.

00:32:12.942 --> 00:32:17.242
So in this case, I can reuse a lot of what I know about my action system to

00:32:17.242 --> 00:32:21.122
anticipate you and to get into your intentions.

00:32:21.162 --> 00:32:26.222
So to get an idea of your motor system, or your motor action, or your intentions.

00:32:26.542 --> 00:32:32.782
In this case, this is just the beginning of a story in which I reuse my sensory

00:32:32.782 --> 00:32:38.942
motor knowledge or my action execution and prediction abilities to get into

00:32:38.942 --> 00:32:40.362
more and more complex situations.

00:32:40.362 --> 00:32:44.682
In this case, it is a social situation, but it could be even a non-social situation.

00:32:45.502 --> 00:32:52.622
Still another step in this passage is that, okay, now I'm able to use my predictive

00:32:52.622 --> 00:32:54.082
abilities to predict you.

00:32:54.262 --> 00:32:59.142
I use my body and my action system as a model to understand your body and your action system.

00:32:59.982 --> 00:33:05.422
And then I maybe use these abilities to plan long-term joint action.

00:33:05.422 --> 00:33:08.862
To do that, then, I have to plan some more coordinating actions,

00:33:08.862 --> 00:33:11.942
so I have to plan how to really achieve them in practice. I

00:33:11.942 --> 00:33:14.842
have maybe to invent some way to better

00:33:14.842 --> 00:33:17.602
coordinate with you. And maybe at some

00:33:17.602 --> 00:33:20.542
point you invent more and more complex things. You extend the

00:33:20.542 --> 00:33:23.742
boundaries of your control to the control of my body, or

00:33:23.742 --> 00:33:27.202
to the control of our body or our combined actions,

00:33:27.202 --> 00:33:30.162
and maybe also to the control of your internal

00:33:30.162 --> 00:33:33.202
states, so your beliefs, your intentions. So

00:33:33.202 --> 00:33:36.102
I do some action to change your intention. So not only

00:33:36.102 --> 00:33:38.962
I control my body, I now control me and you, and I

00:33:38.962 --> 00:33:41.942
control you, and I know that even after

00:33:41.942 --> 00:33:46.462
this interaction you will have some new beliefs. I extend my control abilities,

00:33:46.462 --> 00:33:52.582
control capabilities, from myself to you. So that's the very beginning

00:33:52.582 --> 00:33:58.042
of a story in which we begin to go into more and more complex cognitive abilities

00:33:58.042 --> 00:34:00.242
starting from simpler ones. Right.

00:34:02.742 --> 00:34:09.322
So for the joint action experiment, you were looking at hand movements essentially. Right.

00:34:09.846 --> 00:34:12.966
You're tracking hands mostly. Yes. And then you

00:34:12.966 --> 00:34:16.166
also build a model of that. Yes. Okay, so

00:34:16.166 --> 00:34:20.726
which aspect now of the joint action control of

00:34:20.726 --> 00:34:28.226
hands do you capture in this model? Well, if we think of the model specifically,

00:34:28.226 --> 00:34:37.006
then, well, in a sense, the key idea of the model is not completely new.

00:34:37.066 --> 00:34:41.406
There are many similar models, but the idea in a sense is that I really use

00:34:41.406 --> 00:34:46.506
my own action abilities as a model to understand your action right now.

00:34:47.126 --> 00:34:54.086
So the model can predict for instance the hand trajectory, but at the same time

00:34:54.086 --> 00:34:59.086
it can also give some hint into your intentions and goals.

00:34:59.306 --> 00:35:03.446
Because I know what are the goals that are linked to my motor actions.

00:35:03.786 --> 00:35:10.926
How does the model know that? Well, let's imagine I see you moving your arm towards a cup, okay?

00:35:11.126 --> 00:35:15.326
So now the hypothesis is that I run an internal simulation of the possible actions

00:35:15.326 --> 00:35:19.826
that I could do that are quite compatible with the actions that I'm seeing.

00:35:20.026 --> 00:35:24.446
I start predicting better and better your trajectory, and then let's say that

00:35:24.446 --> 00:35:27.586
this model fits quite well the data, so your hand trajectory.

00:35:28.126 --> 00:35:29.606
Because I know what is the goal

00:35:29.606 --> 00:35:34.086
of my hand action, I also can have some hypothesis of what is the goal.

00:35:34.086 --> 00:35:38.886
So because I know that my trajectory eventually reaches the cup and then grasps

00:35:38.886 --> 00:35:43.086
it, then I can also infer or at least hypothesize that you will do the same.

00:35:43.206 --> 00:35:44.666
So I also know what is your goal.

00:35:45.546 --> 00:35:53.186
That's the key idea. But then the intentional labeling of the sequence is then

00:35:53.186 --> 00:35:59.906
with reference to your own action or that is in reference in some way to observing the other?

00:36:00.786 --> 00:36:06.606
So the idea of these kind of models is that you use what you know about the

00:36:06.606 --> 00:36:11.206
link between the trajectory of your arms and your goal. You know them because you execute them.

00:36:11.366 --> 00:36:17.226
You use exactly the same link to get some understanding of what the other is doing.

00:36:17.386 --> 00:36:21.906
So that's the idea. So you have first some sort of self-monitoring system that

00:36:21.906 --> 00:36:24.826
develops a self-model that you then generalize to the other.

00:36:24.906 --> 00:36:29.586
This would be the step. Exactly. So I will not necessarily call it a self-monitoring

00:36:29.586 --> 00:36:33.806
in that you simply need a forward model and an inverse model.

00:36:33.946 --> 00:36:36.306
So you need some control capabilities for yourself.

00:36:37.086 --> 00:36:42.586
And then you transfer, you use this exactly as a model to understand the other.

00:36:42.766 --> 00:36:46.446
So that's more or less the key idea. And this idea is pursued by many people

00:36:46.446 --> 00:36:49.166
actually also. Right. Yeah. Okay.

00:36:51.292 --> 00:36:55.892
So what's, do you feel confident then that this original idea you had about

00:36:55.892 --> 00:36:59.532
this more, let's say, embodied problem solving, action as grounding, cognition,

00:36:59.872 --> 00:37:04.912
what tells you in this joint action task where you now are generalizing this

00:37:04.912 --> 00:37:07.692
model to social interaction, that this is working?

00:37:07.812 --> 00:37:10.992
Because in some sense, I could also argue, well, it works because you have been

00:37:10.992 --> 00:37:13.832
imposing, you have really made the task pretty abstract, right?

00:37:13.892 --> 00:37:18.472
Because I'm just observing these hands moving in one plane,

00:37:18.472 --> 00:37:24.292
I don't have to do anything about, let's say, invariant recognition of posture, of limbs and so on.

00:37:24.532 --> 00:37:30.332
So to really now generalize this to, let's say, a realistic scenario where you

00:37:30.332 --> 00:37:35.632
have to observe another human moving about in space, is that just a small matter

00:37:35.632 --> 00:37:39.032
of programming or are there some fundamental steps still missing?

00:37:40.532 --> 00:37:44.812
Well, to answer this question, probably we have to go into a bigger picture.

00:37:44.812 --> 00:37:50.292
The bigger picture to me is that now, although I have emphasized these motor

00:37:50.292 --> 00:37:52.072
predictions, that's not the whole story.

00:37:53.032 --> 00:37:56.472
In a sense, my opinion is that the brain is a smart guy.

00:37:56.772 --> 00:38:02.372
The brain always uses all the knowledge that it has, at least in principle, to do the job.

00:38:03.092 --> 00:38:08.612
The point here is that in more realistic joint action or action observation

00:38:08.612 --> 00:38:10.932
scenarios, there is a lot of information

00:38:10.932 --> 00:38:15.012
available. There is perceptual information, which is available.

00:38:15.212 --> 00:38:18.932
There is some context information, such as, for instance, I see what are the

00:38:18.932 --> 00:38:22.192
objects within your peripersonal space. So that's my prior.

00:38:23.192 --> 00:38:31.172
I can have a lot of prior information on you. For instance, I know that you like very much the beer.

00:38:31.372 --> 00:38:35.752
So if there are in front of you, I see some beer and I see some Coke,

00:38:35.912 --> 00:38:39.012
I would maybe anticipate that you will grasp the beer.

00:38:39.012 --> 00:38:46.392
And then I also use this motor simulation that I told, because it is also useful.

00:38:46.552 --> 00:38:50.432
So I'm not assuming that the brain only uses this motor simulation part.

00:38:50.652 --> 00:38:55.092
It uses the motor simulation to support this action understanding,

00:38:55.492 --> 00:38:59.292
because in this context, it is very salient and very useful.

00:39:00.192 --> 00:39:04.292
Why is it very useful? Because, well, I have a very good model of my body.

00:39:05.084 --> 00:39:08.684
So it's a very good model, very good predictive model for your movements.

00:39:08.924 --> 00:39:15.264
Of course, I have also some purely perceptual predictive mechanisms that can help me predicting you.

00:39:15.364 --> 00:39:21.764
But I would say that in the case of human understanding, the model of myself

00:39:21.764 --> 00:39:24.604
is a very, very good model of you in most cases.

00:39:24.744 --> 00:39:28.144
So this is what the brain will use the most.

00:39:28.424 --> 00:39:32.424
But if you go into the biggest picture, you can think of it as a big Bayesian

00:39:32.424 --> 00:39:36.844
process happening. So in a Bayesian process, you basically use all the information

00:39:36.844 --> 00:39:41.644
you have, information which comes first, then can be also used as a prior to

00:39:41.644 --> 00:39:42.724
run the rest of the process.

00:39:42.884 --> 00:39:48.284
And then all the new information that is collected is fused in an intelligent way.

00:39:48.864 --> 00:39:52.484
Intelligent here means that it is weighted depending on its uncertainty.

00:39:52.724 --> 00:39:59.404
So if the motor system is a very good model, the information it provides will be used very much.

00:39:59.404 --> 00:40:04.324
If this information is not good or it is failing, then it will not be used

00:40:04.324 --> 00:40:08.524
very much. Yeah, but you seem now to drift away a little bit from your original

00:40:10.564 --> 00:40:15.824
proposal, because I thought you started out by saying, look, cognition is really

00:40:15.824 --> 00:40:19.984
predicated on action. Yeah. Right? But now you seem to say something like, well,

00:40:19.984 --> 00:40:24.704
the motor system is one of many sources of information you could consider,

00:40:24.704 --> 00:40:27.824
because the Bayesian system basically uses all the knowledge it has,

00:40:27.904 --> 00:40:30.644
and it has knowledge about the motor system, but also about,

00:40:30.744 --> 00:40:34.764
let's say, perceptual systems, or from memory.

00:40:35.124 --> 00:40:38.484
So aren't you a bit drifting now in that argument?

00:40:38.984 --> 00:40:44.544
Well, I don't think so, because as I told before, the way you collect this information

00:40:44.544 --> 00:40:46.644
in the first place is through learning.

00:40:46.904 --> 00:40:50.284
And through learning, the motor system has really influenced you a lot.

00:40:51.064 --> 00:40:55.104
So also through learning, this perceptual information that I have,

00:40:55.244 --> 00:40:58.104
I have acquired it by also interacting with the world.

00:40:58.224 --> 00:41:02.664
So the motor system really is used very much not only online,

00:41:02.864 --> 00:41:04.984
as I told now, but also during learning.

00:41:05.832 --> 00:41:10.752
Of course, we don't have to be radical on that. So we don't think that only

00:41:10.752 --> 00:41:13.492
the motor system does something in the brain.

00:41:13.612 --> 00:41:15.852
There are many parts of the brain that are used.

00:41:16.212 --> 00:41:20.632
And probably the point is slightly different.

00:41:20.832 --> 00:41:24.912
The point is not all about the motor system. So the point is that the ability

00:41:24.912 --> 00:41:27.172
that you have to interact with the external world,

00:41:27.332 --> 00:41:30.312
which can be supported by many systems in the brain,

00:41:30.432 --> 00:41:33.412
prominently by the motor system, but, well, many systems,

00:41:33.412 --> 00:41:37.032
you reuse them consistently to achieve new

00:41:37.032 --> 00:41:40.012
cognitive abilities, so to develop, and

00:41:40.012 --> 00:41:45.692
then to achieve new goals specified at different levels. Okay, but so then

00:41:45.692 --> 00:41:50.292
you're saying, well, action, let's say, impregnates all other aspects, yes, of

00:41:50.292 --> 00:41:54.992
cognition and perception through the course of learning. Okay. And so it has

00:41:54.992 --> 00:41:58.852
a direct and otherwise an indirect impact. Yeah, yeah. Yeah, there are both.

00:41:59.032 --> 00:42:03.052
I would say that the first one is through learning. The second one is during

00:42:03.052 --> 00:42:05.052
overt interaction with the external world.

00:42:05.212 --> 00:42:10.432
And then through the internalization of this action and prediction process,

00:42:10.512 --> 00:42:11.832
you can imagine new scenarios.

00:42:12.092 --> 00:42:15.792
But this doesn't mean that when you can have some motor simulation,

00:42:16.032 --> 00:42:18.992
you close your eyes, you don't use perceptual information.

00:42:19.232 --> 00:42:22.492
That would be foolish from the point of view of the brain. And

00:42:22.492 --> 00:42:25.252
there is an important point of this view,

00:42:25.252 --> 00:42:28.172
which is that, as you know, in action understanding

00:42:28.172 --> 00:42:31.132
there are many views. One view,

00:42:31.132 --> 00:42:34.292
I call this the mirrorless view, is that

00:42:34.292 --> 00:42:38.652
the first thing that is active is the recognition of the goal. Then the recognition

00:42:38.652 --> 00:42:43.912
of the goal in the mirror system, of course, eventually helps also predicting

00:42:43.912 --> 00:42:49.452
the action. There is a second view, which is related but in some sense also very

00:42:49.452 --> 00:42:53.312
different, because this view is that what you do first is prediction.

00:42:53.672 --> 00:42:58.612
So it is prediction which comes first and prediction helps then recognizing the goal.

00:42:59.917 --> 00:43:04.397
But still another view is that all these motor simulation things,

00:43:04.677 --> 00:43:09.037
they are not so important because we really use much more abstract knowledge

00:43:09.037 --> 00:43:12.577
of what the other person should do, given the context.

00:43:12.657 --> 00:43:15.197
So that's more teleological knowledge, if you want.

00:43:15.837 --> 00:43:24.177
Well, in the view I'm supporting, it is not so useful to say what is more used and what is less used.

00:43:24.557 --> 00:43:28.497
So what comes first can be used as a prior for the rest of the things.

00:43:28.497 --> 00:43:33.457
And what is more reliable is used more than the other things.

00:43:33.757 --> 00:43:35.757
It happens, at least to me, that

00:43:35.757 --> 00:43:39.257
this motor simulation in that specific context, they are very reliable.

00:43:39.697 --> 00:43:44.857
This is why I think they play a prominent role, not because of some categorical distinctions.

00:43:45.357 --> 00:43:49.777
And also relating to this issue of what comes first, goal recognition or prediction,

00:43:50.457 --> 00:43:52.317
that's really depending on the task.

00:43:52.477 --> 00:43:56.557
So if I have a big prior knowledge of what the most likely goals will be,

00:43:56.557 --> 00:44:01.917
then it is probable that I will form some hypothesis or some probability distribution

00:44:01.917 --> 00:44:05.977
over those goals much before starting predicting you.

00:44:06.637 --> 00:44:11.497
But if I cannot see what are the goals, let's say the objects you can grasp,

00:44:11.677 --> 00:44:17.877
I will probably start by predicting and by prediction I will get into some hypothesis of what is your goal.

00:44:18.237 --> 00:44:20.477
So everything can be very flexible.

00:44:21.337 --> 00:44:26.977
Okay, but you are saying, ideally, you take a model-based approach towards, yes,

00:44:26.977 --> 00:44:31.837
in this case, also social perception. You want to say, yes, you want to get to the

00:44:31.837 --> 00:44:36.457
goal of the intentional state of the other agent as quickly as possible, and

00:44:36.457 --> 00:44:40.057
from there you just reconstruct the rest of the action.

00:44:41.077 --> 00:44:44.737
Yeah, yes, in some sense. So, yeah.

00:44:45.793 --> 00:44:49.793
Typically, I want to go into the goal when this is useful for my task.

00:44:49.953 --> 00:44:57.013
So if my task is understanding which cup will you grasp, that's the goal information that I need.

00:44:57.173 --> 00:45:01.373
Whereas if I don't want to hurt you, it's more the trajectory that I want to know.

00:45:01.453 --> 00:45:08.553
So for example, when we are driving, I don't really need to know where are you going.

00:45:09.053 --> 00:45:15.953
I don't care. I don't want to cross you or to hurt you. So it's not so much

00:45:15.953 --> 00:45:22.433
the goal of what you're doing that is important, but it is the trajectory that you are doing right now.

00:45:22.993 --> 00:45:27.013
Of course, also the goal is interesting because the goal tells me a lot about your trajectory.

00:45:27.333 --> 00:45:35.453
So I would say that what you ask, what you want to infer is very much task dependent.

00:45:36.613 --> 00:45:42.213
And as a side effect, you will also infer many other things that give you useful

00:45:42.213 --> 00:45:45.473
priors or useful hints. So that's my view.

00:45:46.033 --> 00:45:52.253
You don't only run experiments, right? To test these ideas, you also use robots.

00:45:52.533 --> 00:45:57.693
Yes. So how have robots helped you to make progress on these issues?

00:45:58.973 --> 00:46:04.953
Well, yes, we are running many computational and also robotic experiments.

00:46:05.573 --> 00:46:14.873
So one thing that we have recently seen is that probably this is very interesting

00:46:14.873 --> 00:46:18.213
again for the argument that more and more complex cognitive abilities can be

00:46:18.213 --> 00:46:20.453
developed on top of this sensory-motor interaction.

00:46:20.673 --> 00:46:24.573
One thing that we are investigating with computational models and robots is

00:46:24.573 --> 00:46:27.093
the ability to signal in social context.

00:46:27.393 --> 00:46:32.073
So just to explain this very quickly, the point is that if we are supposed to

00:46:32.073 --> 00:46:39.933
do one joint action, okay, let's say building some tower together, of red and

00:46:39.933 --> 00:46:44.033
blue blocks. Okay, one red, one blue, one red, one blue. Okay, now we are interacting,

00:46:44.033 --> 00:46:45.453
we are supposed to do that,

00:46:46.033 --> 00:46:52.173
and we are able to coordinate to do that. But now let's imagine only I know

00:46:53.121 --> 00:46:55.741
what is the goal, what is the tower to be built. You don't know.

00:46:56.201 --> 00:47:00.041
So one thing that we are investigating with robots is how we come out with some

00:47:00.041 --> 00:47:05.241
good coordination with only one person knowing the task.

00:47:05.641 --> 00:47:09.981
In that case, without using overt or linguistic communication.

00:47:10.521 --> 00:47:14.181
So what we have seen also in robot, that's a very hard problem.

00:47:14.441 --> 00:47:19.081
Because when you run computational robotic models, then you put your ideas into practice,

00:47:19.241 --> 00:47:23.401
and you see that by simply predicting the action of the others doesn't work

00:47:23.401 --> 00:47:27.201
too much because then I start predicting you,

00:47:27.321 --> 00:47:31.961
but so the guy who doesn't know the job has to predict the actions of the other,

00:47:32.061 --> 00:47:36.261
but it is always one step later, it's too late.

00:47:36.441 --> 00:47:41.741
So it doesn't work very well. So we come out with the idea that the guy who

00:47:41.741 --> 00:47:46.361
knows the job can also in some sense support the predictive processes of the

00:47:46.361 --> 00:47:48.261
other or help the other in some sense.

00:47:49.521 --> 00:47:56.701
So how would I help my other robot? Well, one simple example is that I can make

00:47:56.701 --> 00:48:02.981
my behavior more predictable, or I can make my behavior more informative for you.

00:48:03.581 --> 00:48:10.581
Such as, I can use my choice of action to let you understand very well what are my intentions.

00:48:11.461 --> 00:48:15.441
So, you typically try to understand my intentions, but if the uncertainty is

00:48:15.441 --> 00:48:21.121
too high, I can in some sense help you with the wise choice of actions.

00:48:21.321 --> 00:48:22.761
That can be very intuitive.

00:48:23.021 --> 00:48:26.041
But it could also be a way to test hypotheses about the other.

00:48:26.581 --> 00:48:30.361
It's not just making your behavior more predictable. It must be a way to test

00:48:30.361 --> 00:48:32.441
whether the other has the right model of you.

00:48:32.681 --> 00:48:37.581
Absolutely. Yes, that's the idea. So, we design… But it's not the same thing, right? No, no, no.

00:48:37.621 --> 00:48:41.341
Well, they are related. So, in one sense,

00:48:42.337 --> 00:48:48.237
Also, in my example, let's call the guy who has the knowledge the leader and the other the follower.

00:48:48.797 --> 00:48:54.657
So in a sense, the leader can have some uncertainty on what are the models of the follower.

00:48:55.697 --> 00:49:02.517
So, of course, he can use some actions and monitor the reactions of the other

00:49:02.517 --> 00:49:08.417
person just to know what are the good models, what are the models of the other

00:49:08.417 --> 00:49:10.137
agents, what is he doing.

00:49:10.297 --> 00:49:17.737
Right. And when the leader infers that the follower does not have good models, then he will fail.

00:49:17.857 --> 00:49:22.337
Then he can, in a sense, try to help the follower.

00:49:22.617 --> 00:49:24.617
So the two processes are interconnected.

00:49:25.057 --> 00:49:32.537
So by monitoring the process and by always trying to keep track of the uncertainty

00:49:32.537 --> 00:49:36.737
of the other person, you really can plan helping actions. Right.

00:49:37.017 --> 00:49:41.037
But now the interesting thing was that we go from action to interaction.

00:49:41.317 --> 00:49:48.177
Yes. And in your research plan, that's really a very crucial step, right?

00:49:48.197 --> 00:49:53.197
Because you really seem to think that or propose that it's by getting a handle

00:49:53.197 --> 00:49:56.957
on interaction that you really can get to, let's say, a more generalized understanding

00:49:56.957 --> 00:50:00.477
of social behavior and also cultures.

00:50:00.717 --> 00:50:03.497
Yeah. So how should I see that generalization exactly?

00:50:05.217 --> 00:50:11.657
The point is that I think that for many disciplines, such as the study of language

00:50:11.657 --> 00:50:15.937
and communication, the natural starting point would be naturalistic joint action.

00:50:16.708 --> 00:50:21.168
That will be a nice starting point because now the hypothesis is that the mechanism

00:50:21.168 --> 00:50:25.848
that we use for interacting with the others in sensory motor domains,

00:50:26.148 --> 00:50:27.808
then they provide a scaffold.

00:50:28.088 --> 00:50:34.688
So they provide some help for developing, let's say, for instance,

00:50:34.928 --> 00:50:35.828
linguistic communication.

00:50:35.828 --> 00:50:40.948
The hypothesis, which Levinson has called the human interaction engine hypothesis,

00:50:41.288 --> 00:50:47.748
is that by simply interacting in the external world, we really have some strong

00:50:47.748 --> 00:50:52.148
pragmatic abilities to infer the action of the other, to do some turn-taking,

00:50:52.388 --> 00:50:53.328
to do some joint attention,

00:50:53.728 --> 00:50:56.308
to anticipate the action and the intentions of the others.

00:50:56.488 --> 00:51:01.028
And this is a strong universal basis also for language communication.

00:51:01.028 --> 00:51:05.828
So he hypothesizes, for instance, that children use this interaction engine

00:51:05.828 --> 00:51:12.168
to develop, so to understand and to learn their natural language.

00:51:12.348 --> 00:51:15.068
And that's exactly a point that we want to make.

00:51:16.208 --> 00:51:21.728
So we started studying this kind of joint action. Now we have come up with a formulation

00:51:21.728 --> 00:51:26.768
of the problem in which the two persons have to solve a joint action optimization process.

00:51:27.948 --> 00:51:33.408
This joint action optimization process means that I don't

00:51:33.408 --> 00:51:37.728
have to care only about my action, but also about the joint outcome of the action.

00:51:38.508 --> 00:51:43.088
Because I have to care about the joint action and the joint outcome of our action,

00:51:43.188 --> 00:51:46.168
which is a joint goal, then I also have to care about you.

00:51:46.728 --> 00:51:51.868
This is why in some cases, as I said before, I help you, not because I'm altruistic.

00:51:52.008 --> 00:51:54.928
I'm helping you because by helping you, I help the joint goal.

00:51:54.928 --> 00:51:59.988
So I'm sorry, but I'm not so altruistic; that's all individualistic,

00:51:59.988 --> 00:52:05.828
but at the same time I think that this helping action,

00:52:06.589 --> 00:52:09.209
is really the first primitive form of a

00:52:09.209 --> 00:52:12.069
communicative action, because, as I said before, I

00:52:12.069 --> 00:52:15.049
can make my action more observable by

00:52:15.049 --> 00:52:18.389
you, more predictable by you, more understandable, or more

00:52:18.389 --> 00:52:21.349
diagnostic for you. Diagnostic here

00:52:21.349 --> 00:52:25.229
simply means that if you have two hypotheses on what I'm doing, I make

00:52:25.229 --> 00:52:29.769
my action, let's say, for instance, I exaggerate the movement so as to make

00:52:29.769 --> 00:52:34.009
you understand very quickly what my real intentions are. That's diagnostic.

00:52:34.009 --> 00:52:39.109
But that exaggeration that I do, it's really a form of communication.

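NOTE Editorial illustration, not part of the spoken interview.
The "diagnostic" action described in the preceding cues can be sketched as Bayesian inference by an observer who holds two hypotheses about the actor's goal.
This is a minimal sketch under stated assumptions: a 1-D movement, two goals at -1 and +1, Gaussian observation noise, and a flat prior.
All names and numbers (GOALS, SIGMA, likelihood, posterior) are hypothetical and do not come from the speaker's published model.
```python
import math
# Assumed setup: the observer entertains two hypotheses about my goal.
GOALS = {"left": -1.0, "right": 1.0}   # hypothetical goal positions
SIGMA = 1.0                            # hypothetical observation noise (std. dev.)
def likelihood(observed, goal_pos, sigma=SIGMA):
    """Gaussian likelihood of seeing the hand at `observed` if it heads to `goal_pos`."""
    z = (observed - goal_pos) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
def posterior(observed, prior=None):
    """Observer's belief over the two goals after one observation (Bayes' rule)."""
    prior = prior or {g: 1.0 / len(GOALS) for g in GOALS}
    unnorm = {g: prior[g] * likelihood(observed, pos) for g, pos in GOALS.items()}
    total = sum(unnorm.values())
    return {g: v / total for g, v in unnorm.items()}
# A plain movement toward "right" versus an exaggerated one that moves
# further away from the competing "left" goal.
plain = posterior(0.5)
exaggerated = posterior(1.5)
print(plain["right"], exaggerated["right"])
```
Under these assumptions the observer's belief in the true goal rises from about 0.73 for the plain movement to about 0.95 for the exaggerated one, which is the sense in which the exaggeration is diagnostic.
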
00:52:40.009 --> 00:52:44.189
That's really the bootstrap also of all other forms of communication,

00:52:44.409 --> 00:52:45.649
including linguistic communication.

00:52:46.129 --> 00:52:51.829
So the general hypothesis here, this is why I'm interested in this passage from

00:52:51.829 --> 00:52:56.189
joint action to more complex form of cognition, is that by interacting,

00:52:56.389 --> 00:52:58.149
because there are some constraints,

00:52:58.389 --> 00:53:01.349
some joint constraints that we have to fulfill,

00:53:01.349 --> 00:53:04.329
then in this joint optimization

00:53:04.329 --> 00:53:07.029
framework it's quite obvious that I have

00:53:07.029 --> 00:53:10.669
to do something also for you, for your sake, right? And

00:53:10.669 --> 00:53:15.409
in our formulation, that means that in addition to having some motor

00:53:15.409 --> 00:53:20.289
intention, we also have some communicative intention. The communicative intention

00:53:20.289 --> 00:53:26.609
has a cost because, for instance, for exaggerating the movement I pay a cost

00:53:26.609 --> 00:53:28.689
in terms of biomechanical cost or something.

00:53:28.809 --> 00:53:34.009
So such as, for instance, when I am in a noisy pub, I have to over-articulate.

00:53:34.409 --> 00:53:39.549
Or if you think of motherese, for example, child-directed speech,

00:53:39.809 --> 00:53:42.909
so you exaggerate, that's done for the sake of the other person.

00:53:43.955 --> 00:53:47.375
Ultimately, we say that's done for the sake of achieving a good joint goal,

00:53:47.535 --> 00:53:48.875
which is understanding one another.

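NOTE Editorial illustration, not part of the spoken interview.
The trade-off described in the preceding cues, where exaggerating a movement helps the partner but carries a biomechanical cost, can be sketched as a small optimization.
This is a minimal sketch, assuming two symmetric goals at -1 and +1, Gaussian observation noise, a cost term equal to the partner's surprise about the true goal, and a quadratic effort cost.
The names BASE, SIGMA, joint_cost and best_exaggeration, and all numbers, are hypothetical and are not taken from the speaker's formulation.
```python
import math
BASE = 0.2     # assumed displacement of an unexaggerated movement toward my goal
SIGMA = 1.0    # assumed observation noise of the partner
def p_true_goal(amplitude):
    """Partner's posterior for my true goal, given two symmetric goals at +1 and -1."""
    x = BASE + amplitude
    return 1.0 / (1.0 + math.exp(-2.0 * x / SIGMA ** 2))
def joint_cost(amplitude, effort_weight):
    """Partner's surprise about my goal plus a quadratic effort cost (both assumed)."""
    return -math.log(p_true_goal(amplitude)) + effort_weight * amplitude ** 2
def best_exaggeration(effort_weight, step=0.01, max_amplitude=3.0):
    """Grid search for the exaggeration amplitude that minimizes the joint cost."""
    n = int(round(max_amplitude / step))
    candidates = [i * step for i in range(n + 1)]
    return min(candidates, key=lambda a: joint_cost(a, effort_weight))
cheap = best_exaggeration(effort_weight=0.5)    # exaggeration is cheap
costly = best_exaggeration(effort_weight=5.0)   # exaggeration is expensive
print(cheap, costly)
```
In this sketch some exaggeration is always worthwhile, but the preferred amount shrinks as the effort weight grows, which mirrors the idea that the communicative intention is paid for out of the same joint cost.
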
00:53:49.115 --> 00:53:55.355
But that's really the first form of communicative intention that I have.

00:53:55.535 --> 00:53:59.255
So it's for you, it's really the signaling mechanisms that would allow the bootstrapping

00:53:59.255 --> 00:54:01.815
of communication and language, right?

00:54:02.015 --> 00:54:05.875
But that's still a project that will be realized in the future.

00:54:06.135 --> 00:54:11.935
Yes, but I would say that the important component is always the missing one

00:54:11.935 --> 00:54:14.915
from the current studies, which is the more pragmatic component of language.

00:54:15.095 --> 00:54:21.175
So the pragmatics of language, as the early pragmatists who studied language knew very, very well.

00:54:21.355 --> 00:54:25.715
So the pragmatic component is prominent because then the grounding of symbols,

00:54:25.855 --> 00:54:28.855
the grammar, of course, they are hard problems, but the pragmatics,

00:54:28.855 --> 00:54:35.715
they are the most important part, such as, for instance, knowing what requesting means,

00:54:35.955 --> 00:54:44.255
knowing what telling means or promising means or all these joint action and joint attention,

00:54:44.575 --> 00:54:49.635
turn-taking mechanisms, all of them provide a big scaffolding for the emergence

00:54:49.635 --> 00:54:51.315
of linguistic communication. Right.

00:54:51.775 --> 00:54:55.655
So I got that. But now, so to finish up, two questions.

00:54:57.115 --> 00:55:01.115
So you have this really big ambitious program, and in the end,

00:55:01.155 --> 00:55:07.075
you'll generate the whole of culture from action, which is going to be still a

00:55:07.075 --> 00:55:10.155
long trajectory, but you're well on your way.

00:55:10.655 --> 00:55:13.615
But in doing this, what would be Giovanni's Law?

00:55:14.615 --> 00:55:22.335
Giovanni's law? Okay. Well, my law would be maybe starting to understand cognition

00:55:22.335 --> 00:55:27.515
from the primitive equipment of our ancestors.

00:55:27.875 --> 00:55:32.935
So that would be more an evolutionary route towards the more complex cognitive skills.

00:55:33.275 --> 00:55:35.955
Well, I can probably give three

00:55:35.955 --> 00:55:39.535
examples that are very simple in the three domains that we target now.

00:55:39.695 --> 00:55:41.855
One is the social domain that we have discussed.

00:55:42.055 --> 00:55:47.215
So you start from controlling your body, your actions, to controlling our actions,

00:55:47.335 --> 00:55:49.055
to controlling then your mental states.

00:55:49.235 --> 00:55:53.255
That's communication in a sense. Communication is really putting something into your brain.

00:55:53.675 --> 00:55:57.215
And then you also develop culture and pedagogy, for instance,

00:55:57.315 --> 00:56:00.295
for controlling that at a much, much longer timescale.

00:56:00.355 --> 00:56:05.775
So by developing some culture, the human beings can really control the behavior

00:56:05.775 --> 00:56:09.555
of many people over millennia probably, and also by pedagogy.

00:56:09.555 --> 00:56:10.895
So that's in the social domain.

00:56:11.375 --> 00:56:16.715
You can see this also in the control of the external world, on this environmental

00:56:16.715 --> 00:56:19.575
scaffolding or the environmental shaping.

00:56:19.655 --> 00:56:24.195
So we start from the control of our body to the control maybe of, really,

00:56:25.269 --> 00:56:28.249
our peripersonal space with some objects, and then

00:56:28.249 --> 00:56:31.169
you start controlling tools, for instance,

00:56:31.169 --> 00:56:34.609
for doing more complex actions. But then,

00:56:34.609 --> 00:56:37.589
tools, you have also to build tools. Maybe at some

00:56:37.589 --> 00:56:40.329
point you build the city of Barcelona. So then, in a sense,

00:56:40.329 --> 00:56:44.149
you extend the boundaries of your control from the control of your body to the

00:56:44.149 --> 00:56:48.749
control of the whole environment. We humans really modify our environment very

00:56:48.749 --> 00:56:53.149
much, such that it supports our cognition much better. So that's another

00:56:53.149 --> 00:56:57.769
story, not only in the social domain but also in the environmental domain. The

00:56:57.769 --> 00:57:00.069
third and last one is in the more,

00:57:00.669 --> 00:57:04.969
thought domain, or the cognitive control domain. So you just start from controlling

00:57:04.969 --> 00:57:10.129
your action, and then you end up in some way controlling your thought processes

00:57:10.129 --> 00:57:15.549
and controlling your cognition over long, long time scales. For instance,

00:57:15.549 --> 00:57:17.009
one example that I give, well,

00:57:17.649 --> 00:57:21.149
let's imagine I want to become a prominent scientist,

00:57:21.149 --> 00:57:24.029
which apparently doesn't happen right

00:57:24.029 --> 00:57:27.209
now. So I have to control my behavior for 20 years,

00:57:27.209 --> 00:57:33.009
maybe. So the way I do that is, well, by cognitive control, in a sense by setting

00:57:33.009 --> 00:57:37.429
some goal states in my mind that then control my behavior on longer and longer

00:57:37.429 --> 00:57:42.589
time scales. Then again you have this control of the body, or the control

00:57:42.589 --> 00:57:45.929
of my short-term actions, that becomes the control of my entire life.

00:57:48.669 --> 00:57:53.509
I have given three examples of how this research program, I would say,

00:57:53.669 --> 00:57:57.829
would start from very simple sensory motor abilities to very,

00:57:57.909 --> 00:58:02.329
very complex abilities that cross the boundaries of many disciplines.

00:58:02.489 --> 00:58:05.789
Sociology on the one hand, and let's say architecture on the other hand,

00:58:05.869 --> 00:58:08.469
or maybe the study of consciousness on the other hand.

00:58:08.649 --> 00:58:11.689
And you have even integrated it in your own career planning.

00:58:12.089 --> 00:58:16.449
Ah, of course. So the last thing, the very last question then, is prediction. So

00:58:16.449 --> 00:58:21.529
if we come back then five years from now as a famous scientist to the summer school again.

00:58:23.308 --> 00:58:27.088
So what's the one prediction we can test on you by then?

00:58:27.188 --> 00:58:31.408
So what's the one prediction you would like to make today that you feel very strongly about?

00:58:31.708 --> 00:58:34.068
And we can ask you about five years from now.

00:58:34.668 --> 00:58:39.928
Okay, so about the future of this field or about the… Your research program. Okay.

00:58:41.088 --> 00:58:46.468
So about my research program. Well, I think I elucidated the principle,

00:58:46.828 --> 00:58:52.328
my key goal, which is understanding higher cognition, how our cognition is based

00:58:52.328 --> 00:58:55.248
on sensory motor and predictive abilities.

00:58:55.528 --> 00:58:59.468
So the prediction is that at least we will have some nice demonstration in these

00:58:59.468 --> 00:59:04.388
three fields that I've mentioned before, much stronger demonstration than the ones that we have now.

00:59:04.648 --> 00:59:09.928
And from that, we can really try to build an understanding of how the brain

00:59:09.928 --> 00:59:11.348
implements higher cognitive skills.

00:59:11.648 --> 00:59:16.168
So at least one demonstration for each of the three fields. That's very good.

00:59:16.248 --> 00:59:19.428
So Giovanni Pezzulo, thank you very much for this conversation. Thank you, Paul.

00:59:22.148 --> 00:59:28.148
The CSN podcast was produced by the Convergent Science Network of Biomimetics

00:59:28.148 --> 00:59:34.628
and Biohybrid Systems, a project funded by the European Seventh Research Framework Programme.

00:59:36.240 --> 01:00:02.886
Music.