WEBVTT

00:00:00.000 --> 00:00:02.740
So really, when it comes to strategizing, it

00:00:02.740 --> 00:00:05.320
could be an instance of, okay, let's open you

00:00:05.320 --> 00:00:07.620
at this number. We know what you're capable of.

00:00:07.639 --> 00:00:09.460
Let's open you at this. This will give you the

00:00:09.460 --> 00:00:12.580
best opportunity to get 300 or more, which might

00:00:12.580 --> 00:00:15.000
lead to a medal. Naturally, when you take that

00:00:15.000 --> 00:00:17.730
information to a competition, as you see the

00:00:17.730 --> 00:00:20.289
board start to change, that's when you can make

00:00:20.289 --> 00:00:23.750
a little bit more of a freer choice of what load

00:00:23.750 --> 00:00:27.649
to put on because this information is there to

00:00:27.649 --> 00:00:30.629
just provide some form of strategy moving into

00:00:30.629 --> 00:00:33.170
competition. Once you're in competition, it's

00:00:33.170 --> 00:00:35.869
then down to intuition and how to read the board.

00:00:36.030 --> 00:00:38.350
So I think that's one of the steps that would

00:00:38.350 --> 00:00:40.909
be nice to look at in the future. Before I kind

00:00:40.909 --> 00:00:43.289
of get onto the performance zones, I think it's

00:00:43.289 --> 00:00:45.210
important to understand this started from something

00:00:45.320 --> 00:00:48.600
quite simple. So the paper is quite complex. Even

00:00:48.600 --> 00:00:51.259
when we developed it, I had the help of a data

00:00:51.259 --> 00:00:54.020
scientist who happens to be one of my weightlifters,

00:00:54.020 --> 00:00:57.799
you know, an incredible help in providing information

00:00:57.799 --> 00:01:02.729
on how to make this sort of robust setup. Welcome

00:01:02.729 --> 00:01:05.549
to Evidence Strong Show. It's my pleasure to

00:01:05.549 --> 00:01:07.810
have you the second time. The previous time I

00:01:07.810 --> 00:01:11.010
had you, we talked about your paper on stable

00:01:11.010 --> 00:01:13.629
and variable components in Olympic weightlifting

00:01:13.629 --> 00:01:16.450
technique. And today we'll be talking about predicting

00:01:16.450 --> 00:01:18.709
performance. But for the people who haven't

00:01:18.709 --> 00:01:20.530
watched the previous one, could you briefly introduce

00:01:20.530 --> 00:01:23.549
yourself? Firstly, thank you for having me on

00:01:23.549 --> 00:01:26.590
again. I appreciate it. I feel quite privileged.

00:01:27.010 --> 00:01:30.170
So my name is Shyam Chavda. I'm the program lead

00:01:30.170 --> 00:01:32.049
for Strength and Conditioning,

00:01:32.069 --> 00:01:34.849
distance education, a master's program at Middlesex

00:01:34.849 --> 00:01:36.969
University. I'm also head weightlifting coach

00:01:36.969 --> 00:01:40.230
at Middlesex University, where I coach Cyrille

00:01:40.230 --> 00:01:42.849
Tchatchet for Great Britain and Jessica De Silva

00:01:42.849 --> 00:01:46.370
for Portugal. And alongside my academic endeavors,

00:01:46.609 --> 00:01:50.109
I'm also a senior performance scientist at British

00:01:50.109 --> 00:01:52.510
Weightlifting. Awesome. So these are your qualifications.

00:01:52.849 --> 00:01:56.189
You also do a PhD and this study we'll be talking

00:01:56.189 --> 00:01:58.870
about today is a part of your PhD. This is the

00:01:58.870 --> 00:02:01.010
first part of the PhD. Yes, almost. I forgot

00:02:01.069 --> 00:02:02.709
to mention that I'm doing that. Awesome. It's

00:02:02.709 --> 00:02:06.569
important. So why would we be interested in predicting

00:02:06.569 --> 00:02:09.590
performance? I mean, ultimately, if we can predict

00:02:09.590 --> 00:02:12.849
anything, it's always quite useful to be able

00:02:12.849 --> 00:02:15.250
to do that, right? And, you know, let's not beat

00:02:15.250 --> 00:02:17.150
around the bush around predictions. If there's

00:02:17.150 --> 00:02:19.110
anything we could predict in life, it'd be lottery

00:02:19.110 --> 00:02:21.509
numbers because it's far more lucrative. But

00:02:21.509 --> 00:02:24.090
predicting weightlifting performance actually

00:02:24.090 --> 00:02:26.830
came about because of a real life performance

00:02:26.830 --> 00:02:29.169
problem, which I encountered in the build up

00:02:29.169 --> 00:02:32.039
to Tokyo. And, you know, we weren't coming through

00:02:32.039 --> 00:02:34.740
this traditional route, having qualified, but

00:02:34.740 --> 00:02:36.879
instead through the International Olympic Committee's

00:02:36.879 --> 00:02:39.780
refugee Olympic team. So not many people knew

00:02:39.780 --> 00:02:42.180
about Sil's abilities, as he didn't have the

00:02:42.180 --> 00:02:44.669
privilege to compete at international events

00:02:44.669 --> 00:02:47.129
for like five years, but we knew his capabilities.

00:02:47.389 --> 00:02:50.370
But what we didn't know is whether those capabilities

00:02:50.370 --> 00:02:54.129
would be worthy of this unspoken target, which

00:02:54.129 --> 00:02:56.849
we'd kind of set for him, which was finishing

00:02:56.849 --> 00:03:00.189
within top 10. So with this ranking in mind,

00:03:00.250 --> 00:03:02.169
in the buildup, I figured it was really important

00:03:02.169 --> 00:03:05.110
to try and find out what total would be required

00:03:05.110 --> 00:03:08.050
to achieve that top 10 ranking. And that's where

00:03:08.050 --> 00:03:09.650
I ended up questioning whether I could actually

00:03:09.650 --> 00:03:13.409
try and predict this. So this is the why. Now,

00:03:13.409 --> 00:03:17.990
you use something, a concept previously not discussed

00:03:17.990 --> 00:03:21.310
that widely. It's called performance zones. So

00:03:21.310 --> 00:03:24.009
could you explain what it's all about? Yeah.

00:03:24.069 --> 00:03:26.349
So before I kind of get onto the performance

00:03:26.349 --> 00:03:28.689
zones, I think it's important to understand this

00:03:28.689 --> 00:03:31.430
started from something quite simple. So the paper

00:03:31.430 --> 00:03:34.050
is quite complex. Even when we developed it,

00:03:34.129 --> 00:03:37.169
I had the help of a data scientist who happens

00:03:37.169 --> 00:03:39.590
to be one of my weightlifters, you know, an incredible

00:03:39.590 --> 00:03:42.550
help in providing information on how to make

00:03:42.550 --> 00:03:44.490
this sort of a robust setup. But essentially everything

00:03:44.490 --> 00:03:46.590
was done in Excel at the beginning with scatter

00:03:46.590 --> 00:03:49.669
plots and regression lines. And what I did for

00:03:49.669 --> 00:03:52.469
each of the competitors in the men's 96 group

00:03:52.469 --> 00:03:56.569
is I predicted what they might do at the games.

00:03:56.710 --> 00:03:58.870
And if we fast forward a little bit, then it's

00:03:58.870 --> 00:04:00.669
something I'll go into more depth on later. I

00:04:00.669 --> 00:04:04.710
then predicted what total is likely to be achieved

00:04:04.710 --> 00:04:07.370
between the places of ninth and 10th. And that

00:04:07.370 --> 00:04:09.629
ninth and 10th can be termed a performance zone.

00:04:09.819 --> 00:04:11.960
So to answer your original question, what is

00:04:11.960 --> 00:04:15.060
a performance zone? It's a cluster of ranks.

00:04:15.300 --> 00:04:18.980
So in the paper, I take the rank from one to

00:04:18.980 --> 00:04:21.439
15, but I break them down into a medal zone.

00:04:21.579 --> 00:04:23.639
So that would be performance zone one. Fourth to

00:04:23.639 --> 00:04:27.060
fifth, which was performance zone two. Essentially,

00:04:27.100 --> 00:04:29.560
that's like the outside shot at achieving a medal.

00:04:29.839 --> 00:04:33.639
Sixth to eighth, which I believe top eight is

00:04:33.639 --> 00:04:36.860
the standard you need to achieve to qualify

00:04:36.860 --> 00:04:39.939
for the Games, but also you achieve an Olympic

00:04:39.939 --> 00:04:43.350
diploma as well, which I didn't find out until

00:04:43.350 --> 00:04:46.689
after the Games. And then ninth to tenth

00:04:46.689 --> 00:04:49.730
is again your outside shot to achieve a top eight.

00:04:49.730 --> 00:04:53.350
And then 11th to 15th, which was the maximum number

00:04:53.350 --> 00:04:56.509
of lifters within the Olympic categories. So

00:04:56.509 --> 00:04:58.850
that's how I split the performance zones. But

00:04:58.850 --> 00:05:01.430
what's important is I consulted with the performance

00:05:01.430 --> 00:05:04.069
manager at British Weightlifting to ensure that

00:05:04.069 --> 00:05:07.290
these performance zones were highly contextual

00:05:07.290 --> 00:05:09.889
to what we need them for. But it's important to

00:05:09.889 --> 00:05:12.189
note that in other sports or other governing

00:05:12.189 --> 00:05:14.230
bodies, they might deem performance zones to

00:05:14.230 --> 00:05:16.629
be slightly different. So, for example, you might

00:05:16.629 --> 00:05:19.009
have first and second as a performance zone,

00:05:19.149 --> 00:05:22.290
third to fifth as a performance zone. And whatever

00:05:22.290 --> 00:05:24.470
you decide to use as a performance zone just

00:05:24.470 --> 00:05:26.290
needs to be justified and worked within your

00:05:26.290 --> 00:05:28.110
own practice. But these are the ones that we

00:05:28.110 --> 00:05:30.810
decided were the most logical and the most useful.
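
NOTE
Editor's sketch, not taken from the paper: the zone split described above can be written as a small rank lookup.
Only the rank boundaries (1-3, 4-5, 6-8, 9-10, 11-15) come from the conversation; the name performance_zone
and the zone labels are hypothetical and chosen for illustration.
```python
# Rank boundaries as described in the interview; the labels are illustrative assumptions.
ZONES = [
    (1, 3, "zone 1: medal zone"),
    (4, 5, "zone 2: outside shot at a medal"),
    (6, 8, "zone 3: top eight"),
    (9, 10, "zone 4: outside shot at a top eight"),
    (11, 15, "zone 5: 11th to 15th"),
]
def performance_zone(rank):
    """Return the zone label for a finishing rank, or None outside the top 15."""
    for low, high, label in ZONES:
        if low <= rank <= high:
            return label
    return None
print(performance_zone(10))
```
Another governing body could swap in its own boundaries by editing the ZONES list, as the speaker suggests.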

00:05:31.089 --> 00:05:34.250
So anytime where the placing matters for qualification

00:05:34.250 --> 00:05:37.629
for other events or making a team or getting

00:05:37.629 --> 00:05:41.949
a scholarship, or whatever, that would be the

00:05:41.949 --> 00:05:44.069
zone qualification, or classification,

00:05:44.069 --> 00:05:47.149
you would use. Right, yeah. Okay, so you tried

00:05:47.149 --> 00:05:50.930
to predict what is the likely total required

00:05:50.930 --> 00:05:54.050
for each zone. How did you go about it? Yeah,

00:05:54.089 --> 00:05:58.250
so the weight class change that happened prior

00:05:58.250 --> 00:06:02.269
to November 2018, I think it was July 2018, might

00:06:02.269 --> 00:06:03.970
have been the last competition at the old weight

00:06:03.970 --> 00:06:05.970
categories. This was kind of the foundation of

00:06:05.970 --> 00:06:09.560
the paper because we have loads of historic information

00:06:09.560 --> 00:06:12.240
on these old weight categories, but we have nothing

00:06:12.240 --> 00:06:15.480
on the new ones. So one of the objectives of the

00:06:15.480 --> 00:06:19.560
paper was to see if we could predict what totals

00:06:19.560 --> 00:06:23.399
would be achieved at specific performance zones

00:06:23.399 --> 00:06:27.240
in each weight category, or, well, in each new weight

00:06:27.240 --> 00:06:30.360
category which had been introduced by the IWF,

00:06:30.360 --> 00:06:34.420
using the old weight category data. So essentially

00:06:34.420 --> 00:06:38.069
the new category data didn't exist, and when it

00:06:38.069 --> 00:06:40.529
did exist, it was far too small a sample to

00:06:40.529 --> 00:06:43.310
make a prediction. So we had to use historic data,

00:06:43.310 --> 00:06:45.670
and then once we developed these predictions,

00:06:45.670 --> 00:06:48.250
we could then check if those predictions were

00:06:48.250 --> 00:06:52.500
close or not to what actually happened at Tokyo,

00:06:52.660 --> 00:06:55.079
European Champs, and the World Champs, for example.

00:06:55.420 --> 00:06:57.800
So what competitions did you take into account?

00:06:58.180 --> 00:07:03.839
So just the major three for us in Great Britain,

00:07:03.959 --> 00:07:06.899
which would be the Olympic Games, World Championships,

00:07:07.000 --> 00:07:10.639
and European Championships, so continental. Okay,

00:07:10.660 --> 00:07:14.379
and which years? So this was part of the methods,

00:07:14.379 --> 00:07:18.699
which I think was quite important. So if I go

00:07:18.699 --> 00:07:20.639
through the methods, that might help answer the

00:07:20.639 --> 00:07:23.720
question of what years we used and how that came

00:07:23.720 --> 00:07:26.980
about. So the data was scraped from the IWF

00:07:26.980 --> 00:07:29.600
website from 1998 all the way through to 2021.

00:07:30.040 --> 00:07:33.019
I took all the totals of all competitors. Then

00:07:33.019 --> 00:07:35.740
I reduced the data down to the top 15 at each

00:07:35.740 --> 00:07:38.240
competition. So World Champs, European Champs

00:07:38.240 --> 00:07:40.459
and Olympic Games. With regards to the years,

00:07:40.720 --> 00:07:43.500
what I did is I ran some effect sizes between

00:07:43.500 --> 00:07:47.740
each year within each competition type. So European

00:07:47.740 --> 00:07:51.899
Championships, 1998, 99, so on. And I compared

00:07:51.899 --> 00:07:54.519
those years. So I just took the average achieved

00:07:54.519 --> 00:07:57.199
and just compared those years to ensure that

00:07:57.199 --> 00:08:00.000
in general, there wasn't a large difference between

00:08:00.000 --> 00:08:04.480
each of these years. And this essentially helps

00:08:04.480 --> 00:08:08.490
us identify, like, an outlier year where performances

00:08:08.490 --> 00:08:11.149
might have just completely dropped or just gone

00:08:11.149 --> 00:08:14.370
up by, you know, a significant amount. So once we're

00:08:14.370 --> 00:08:17.089
happy that there are either small or moderate

00:08:17.089 --> 00:08:20.110
changes between them, we just decide, okay, that's

00:08:20.110 --> 00:08:23.290
acceptable, let's pool the data together for each

00:08:23.290 --> 00:08:26.290
competition. So it helps increase the sample that

00:08:26.290 --> 00:08:28.560
we're using for the prediction. Okay, I have two

00:08:28.560 --> 00:08:31.399
questions about the data. So one would be, how

00:08:31.399 --> 00:08:34.120
did you decide to transform the previous weight

00:08:34.120 --> 00:08:37.179
classes into the new ones, if you did, or how did

00:08:37.179 --> 00:08:39.720
you kind of group them together? And then the

00:08:39.720 --> 00:08:43.559
other would be, some athletes will reappear from

00:08:43.559 --> 00:08:46.480
one competition to the other, so some data come

00:08:46.480 --> 00:08:50.200
from the same athlete. So how did you handle

00:08:50.200 --> 00:08:52.440
that? Yeah, excellent questions, actually questions

00:08:52.440 --> 00:08:55.360
that came up in the review process prior

00:08:55.360 --> 00:08:58.159
to this being published. So the first question,

00:08:58.220 --> 00:09:01.240
correct me if I misinterpret the question, was,

00:09:01.240 --> 00:09:02.759
what was the first question? No, no, the first

00:09:02.759 --> 00:09:05.200
one was weight classes, because you mentioned before

00:09:05.820 --> 00:09:09.120
July 2018, the weight classes changed. We have

00:09:09.120 --> 00:09:13.419
that data and you included data in this analysis

00:09:13.419 --> 00:09:18.019
from 1998, which means you included also the

00:09:18.019 --> 00:09:21.159
data from the previous weight classes scheme.

00:09:21.340 --> 00:09:26.360
So how did you manage? So it was all the data

00:09:26.360 --> 00:09:30.820
from 1998 to 2018. Every single competition that

00:09:30.820 --> 00:09:32.919
had this, they would have the same weight categories,

00:09:33.100 --> 00:09:35.980
right? They were all put in to, they were all

00:09:35.980 --> 00:09:39.039
aggregated for each performance zone, right?

00:09:39.419 --> 00:09:42.220
Now with that, we split them into rank zones.

00:09:42.320 --> 00:09:44.500
So you've got your aggregate for each rank zone.

00:09:44.679 --> 00:09:47.240
We then ran something called a k-fold cross

00:09:47.240 --> 00:09:50.019
validation, which essentially just evaluates

00:09:50.019 --> 00:09:53.580
the model's ability to predict when given unseen

00:09:53.580 --> 00:09:56.539
data, which is what we're going to be putting

00:09:56.539 --> 00:09:58.659
into it at some point. But also what it does,

00:09:58.679 --> 00:10:00.639
it stops us overfitting this regression line.

00:10:00.779 --> 00:10:03.740
So the regression line we fit in was a polynomial

00:10:03.740 --> 00:10:06.740
regression, basically curves through. Similar

00:10:06.740 --> 00:10:09.600
to weightlifting totals and body weight, it's

00:10:09.600 --> 00:10:11.940
polynomial. It goes up, it'll plateau and it'll

00:10:11.940 --> 00:10:15.639
start to come down. Same as strength in relation

00:10:15.639 --> 00:10:18.919
to body weight, it's curvilinear, not linear.

00:10:19.200 --> 00:10:21.200
So anyway, we run this polynomial regression.
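
NOTE
Editor's sketch, not the authors' code: a k-fold cross-validation of polynomial fits, followed by reading the chosen curve at one body weight.
Everything here is an assumption for illustration: the helper names fit_poly, predict and kfold_rmse, the choice of five folds,
the candidate degrees, and the data, which are randomly generated and are NOT real competition totals.
```python
import random
def fit_poly(xs, ys, degree):
    """Least-squares polynomial fit; returns coefficients c[0] + c[1]*x + c[2]*x**2 ..."""
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum(x**(i+j)) and b[i] = sum(y * x**i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting, then back substitution.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs
def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))
def kfold_rmse(xs, ys, degree, k=5, seed=0):
    """Average held-out RMSE across k folds for one polynomial degree."""
    order = list(range(len(xs)))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in order if i not in held]
        coeffs = fit_poly([xs[i] for i in train], [ys[i] for i in train], degree)
        sq_err = [(predict(coeffs, xs[i]) - ys[i]) ** 2 for i in held_out]
        scores.append((sum(sq_err) / len(sq_err)) ** 0.5)
    return sum(scores) / k
# Made-up data: totals rise with body weight and then flatten.
rng = random.Random(1)
weights = [rng.uniform(55, 110) for _ in range(120)]
totals = [400 - 0.03 * (w - 110) ** 2 + rng.gauss(0, 5) for w in weights]
scaled = [w / 100 for w in weights]  # rescale so the normal equations stay well conditioned
for degree in (1, 2, 3):
    print(degree, round(kfold_rmse(scaled, totals, degree), 2))
curve = fit_poly(scaled, totals, 2)
print(round(predict(curve, 95.6 / 100), 1))  # read the curve at a hypothetical body weight
```
A lower held-out error for the curved fit than for the straight line is the kind of evidence that supports a polynomial without overfitting.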

00:10:21.440 --> 00:10:23.799
Because we're using the old data, that's all

00:10:23.909 --> 00:10:27.710
plotted, and the curve means at any body weight

00:10:27.710 --> 00:10:31.190
category on the bottom axis, we can just go at

00:10:31.190 --> 00:10:34.090
this point, which is a new weight category, go

00:10:34.090 --> 00:10:36.690
up, find the intercept, and then we can predict

00:10:36.690 --> 00:10:42.090
what the total would be for that new weight category

00:10:42.090 --> 00:10:44.789
in that specific performance zone within that

00:10:44.789 --> 00:10:48.350
specific competition type. So you took the actual

00:10:48.350 --> 00:10:52.789
data of the body weight data of each athlete

00:10:52.789 --> 00:10:55.649
in each weight class, and when you predicted

00:10:55.649 --> 00:10:59.049
the total, then you were able to then go back

00:10:59.049 --> 00:11:03.669
and kind of you had a linear... relationship, polynomial,

00:11:03.669 --> 00:11:07.250
but kind of not by classes, so not by categories,

00:11:07.250 --> 00:11:10.370
more continuous? Yes, no, it was by category. So

00:11:10.370 --> 00:11:12.149
if you look at the paper, there's some

00:11:12.149 --> 00:11:16.350
supplementary information, which are graphs, and

00:11:16.350 --> 00:11:20.269
those graphs show, on the bottom axis, it

00:11:20.269 --> 00:11:23.610
has each Olympic weight category, or each new

00:11:23.610 --> 00:11:26.090
weight category, I should say, and on the vertical

00:11:26.090 --> 00:11:28.990
axis it's got the total. And it's

00:11:28.990 --> 00:11:31.519
a simple case of just intercepting vertically

00:11:31.519 --> 00:11:34.039
and then horizontally to that regression line

00:11:34.039 --> 00:11:37.440
to then find what the predicted total would be

00:11:37.440 --> 00:11:40.740
for that weight category. And for the new weight category

00:11:40.740 --> 00:11:44.059
we use, instead of just putting in the number

00:11:44.059 --> 00:11:49.419
89 or putting in 102, what we did is we took the

00:11:49.419 --> 00:11:52.779
new weight category information that had been

00:11:52.779 --> 00:11:56.080
competed in, which was, I think, only two European

00:11:56.080 --> 00:11:59.100
Champs, maybe one or two World Championships. We

00:11:59.100 --> 00:12:01.440
just took the average weight as opposed to just

00:12:01.440 --> 00:12:03.399
taking that body weight. The reason we took the

00:12:03.399 --> 00:12:05.960
average body weight is because on the top end

00:12:05.960 --> 00:12:08.779
of the spectrum, where you have no end cap to

00:12:08.779 --> 00:12:11.139
be as heavy as you want, we felt like

00:12:11.139 --> 00:12:14.220
that would influence the predictability. So to

00:12:14.220 --> 00:12:16.379
keep things fair and logical, we just went, take

00:12:16.379 --> 00:12:18.539
the average of each new weight category, because

00:12:18.539 --> 00:12:21.039
we know that we have that information, and let's

00:12:21.039 --> 00:12:24.559
put that in, because it represents the mean of

00:12:24.559 --> 00:12:27.320
the group of their body weight. So we're not

00:12:27.320 --> 00:12:30.919
just going to put 89 in if most of them fall

00:12:30.919 --> 00:12:34.779
around 88.7 or 88.6, for example. So that's

00:12:34.779 --> 00:12:37.320
how we predict it. So old weight category data

00:12:37.320 --> 00:12:41.159
plotted and new weight category data used as

00:12:41.159 --> 00:12:44.320
the intercept. Okay. I'm looking through them

00:12:44.320 --> 00:12:46.820
because I thought maybe we could just put a graph

00:12:46.820 --> 00:12:50.100
on top of the video. The version I have is before

00:12:50.100 --> 00:12:53.350
publication. Publisher shouldn't... But it's like

00:12:53.350 --> 00:12:56.570
every zone will have a separate... I can share my

00:12:56.570 --> 00:12:58.830
screen if you like. Yeah, so I've got

00:12:58.830 --> 00:13:01.210
a graph here for, like, the Olympic medal zone,

00:13:01.210 --> 00:13:05.029
which I'm happy to show you. Awesome. So

00:13:05.029 --> 00:13:08.700
this is part of the supplementary materials.

00:13:09.000 --> 00:13:13.139
For example, let's say this regression line that

00:13:13.139 --> 00:13:15.419
runs through the middle of these squares were

00:13:15.419 --> 00:13:18.580
developed using the old weight category data.

00:13:18.899 --> 00:13:21.279
We've then got some confidence intervals around

00:13:21.279 --> 00:13:23.899
it, 95% confidence intervals. So if we were

00:13:23.899 --> 00:13:26.000
to run this on, say, a homogeneous population,

00:13:26.360 --> 00:13:30.179
run a regression 100 times, 95 of those regression

00:13:30.179 --> 00:13:32.679
lines will likely fall within this range. We

00:13:32.679 --> 00:13:34.960
have a predictive interval on the outside. Predictive

00:13:34.960 --> 00:13:37.419
intervals are a little bit wider. but these essentially

00:13:37.419 --> 00:13:42.399
account for any future individual cases which

00:13:42.399 --> 00:13:44.779
might come in. And that's why they're extremely

00:13:44.779 --> 00:13:48.639
wide. So for example, in the future, if one person

00:13:48.639 --> 00:13:51.539
lifts in the 96, they're probably going to be

00:13:51.539 --> 00:13:53.779
between this bound and this bound here, the dashed

00:13:53.779 --> 00:13:57.120
line. So we can see for the men's Olympic medal

00:13:57.120 --> 00:14:00.659
zone, so first and third, the prediction is this

00:14:00.659 --> 00:14:03.139
value here. So approximately, let's just say

00:14:03.139 --> 00:14:06.000
400 kilos. If you want the actual values,

00:14:06.000 --> 00:14:09.639
they are available in the tables, rather than paralyzing

00:14:09.639 --> 00:14:12.360
people with loads of numbers. So let's just take

00:14:12.360 --> 00:14:15.399
it as 400. So that's our prediction. And then

00:14:15.399 --> 00:14:18.279
after the Olympics, I took an average of the

00:14:18.279 --> 00:14:21.500
medal zone and I took a standard deviation and

00:14:21.500 --> 00:14:24.460
I simply plotted it on top. So this shows that

00:14:24.460 --> 00:14:27.279
actually it was an overprediction compared to

00:14:27.279 --> 00:14:30.000
what actually happened. However, the top end

00:14:30.000 --> 00:14:34.610
almost reaches that 95% confidence range. So

00:14:34.610 --> 00:14:37.289
while you might look at this and think, or somebody

00:14:37.289 --> 00:14:39.190
may look at this and think, well, that's an

00:14:39.190 --> 00:14:42.190
overprediction, it's rubbish, in actual fact you'd

00:14:42.190 --> 00:14:43.830
rather have an overprediction than an

00:14:43.830 --> 00:14:46.090
underprediction, because it means you're trying to

00:14:46.090 --> 00:14:49.350
gear up to lift, like, that weight, which is,

00:14:49.350 --> 00:14:51.649
you know, far heavier than what was actually achieved.

00:14:51.649 --> 00:14:54.470
So it might mean on game day, if you're expecting

00:14:54.470 --> 00:14:57.950
this to happen and you lift close to it and this

00:14:57.950 --> 00:15:00.529
actually happens, you end up overachieving and

00:15:00.529 --> 00:15:03.230
potentially getting a medal. Whereas if we look

00:15:03.230 --> 00:15:06.230
at the 73, the prediction's almost spot on in

00:15:06.230 --> 00:15:09.509
terms of the average. So second place was really,

00:15:09.669 --> 00:15:11.850
really close. But if you achieve anything between

00:15:11.850 --> 00:15:14.029
these red lines, you're likely to then have got

00:15:14.029 --> 00:15:16.809
a medal. In the same breath with predictive intervals

00:15:16.809 --> 00:15:19.870
here, if you were to achieve this value, it would

00:15:19.870 --> 00:15:22.429
have put you outside the medal zone. So with

00:15:22.429 --> 00:15:25.029
these confidence intervals that we have, we're

00:15:25.029 --> 00:15:26.850
just trying to account for as much error as possible.
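
NOTE
Editor's sketch, not the paper's method: why the interval for a single future result is wider than the confidence interval around the fitted line.
For brevity this assumes a straight-line fit (the paper used a polynomial) and a normal approximation in place of the t distribution.
The function name line_intervals and every number below are made up for illustration and are NOT data from the paper.
```python
from statistics import NormalDist, mean
def line_intervals(xs, ys, x0, level=0.95):
    """Fit y = a + b*x; return (estimate, confidence half-width, prediction half-width) at x0."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    intercept = y_bar - slope * x_bar
    residual_ss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    sigma = (residual_ss / (n - 2)) ** 0.5  # residual standard error
    z = NormalDist().inv_cdf(0.5 + level / 2)  # normal approximation to the t quantile
    leverage = 1 / n + (x0 - x_bar) ** 2 / sxx
    ci_half = z * sigma * leverage ** 0.5  # uncertainty about the fitted line itself
    pi_half = z * sigma * (1 + leverage) ** 0.5  # adds the scatter of one new result
    return intercept + slope * x0, ci_half, pi_half
# Illustrative body weights and totals only.
body_weights = [56, 62, 69, 77, 85, 94, 105]
totals = [290, 318, 341, 365, 382, 398, 415]
estimate, ci_half, pi_half = line_intervals(body_weights, totals, 96)
print(round(estimate, 1), round(ci_half, 1), round(pi_half, 1))
```
The extra 1 inside the square root is the spread of an individual lift, which is why the outer band on the graphs is always the wider one.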

00:15:26.889 --> 00:15:29.850
But being at a competition is extremely dynamic

00:15:29.850 --> 00:15:32.490
and anything can happen, especially at highly

00:15:32.490 --> 00:15:35.629
competitive competitions. But with the ability

00:15:35.629 --> 00:15:38.529
to predict, at least we have some opportunity

00:15:38.529 --> 00:15:42.110
to strategize longitudinally or even just going

00:15:42.110 --> 00:15:44.289
into competition. So I think it certainly has

00:15:44.289 --> 00:15:47.769
a utility in sport. Should we have a look at

00:15:47.769 --> 00:15:52.649
the 9th to 10th place zone or to 11th to 15th?

00:15:52.710 --> 00:15:55.750
How, in comparison, will it be wider? How will it

00:15:55.750 --> 00:16:00.190
look? Yeah, so as we expected, the lower down the

00:16:00.190 --> 00:16:02.549
ranks you get, the wider the interval becomes,

00:16:02.549 --> 00:16:05.549
and that's for pretty much most competitions,

00:16:05.549 --> 00:16:08.750
more so for the Olympics than the World Champs, I believe.

00:16:08.750 --> 00:16:12.269
And the reason being is the Olympics is about equality

00:16:12.269 --> 00:16:15.529
and inclusion and representation. So, you know,

00:16:15.529 --> 00:16:19.710
while you are getting very, very good weightlifters,

00:16:19.710 --> 00:16:22.590
your top eight are the ones that are really world

00:16:22.590 --> 00:16:24.529
leading, because they're the ones that have qualified

00:16:24.529 --> 00:16:26.679
by being the best in the world. The rest then

00:16:26.679 --> 00:16:30.539
get allocated to continents. Now, they might

00:16:30.539 --> 00:16:33.399
be the best in the continent. However, in the

00:16:33.399 --> 00:16:35.840
overall world ranking, they may not be so high,

00:16:35.980 --> 00:16:40.419
which means naturally from maybe ninth down,

00:16:40.659 --> 00:16:42.779
you're going to start to see these confidence

00:16:42.779 --> 00:16:45.720
intervals spread a lot more than you would at

00:16:45.720 --> 00:16:49.320
the higher echelons of performance, which is

00:16:49.320 --> 00:16:52.879
quite evident when you get to 11th to 15th. In

00:16:52.879 --> 00:16:55.220
terms of prediction and the strength of the

00:16:55.220 --> 00:16:58.139
predictions, would you say that they are a little

00:16:58.139 --> 00:17:01.279
bit less accurate for these places too, the prediction

00:17:01.279 --> 00:17:04.000
model? It depends how you view it, because in

00:17:04.000 --> 00:17:06.460
theory, if you were to say, how good is your

00:17:06.460 --> 00:17:08.839
prediction, in terms of how good is your predictability?

00:17:09.160 --> 00:17:11.059
Everything that's happened has fallen within

00:17:11.059 --> 00:17:13.099
the predictive intervals. I see. So actually

00:17:13.099 --> 00:17:15.200
it's a hundred percent. Yes, we can predict.

00:17:15.339 --> 00:17:18.700
The problem is, is the actual value itself. The

00:17:18.700 --> 00:17:21.660
range is far too large that we're predicting.

00:17:21.859 --> 00:17:25.660
So, you know, it's... I suppose it'd be the equivalent

00:17:25.660 --> 00:17:29.440
of saying, think of a number between 1 and 3.

00:17:29.599 --> 00:17:32.500
I'm more likely to get the number correct, right?

00:17:32.839 --> 00:17:35.599
But if I open up, think of a number between 1

00:17:35.599 --> 00:17:38.299
and 100, I've made my chance a lot smaller to

00:17:38.299 --> 00:17:41.579
predict that. So it's similar to there. Whereas

00:17:41.579 --> 00:17:44.160
if I just scoot up to, I mean, the medal zone

00:17:44.160 --> 00:17:47.960
here, you can see the 11th to 15th zone is really,

00:17:48.119 --> 00:17:51.500
really wide. And a lot of the actual, actually

00:17:51.500 --> 00:17:53.670
what happened was an underprediction in comparison,

00:17:54.029 --> 00:17:56.730
sorry, what actually happened was less than what

00:17:56.730 --> 00:17:59.009
was predicted. Whereas when we look at the World

00:17:59.009 --> 00:18:02.789
Champs, first to third, a lot of them are very

00:18:02.789 --> 00:18:05.869
close, or the standard deviations cross the prediction,

00:18:06.069 --> 00:18:08.529
right, with the exception of the 89 and the 102.

00:18:08.869 --> 00:18:11.250
And you can see the prediction interval is smaller,

00:18:11.490 --> 00:18:14.289
confidence interval is smaller, and what was

00:18:14.289 --> 00:18:17.269
predicted as a point estimate and what actually

00:18:17.269 --> 00:18:19.809
happened were very close. Would you say it's

00:18:19.809 --> 00:18:22.670
kind of a ceiling effect with the medal zone

00:18:22.670 --> 00:18:27.269
versus 11th to 15th? Yeah, the margins up top are

00:18:27.269 --> 00:18:30.269
a lot smaller, yeah, for sure. One of the

00:18:30.269 --> 00:18:33.549
things we have discussed from a practical and

00:18:33.549 --> 00:18:36.109
an applied perspective is looking at first, second,

00:18:36.109 --> 00:18:39.170
and third, and looking at how they may cross over

00:18:39.170 --> 00:18:42.990
using similar models. But as soon as you do that,

00:18:42.990 --> 00:18:46.490
your sample just gets a lot smaller anyway, so

00:18:46.490 --> 00:18:49.710
the ability to predict and generalize

00:18:49.710 --> 00:18:52.289
might be reduced. But if that's what you have

00:18:52.289 --> 00:18:55.470
available, at least, again, you have something that

00:18:55.470 --> 00:18:57.849
enables you to strategize, as opposed to going,

00:18:57.849 --> 00:19:00.450
well, it's not robust statistically, so let's not

00:19:00.450 --> 00:19:03.029
use it at all. I think that's where practice and

00:19:03.029 --> 00:19:06.369
application and statistics and academia need

00:19:06.369 --> 00:19:09.150
to kind of come together and go, right, this is

00:19:09.150 --> 00:19:11.089
what we should do, and this is how we should be

00:19:11.089 --> 00:19:14.009
as robust as possible. These are the confinements

00:19:14.009 --> 00:19:16.450
we have, or the barriers that we have. What's the

00:19:16.450 --> 00:19:19.970
next best thing that we can do? So also, as we

00:19:19.970 --> 00:19:22.329
progress, there will be more data, because

00:19:22.329 --> 00:19:24.930
there will be more competitions with the new

00:19:24.930 --> 00:19:27.720
weight classes. Yes, exactly. And that's

00:19:27.720 --> 00:19:30.839
one of the things that we're looking to do for

00:19:30.839 --> 00:19:33.599
the women's weight categories. So this was just

00:19:33.599 --> 00:19:35.660
done for the men. But what we'll do this time

00:19:35.660 --> 00:19:38.819
is because these new weight category totals exist,

00:19:39.160 --> 00:19:41.839
we're going to add them to the old data. So you

00:19:41.839 --> 00:19:45.799
then have an entire spectrum of old data and

00:19:45.799 --> 00:19:49.420
new data in. So you'll have a range of body weights

00:19:49.420 --> 00:19:52.319
and then see if our predictability gets better

00:19:52.319 --> 00:19:54.180
that way, because there still won't be enough

00:19:54.180 --> 00:19:57.299
data of the new weight categories. Because, you

00:19:57.299 --> 00:19:59.720
know, we're confining it down to weight categories

00:19:59.720 --> 00:20:02.500
and we're further confining it down to competition

00:20:02.500 --> 00:20:05.119
type. So every time, and then performance zone.

00:20:05.240 --> 00:20:07.059
So every time you're getting smaller and smaller

00:20:07.059 --> 00:20:10.539
and smaller. So that's the next step that we're

00:20:10.539 --> 00:20:12.839
planning to do over the next few months. That's

00:20:12.839 --> 00:20:16.900
exciting. I forgot to get the answer to the question.

00:20:17.079 --> 00:20:20.299
How did you handle the reappearance of the athletes?

00:20:20.440 --> 00:20:24.539
Did you reduce the data set because it took maybe

00:20:24.539 --> 00:20:27.799
the best one, or did you leave it as it was? How did

00:20:27.799 --> 00:20:29.640
you handle that? Yeah, really good question.

00:20:29.859 --> 00:20:31.579
I feel like you might have been the one that

00:20:31.579 --> 00:20:34.299
reviewed my paper because it began with a question.

00:20:34.900 --> 00:20:37.680
And it's a really valid point, especially as

00:20:37.680 --> 00:20:40.579
it affects the generalizability of the models.

00:20:40.740 --> 00:20:44.579
But what we did is we treat each total as an

00:20:44.579 --> 00:20:47.319
individual case and not each athlete. And we

00:20:47.319 --> 00:20:50.099
did that because performance changes over time.

00:20:50.200 --> 00:20:52.980
So it'll either go up or down, and therefore so will

00:20:52.980 --> 00:20:55.529
the rank that the total achieves or that person

00:20:55.529 --> 00:20:59.009
achieves and given we're trying to predict totals

00:20:59.009 --> 00:21:01.569
in specific performance zones it's important

00:21:01.569 --> 00:21:03.569
to capture this information if we want to ensure

00:21:03.569 --> 00:21:06.829
our model is representative of what might actually

00:21:06.829 --> 00:21:09.690
happen but also like you mentioned if we use

00:21:09.690 --> 00:21:12.849
if we select the athlete and what competition

00:21:12.849 --> 00:21:15.809
and what total we use to analyze then we run

00:21:15.809 --> 00:21:18.089
into selection bias you know which one did we

00:21:18.089 --> 00:21:20.390
choose and why would we have chosen that And

00:21:20.390 --> 00:21:22.869
as a result of that, you also significantly reduce

00:21:22.869 --> 00:21:24.930
the sample. So not only have you got selection

00:21:24.930 --> 00:21:28.069
bias, you've got sample reduction as well. And

00:21:28.069 --> 00:21:30.970
with that in turn, you then reduce your ability

00:21:30.970 --> 00:21:32.849
to predict, which is our point of the paper.
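
NOTE
Editor's sketch, not part of the spoken interview: a minimal illustration of the trade-off described above, assuming made-up athletes and totals (none of these numbers come from the paper). Treating every total as its own case keeps all five data points; keeping one total per athlete needs a selection rule (here, their best) and shrinks the sample.
```python
# Hypothetical results: (athlete, competition, total in kg). Not data from the paper.
results = [
    ("Athlete A", "Europeans", 340),
    ("Athlete A", "Worlds", 331),
    ("Athlete B", "Worlds", 345),
    ("Athlete B", "Olympics", 350),
    ("Athlete C", "Olympics", 338),
]
# Option 1 (the approach described in the interview): every total is its own case.
cases_per_total = [total for _, _, total in results]
# Option 2 (the alternative): keep one total per athlete, here their best.
best_per_athlete = {}
for athlete, _, total in results:
    best_per_athlete[athlete] = max(total, best_per_athlete.get(athlete, 0))
cases_per_athlete = sorted(best_per_athlete.values())
# Compare the two sample sizes.
print(len(cases_per_total), len(cases_per_athlete))
```
Choosing "best" rather than "worst" or "most recent" in option 2 is exactly the kind of selection decision the speaker says introduces bias.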

00:21:33.049 --> 00:21:37.309
So using athlete reappearance was just lesser

00:21:37.309 --> 00:21:40.049
of the two evils, I suppose. And we felt like

00:21:40.049 --> 00:21:43.509
it was justified enough to be able to do that.

00:21:43.710 --> 00:21:46.029
So statistically speaking, we would love each

00:21:46.029 --> 00:21:49.549
data point, so each total, to be from a

00:21:49.549 --> 00:21:51.849
separate, different person each time. Different

00:21:51.849 --> 00:21:54.609
person that's not how the competitions run though

00:21:54.609 --> 00:21:58.849
so probably leaving the total as it appeared

00:21:58.849 --> 00:22:02.390
in the competition was more close to the reality

00:22:02.390 --> 00:22:06.890
than selecting one or the other yeah and i'm

00:22:06.890 --> 00:22:10.390
certainly no statistician so my depth of understanding

00:22:10.390 --> 00:22:12.730
of statistics wouldn't be as definitely nowhere

00:22:12.730 --> 00:22:14.589
near as good as someone who's a data scientist

00:22:14.589 --> 00:22:17.549
but this is where as i mentioned before like

00:22:17.549 --> 00:22:20.319
the robustness of this but making sure we're

00:22:20.319 --> 00:22:22.599
getting enough information that can be utilised

00:22:22.599 --> 00:22:25.680
within our field is. It's really important to

00:22:25.680 --> 00:22:28.680
kind of come to an intermediary conclusion where

00:22:28.680 --> 00:22:30.980
we're like, okay, it's robust enough. It's not

00:22:30.980 --> 00:22:33.700
statistically, it might not be perfect, but because

00:22:33.700 --> 00:22:38.079
of what the data is and how the data is collected.

00:22:38.400 --> 00:22:40.259
Because you could have somebody that medals at

00:22:40.259 --> 00:22:43.900
Europeans and then comes 10th in a world championships,

00:22:44.039 --> 00:22:46.779
for example. So which one are you now taking?

00:22:46.900 --> 00:22:48.299
Are you taking their best one? Are you taking

00:22:48.299 --> 00:22:50.740
their worst one? I think it pulls out more problems

00:22:50.740 --> 00:22:53.759
than it does solve, to be honest. Yeah, fair enough.

00:22:53.759 --> 00:22:57.019
i think it was statistics you always are if everything

00:22:57.019 --> 00:23:00.000
is perfect, then the prediction is 100%, but it

00:23:00.000 --> 00:23:02.759
is far away from reality because reality is never

00:23:02.759 --> 00:23:05.500
perfect so you have to i guess find this balance

00:23:05.500 --> 00:23:09.299
so you can find that exactly and if if if data

00:23:09.299 --> 00:23:12.180
was always so perfect and so easy to use especially

00:23:12.180 --> 00:23:15.039
for predictions like i said before time and cheap

00:23:15.039 --> 00:23:16.839
you know i wouldn't be predicting weightlifting

00:23:16.839 --> 00:23:20.269
i'd be predicting lottery numbers But it gives

00:23:20.269 --> 00:23:22.589
us a good starting point at the very least. And

00:23:22.589 --> 00:23:24.869
I'm sure somebody far smarter than me will come

00:23:24.869 --> 00:23:27.930
up with a model that's even better and, you

00:23:27.930 --> 00:23:31.450
know is more maybe more malleable so it can change

00:23:31.450 --> 00:23:33.769
as the dynamics of weightlifting changes as well

00:23:33.769 --> 00:23:37.029
who knows ai is developing quickly maybe you

00:23:37.029 --> 00:23:40.269
will know definitely yeah very quickly is there

00:23:40.269 --> 00:23:43.650
anything from the methods and setup of the study

00:23:43.650 --> 00:23:46.109
i didn't ask about but you think it's important

00:23:46.109 --> 00:23:50.289
Really, no. I feel

00:23:50.289 --> 00:23:52.450
like I've covered it, to be honest. Okay, so are

00:23:52.450 --> 00:23:55.150
we ready for the for the results i would like

00:23:55.150 --> 00:23:59.450
us to go on more you to tell us what were the

00:23:59.450 --> 00:24:02.769
prediction across the competitions and across

00:24:02.769 --> 00:24:07.309
the weight classes okay so that's quite a there's

00:24:07.309 --> 00:24:10.049
going to be a quite a big one to to answer because

00:24:10.049 --> 00:24:14.069
there's so many sub categories of of it because

00:24:14.069 --> 00:24:17.859
you know We're predicting category. We're predicting

00:24:17.859 --> 00:24:21.380
competition type, category. And then under those

00:24:21.380 --> 00:24:23.619
categories, we've got subcategories of performance

00:24:23.619 --> 00:24:26.779
zones. Yes. So if I were to answer it in the

00:24:26.779 --> 00:24:29.279
most simplistic way as an overarching answer,

00:24:29.460 --> 00:24:32.700
yes, the predictive models were pretty good at

00:24:32.700 --> 00:24:34.880
predicting weightlifting performance at different

00:24:34.880 --> 00:24:37.799
performance zones. However, these will vary depending

00:24:37.799 --> 00:24:41.000
on the competition type and depending on the

00:24:41.000 --> 00:24:43.440
performance zone and the weight category that

00:24:43.440 --> 00:24:46.619
you're looking at. What we did see, like we mentioned

00:24:46.619 --> 00:24:49.819
before, is as you come down the performance zones,

00:24:49.980 --> 00:24:53.160
so to the lower ranked lifters, so 11th to 15th,

00:24:53.259 --> 00:24:57.240
the predictability there and the fit of the data

00:24:57.240 --> 00:25:00.019
for the regression line was reduced quite a bit.

00:25:00.279 --> 00:25:04.259
So the R-squared values for the four performance

00:25:04.259 --> 00:25:07.299
zones at the top were all pretty much above 0

00:25:07.299 --> 00:25:11.180
.9. Some were close to 1, like 0 .99, 0 .96,

00:25:11.400 --> 00:25:14.160
0.97. As soon as you get to zone five, 11th

00:25:14.160 --> 00:25:16.700
to 15th, there's a significant drop, you know,

00:25:16.720 --> 00:25:22.539
0.6, 0.89, 0.77 for Olympics, World and European

00:25:22.539 --> 00:25:26.559
champs. I mean, if we were to just pick one at...

00:25:26.750 --> 00:25:32.789
At random, let's say 67 kilo men's category.

00:25:33.130 --> 00:25:36.970
Performance zone two, for example, had a prediction

00:25:36.970 --> 00:25:40.089
of, so between fourth and fifth, the prediction

00:25:40.089 --> 00:25:44.410
was 321 kilos. What actually happened was 321

00:25:44.410 --> 00:25:47.529
kilos plus or minus one kilo. So that's less

00:25:47.529 --> 00:25:52.789
than 0.05%, 0.04% of the difference. And the

00:25:52.789 --> 00:25:54.809
absolute difference was zero. Keep in mind when

00:25:54.809 --> 00:25:58.000
I present the standard deviation in this paper, it's

00:25:58.000 --> 00:26:00.859
not to any decimal. It's always rounded up because

00:26:00.859 --> 00:26:03.599
in weightlifting, we lift to a kilo or more.

00:26:03.759 --> 00:26:07.579
So actually, that was pretty much on the nose.
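
NOTE
Editor's sketch, not part of the spoken interview: one way to reproduce the predicted-versus-actual percentages quoted in this part of the conversation. The totals (321 vs 321 kg, and the 442 vs 414 kg case for the 109+ category) are the ones stated by the speaker; dividing by the predicted total is an assumption, since the interview does not say which denominator the paper uses.
```python
def percent_difference(predicted_kg: float, actual_kg: float) -> float:
    """Absolute difference as a percentage of the predicted total (assumed convention)."""
    return abs(predicted_kg - actual_kg) / predicted_kg * 100
# Men's 67 kg, zone two: predicted 321 kg, actual 321 kg.
print(round(percent_difference(321, 321), 2))  # 0.0
# Men's 109+ kg, zone two at the Olympics: predicted 442 kg, actual 414 kg.
print(round(percent_difference(442, 414), 2))  # 6.33
```
With the actual total as the denominator instead, the second figure would be slightly higher, so the choice of convention matters when comparing against the "just under 6.5%" quoted in the interview.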

00:26:07.839 --> 00:26:11.240
Yeah, it was a bullseye. Whereas if we, let's

00:26:11.240 --> 00:26:15.839
say, move to the 109 plus for zone two, fourth

00:26:15.839 --> 00:26:18.119
to fifth at the Olympic Games, the prediction

00:26:18.119 --> 00:26:22.319
was 442. What actually happened was 414. So the

00:26:22.319 --> 00:26:25.539
difference is just under 6.5%, which is a 28

00:26:25.539 --> 00:26:27.880
kilo difference. That's quite large. So in that

00:26:27.880 --> 00:26:30.519
instance if you've got an athlete that's trying

00:26:30.519 --> 00:26:33.519
to achieve fourth to fifth and you're using that

00:26:33.519 --> 00:26:36.519
as a predictive model it's certainly over achieving

00:26:36.519 --> 00:26:40.259
or over predicting so if they go in lifting like

00:26:40.259 --> 00:26:44.319
420 at the very minimum and they're like oh i'm

00:26:44.319 --> 00:26:47.059
never gonna get fourth or fifth you'd be surprised

00:26:47.059 --> 00:26:51.000
because in this instance 414 was the total which

00:26:51.000 --> 00:26:53.200
kind of brings me on to something else which

00:26:53.200 --> 00:26:55.859
I think I would like to look into in more depth

00:26:55.859 --> 00:26:59.799
is how these performance zones cross over. I

00:26:59.799 --> 00:27:03.880
think it's really important. For example, we

00:27:03.880 --> 00:27:06.940
have our standard deviation. We have our average.

00:27:07.099 --> 00:27:09.599
We have our standard deviation. And let's say

00:27:09.599 --> 00:27:12.960
this is our medal zone. Then we'll have our fourth

00:27:12.960 --> 00:27:15.579
to fifth and so on and so forth. So we'll keep

00:27:15.579 --> 00:27:17.720
that down. But they cross over at some point.

00:27:17.859 --> 00:27:20.440
So it just changes the conversation you have.
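
NOTE
Editor's sketch, not part of the spoken interview: a minimal illustration of performance zones crossing over, assuming each zone is summarised as an average total plus or minus one standard deviation. The zone names and all numbers are hypothetical and chosen only so that a 300 kg total falls inside two bands at once.
```python
# Hypothetical zone averages and standard deviations in kg (illustration only).
zones = {"medal": (310, 12), "4th-5th": (295, 8), "6th-8th": (280, 7)}
def zones_for_total(total_kg, zones):
    """Return every zone whose mean +/- one SD band contains the total."""
    return [name for name, (mean, sd) in zones.items() if mean - sd <= total_kg <= mean + sd]
# A 300 kg total sits above the 4th-5th average but also inside the medal band.
print(zones_for_total(300, zones))  # ['medal', '4th-5th']
```
A total that appears in more than one band is the situation the speaker describes: above the fourth-to-fifth average, yet still a possible medal.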

00:27:20.910 --> 00:27:24.150
like look if you achieve a very random number

00:27:24.150 --> 00:27:27.390
if you achieve 300 kilos you're going to be above

00:27:27.390 --> 00:27:30.130
the average of fourth to fifth but it also means

00:27:30.130 --> 00:27:33.210
you might medal. So really, when it comes to strategizing,

00:27:33.210 --> 00:27:36.289
it could be an instance of okay let's open you

00:27:36.289 --> 00:27:38.569
at this number we know what you're capable of

00:27:38.569 --> 00:27:40.410
let's open you at this this will give you the

00:27:40.410 --> 00:27:43.549
best opportunity to get 300 or more which might

00:27:43.549 --> 00:27:45.970
lead to a medal naturally when you take that

00:27:45.970 --> 00:27:48.700
information to a competition, as you see the

00:27:48.700 --> 00:27:51.259
board start to change, that's when you can make

00:27:51.259 --> 00:27:54.720
a little bit more of a freer choice of what load

00:27:54.720 --> 00:27:58.619
to put on because this information is there to

00:27:58.619 --> 00:28:01.579
just provide some form of strategy moving into

00:28:01.579 --> 00:28:04.119
competition. Once you're in competition, it's

00:28:04.119 --> 00:28:06.839
then down to intuition and how to read the board.

00:28:07.000 --> 00:28:09.319
So I think that's one of the steps that would

00:28:09.319 --> 00:28:11.680
be nice to look at in the future. So the zone

00:28:11.680 --> 00:28:15.059
will have the number and then confidence and

00:28:15.059 --> 00:28:18.309
then standard deviation. so it's the plus minus

00:28:18.309 --> 00:28:22.750
however many kilos so that's the number in theory

00:28:22.750 --> 00:28:26.470
you would need to total to be in that zone to

00:28:26.470 --> 00:28:29.930
so to place in that zone but it can be less or

00:28:29.930 --> 00:28:33.190
more depending who is coming to the competition

00:28:33.190 --> 00:28:36.329
and exactly it and it all depends on the day

00:28:36.329 --> 00:28:40.069
and i think That's really important. And to come

00:28:40.069 --> 00:28:43.509
back round to the original reason why this idea

00:28:43.509 --> 00:28:46.670
kind of came up is we wanted to know the realities

00:28:46.670 --> 00:28:50.160
of finishing within the top 10. But

00:28:50.160 --> 00:28:52.980
we also know that on the day, anything can happen.

00:28:53.180 --> 00:28:54.900
So while we'll have a strategy of, look, if you're

00:28:54.900 --> 00:28:57.759
going to attempt 200 kilo clean and jerk, we

00:28:57.759 --> 00:28:59.900
need to start you at this. Then we look back

00:28:59.900 --> 00:29:02.640
at that individual's information. Okay, the last

00:29:02.640 --> 00:29:06.460
time you hit 200, you started at this. This is

00:29:06.460 --> 00:29:08.180
how training has been going. And it becomes a

00:29:08.180 --> 00:29:11.180
conversation, and it has to be very much a relationship-based

00:29:11.180 --> 00:29:13.440
conversation to have because it's a big

00:29:13.440 --> 00:29:16.099
decision to make. I'm not going to completely

00:29:16.099 --> 00:29:18.980
go off of a predictive model. But like I said,

00:29:19.000 --> 00:29:21.740
it provides a starting point, provides us an

00:29:21.740 --> 00:29:24.680
ability to strategize. And also from a funding

00:29:24.680 --> 00:29:27.920
perspective, models like this might help identify

00:29:27.920 --> 00:29:30.940
trajectories of individuals and their development.

00:29:31.369 --> 00:29:34.329
And if you have a relatively robust predictive

00:29:34.329 --> 00:29:38.269
model and their trajectory is moving to a performance

00:29:38.269 --> 00:29:42.029
zone, which is aligned to funding, then you can

00:29:42.029 --> 00:29:44.789
say, look, this is what we're predicting is going

00:29:44.789 --> 00:29:47.109
to be the medal zone. This is the trajectory

00:29:47.109 --> 00:29:49.650
of the athlete. In the next four years, it's

00:29:49.650 --> 00:29:52.130
very likely they're able to achieve this. So

00:29:52.130 --> 00:29:54.910
it's got multiple applications. Okay, now in

00:29:54.910 --> 00:29:58.849
terms of how the predictions you made were true

00:29:58.849 --> 00:30:02.569
or accurate or close to. accurate in terms of

00:30:02.569 --> 00:30:05.269
type of the competition was it easier to predict

00:30:05.269 --> 00:30:08.690
the totals for the Olympics, Worlds, or Europeans?

00:30:08.690 --> 00:30:11.450
It was all easy to predict. It's just how strong,

00:30:12.089 --> 00:30:14.390
Well, how good were those predictions? For the

00:30:14.390 --> 00:30:16.670
first four performance zones, pretty decent.

00:30:16.869 --> 00:30:19.930
For the fifth performance zone, 11th to 15th,

00:30:19.950 --> 00:30:23.710
they weren't good. But generally speaking, there

00:30:23.710 --> 00:30:26.450
were more overestimations than underestimations.

00:30:26.490 --> 00:30:29.589
A potential reason for that could be the impact

00:30:29.589 --> 00:30:32.549
of COVID and people now just returning back into

00:30:32.549 --> 00:30:35.329
competition. So I'm very, very much excited to

00:30:35.329 --> 00:30:37.880
look at. what's happened this year and comparing

00:30:37.880 --> 00:30:40.420
it to our prediction model but also as i mentioned

00:30:40.420 --> 00:30:42.660
for the women's one where we developed that adding

00:30:42.660 --> 00:30:45.880
the current existing new category data and seeing

00:30:45.880 --> 00:30:48.279
if that has an even better ability to predict

00:30:48.279 --> 00:30:52.980
So was it better for 11th to 15th for the Worlds? Because

00:30:52.980 --> 00:30:55.480
I would assume that Worlds would be the most

00:30:55.480 --> 00:31:00.299
stable in terms of... Yes, yeah, I'm of the same

00:31:00.299 --> 00:31:02.619
opinion, and I'm looking at it now. It

00:31:02.619 --> 00:31:06.650
was. The confidence intervals were smaller. Okay, but

00:31:06.650 --> 00:31:08.670
you still have wide predictive intervals but

00:31:08.670 --> 00:31:10.809
if we forget the predictive intervals for a second

00:31:10.809 --> 00:31:12.710
they were still narrower than they were for the

00:31:12.710 --> 00:31:15.170
olympic one but you're right in saying that they

00:31:15.170 --> 00:31:18.230
in theory should be more stable for worlds because

00:31:18.230 --> 00:31:21.809
you actually have the top 15 in the world so

00:31:21.809 --> 00:31:24.250
the variation will be far less in comparison

00:31:24.250 --> 00:31:27.470
to like i mentioned the olympics where you're

00:31:27.470 --> 00:31:30.720
not necessarily getting the top 15 in the world

00:31:30.720 --> 00:31:33.539
and i say top 15 is because that's the sample

00:31:33.539 --> 00:31:37.200
that we extracted for analysis i had played around

00:31:37.200 --> 00:31:40.640
with commonwealth championships and i mean that

00:31:40.640 --> 00:31:45.059
was incredibly varied but you expect commonwealth

00:31:45.059 --> 00:31:47.799
championships was extremely varied it's not something

00:31:47.799 --> 00:31:50.759
i i ended up pursuing because after doing it

00:31:50.759 --> 00:31:53.579
a couple of times I realized there's probably

00:31:53.579 --> 00:31:56.720
not a huge amount of value in doing it for this

00:31:56.720 --> 00:31:59.799
because the variations were so large, the confidence

00:31:59.799 --> 00:32:02.000
intervals and predictive intervals. So the only

00:32:02.000 --> 00:32:04.180
other way of maybe looking at that one would

00:32:04.180 --> 00:32:07.460
be just the medal zone, possibly. Certainly for

00:32:07.460 --> 00:32:11.180
England, where we are eyeing up medals at that

00:32:11.180 --> 00:32:14.619
specific competition. So possibly it becomes

00:32:14.619 --> 00:32:18.000
a bit more useful. Yeah. The sample is way smaller,

00:32:18.039 --> 00:32:20.619
even comparing to Europeans, less countries.

00:32:21.200 --> 00:32:23.799
it makes sense but you would hope that it's at

00:32:23.799 --> 00:32:26.359
least if you could only predict the medal zone,

00:32:26.359 --> 00:32:29.640
it would still be worth it yeah i think so i

00:32:29.640 --> 00:32:32.160
think it would be and again it's only to open

00:32:32.160 --> 00:32:34.799
up the conversation of if first second and third

00:32:34.799 --> 00:32:37.380
will cross over you know it's like okay if you

00:32:37.380 --> 00:32:41.460
hit a 200 total that's a potential bronze medal

00:32:41.460 --> 00:32:45.200
but it also is the lower bound of the silver

00:32:45.200 --> 00:32:48.420
medal. So hit 200 and you should be in with a shout

00:32:48.420 --> 00:32:50.630
for a medal so you know it drops conversations

00:32:50.630 --> 00:32:53.410
and i think that's that becomes really important

00:32:53.410 --> 00:32:57.970
what a motivation yeah Yeah, exactly. And I think

00:32:57.970 --> 00:33:01.690
sometimes it also helps, potentially can help

00:33:01.690 --> 00:33:05.710
reaffirm what a coach is thinking in terms of

00:33:05.710 --> 00:33:07.849
what they should be achieving going into a comp

00:33:07.849 --> 00:33:09.809
or what they'd like to achieve going into a comp

00:33:09.809 --> 00:33:12.029
or what they think the medal zone will look like.

00:33:12.150 --> 00:33:14.630
Can we talk a little bit about the weight classes?

00:33:14.829 --> 00:33:17.769
Because weight classes are not every five kilos.

00:33:17.910 --> 00:33:20.930
They are differently organized in weightlifting

00:33:20.930 --> 00:33:24.490
because the population is spread differently, like

00:33:24.490 --> 00:33:28.889
curves so were there any differences in prediction

00:33:28.889 --> 00:33:31.609
and strength of the prediction in terms of weight

00:33:31.609 --> 00:33:35.630
classes? Yes. So one of the issues we considered

00:33:35.630 --> 00:33:38.609
that we would have is on the two ends of the

00:33:38.609 --> 00:33:40.710
weight spectrum, the lightest and the heaviest.

00:33:40.910 --> 00:33:42.970
But then we realized the lightest probably doesn't

00:33:42.970 --> 00:33:45.049
matter as much because you can be as light as

00:33:45.049 --> 00:33:47.029
you want, but nobody is as light as they want.

00:33:47.130 --> 00:33:51.049
They're as close to the, let's say, 55 kilo threshold

00:33:51.049 --> 00:33:54.910
as possible. Whereas your 109 plus is you're

00:33:54.910 --> 00:33:57.779
going to have people weighing 115 kilos and

00:33:57.779 --> 00:34:01.900
170 kilos. There's a huge span. That's why we

00:34:01.900 --> 00:34:05.599
opted to take the average weight of those that

00:34:05.599 --> 00:34:08.559
competed in the heavier top -end weight classes,

00:34:08.780 --> 00:34:10.880
because it's the best representation of that

00:34:10.880 --> 00:34:16.119
group average. So what we did see, I mean, I've

00:34:16.119 --> 00:34:18.440
got the tables in front of me only because I'm

00:34:18.440 --> 00:34:19.980
not going to remember these numbers off the top

00:34:19.980 --> 00:34:23.940
of my head, but the World Championships 109s,

00:34:24.000 --> 00:34:27.539
within that category, the performance zone prediction

00:34:27.539 --> 00:34:30.840
became... slightly worse. When you look at the

00:34:30.840 --> 00:34:33.900
same performance zone between different categories,

00:34:34.099 --> 00:34:37.300
it just varies. So for example, the medal zone

00:34:37.300 --> 00:34:39.800
for the 89 kilos, the difference between the

00:34:39.800 --> 00:34:42.829
prediction and the actual was 3.40%.

00:34:42.829 --> 00:34:46.969
The prediction for the 109 plus was, I mean,

00:34:47.010 --> 00:34:51.510
just over 0.05%, 0.06%. So it's pretty much,

00:34:51.530 --> 00:34:53.489
again, on the money. And remember, these are

00:34:53.489 --> 00:34:56.150
averages. So if anything, it tells you silver

00:34:56.150 --> 00:34:58.730
medal, not gold or bronze. So yeah, it does vary.

00:34:58.869 --> 00:35:02.590
There's no pattern I could see between the different

00:35:02.590 --> 00:35:05.230
weight categories at different performance zones.

00:35:05.389 --> 00:35:11.190
But there are less lifters in the 55 from what

00:35:11.190 --> 00:35:13.079
I remember, the lighter categories, which

00:35:13.079 --> 00:35:15.739
means the data is kind of a bit more

00:35:15.739 --> 00:35:18.639
spread oh could you could you repeat um i didn't

00:35:18.639 --> 00:35:22.380
understand the point in the 55 there wasn't in

00:35:22.380 --> 00:35:25.099
the world champs so what i didn't notice so in

00:35:25.099 --> 00:35:28.159
the world champs the 55s had the prediction to

00:35:28.159 --> 00:35:30.679
the actual the percentages were all above five

00:35:30.679 --> 00:35:32.739
percent difference which is quite big but the

00:35:32.739 --> 00:35:34.860
reason being it's not an olympic weight category

00:35:34.860 --> 00:35:39.199
right? And neither was 89, and neither was

00:35:39.199 --> 00:35:42.769
102 at the time the data was collected. And then

00:35:42.769 --> 00:35:44.469
because people are changing weight category,

00:35:44.590 --> 00:35:47.289
naturally these values will end up changing.

00:35:47.610 --> 00:35:49.909
And that's why when you look at the paper, there's

00:35:49.909 --> 00:35:52.869
no difference between the predicted and the actual

00:35:52.869 --> 00:35:55.789
for the new weight categories at the Olympics.

00:35:56.090 --> 00:36:00.469
Because the 55, the 89 and the 102 will be contested

00:36:00.469 --> 00:36:02.630
in the up and coming Olympics, but they weren't

00:36:02.630 --> 00:36:04.429
in the last Olympics. I don't know. If you're

00:36:04.429 --> 00:36:07.050
a fan of weightlifting, like your head is spinning

00:36:07.050 --> 00:36:09.809
because every Olympic cycle we have

00:36:09.929 --> 00:36:13.690
changes: how others qualify, what the weight

00:36:13.690 --> 00:36:18.730
classes are and so on. So strap in for the ride.

00:36:18.909 --> 00:36:20.889
No, I think it's just important to highlight

00:36:20.889 --> 00:36:24.309
that the predictive model paper is hopefully

00:36:24.309 --> 00:36:27.550
a starting point that can be developed further.

00:36:27.789 --> 00:36:30.739
It's certainly a good opportunity to help

00:36:30.739 --> 00:36:33.719
strategize going into major competitions opens

00:36:33.719 --> 00:36:37.280
up the doors to try and apply this to other areas

00:36:37.280 --> 00:36:40.900
other sports but also you know masters competitions

00:36:40.900 --> 00:36:45.260
youth juniors and look at how that changes longitudinally

00:36:45.260 --> 00:36:48.559
as well and yeah i think the more the more data

00:36:48.559 --> 00:36:52.099
that we we end up getting hopefully the predictability

00:36:52.099 --> 00:36:55.199
ends up being a lot more refined and a lot more

00:36:55.199 --> 00:36:57.500
accurate and it's also really important to note

00:36:57.500 --> 00:37:01.260
that those who had sanctions or were banned were

00:37:01.260 --> 00:37:05.099
extracted from the data when it was known that

00:37:05.099 --> 00:37:08.159
they had a ban in some cases there might be athletes

00:37:08.159 --> 00:37:11.000
in there who haven't yet been sanctioned or might

00:37:11.000 --> 00:37:14.039
be in the future but taking that one person out

00:37:14.039 --> 00:37:17.019
if it's in a medal zone for example could influence

00:37:17.019 --> 00:37:19.320
predictability but there's no controlling that

00:37:19.320 --> 00:37:22.699
Yes. So like I said earlier, hopefully someone

00:37:22.699 --> 00:37:23.980
can develop something that's a little bit more

00:37:23.980 --> 00:37:27.219
malleable and as things change within the field

00:37:27.219 --> 00:37:29.889
the data changes in the process of the prediction

00:37:29.889 --> 00:37:33.550
ends up updating as uh as results update but

00:37:33.550 --> 00:37:36.869
i hope i hope people enjoyed the read and found

00:37:36.869 --> 00:37:39.989
it if not useful then at least interesting i

00:37:39.989 --> 00:37:43.070
can't ask you about your favorite color because

00:37:43.070 --> 00:37:45.829
i did it last time but i will ask you about your

00:37:45.829 --> 00:37:49.090
favorite lift and why oh that's a good question

00:37:49.090 --> 00:37:53.070
my favorite lift to do myself or to yes no you

00:37:54.179 --> 00:37:57.539
um i've got to say back squat because i i don't

00:37:57.539 --> 00:37:59.980
weight lift anymore i think i like back squat

00:37:59.980 --> 00:38:02.820
You know, it's a good exercise for lower

00:38:02.820 --> 00:38:05.119
body strength and you can you know add a bit

00:38:05.119 --> 00:38:06.940
of weight to it, but it also humbles you

00:38:06.940 --> 00:38:10.360
very quickly as well might be a day where you

00:38:10.360 --> 00:38:13.659
don't feel so good and you'll find out very quickly

00:38:13.659 --> 00:38:16.320
so it's got to be a back squat a boring answer

00:38:16.320 --> 00:38:19.980
i'm afraid but no no in terms of coaching do

00:38:19.980 --> 00:38:21.980
you prefer coaching snatch or clean and jerk

00:38:22.989 --> 00:38:27.489
I suppose. Whatever the problem is in front of

00:38:27.489 --> 00:38:33.449
me, both are fun to coach and they both have

00:38:33.449 --> 00:38:35.610
their similarities, but it always depends on

00:38:35.610 --> 00:38:38.969
the person in front, for sure. So politically

00:38:38.969 --> 00:38:46.170
correct. I know, I know. If I were to choose,

00:38:46.489 --> 00:38:49.750
I'd say snatch, if I were to choose, because

00:38:49.750 --> 00:38:51.610
of the speed of movement and the intricacy. And

00:38:51.610 --> 00:38:55.769
I also think, Once someone understands the snatch,

00:38:56.050 --> 00:38:59.190
the clean tends to be developed a little bit

00:38:59.190 --> 00:39:01.150
easier. And that's, I'm talking about a complete

00:39:01.150 --> 00:39:06.789
novice here. Okay, I'm satisfied. We can move

00:39:06.789 --> 00:39:09.940
on to the next question. Okay, where people can

00:39:09.940 --> 00:39:12.480
find you if they want to connect or learn more

00:39:12.480 --> 00:39:17.739
about what you do? Yeah, so I'm contactable via

00:39:17.739 --> 00:39:22.219
email. So my email is s.chavda, that's spelled

00:39:22.219 --> 00:39:26.440
C-H-A-V-D-A, Victor Delta Alpha, at

00:39:26.440 --> 00:39:31.500
mdx.ac.uk. I'm also available on most social media

00:39:31.500 --> 00:39:34.760
platforms or academic platforms, so LinkedIn,

00:39:35.059 --> 00:39:37.849
ResearchGate. just stick my name in i'm sure

00:39:37.849 --> 00:39:41.510
i'll come up twitter my twitter is shy the number

00:39:41.510 --> 00:39:45.449
two tweet underscore to tweet was it shy underscore

00:39:45.449 --> 00:39:49.070
to tweet yep shy underscore to tweet and instagram

00:39:49.070 --> 00:39:53.510
is coach underscore awesome thank you so much

00:39:53.510 --> 00:39:55.750
for today you're welcome thanks for having me

00:39:55.750 --> 00:39:58.190
on a second time as soon as you publish the next

00:39:58.190 --> 00:40:00.510
one there will be a third one so be prepared

00:40:02.760 --> 00:40:06.519
I'm glad somebody's finding the research interesting.

00:40:06.860 --> 00:40:09.519
Yeah, of course. I live for research.