1
00:00:00,000 --> 00:00:14,400
I think it would be interesting to talk about some of the waste data you've put together,

2
00:00:14,400 --> 00:00:15,400
Charles.

3
00:00:15,400 --> 00:00:16,400
And so...

4
00:00:16,400 --> 00:00:17,400
Okay.

5
00:00:17,400 --> 00:00:18,400
So did you download...

6
00:00:18,400 --> 00:00:19,400
Yeah, okay.

7
00:00:19,400 --> 00:00:20,400
This is the latest one.

8
00:00:20,400 --> 00:00:31,400
I think I just updated it yesterday.

9
00:00:31,400 --> 00:00:33,080
Yes, exactly.

10
00:00:33,080 --> 00:00:38,960
So I was tinkering on this throughout the past week, and yes, I saw you just made a

11
00:00:38,960 --> 00:00:42,760
commit that looks like yesterday at some point.

12
00:00:42,760 --> 00:00:48,880
So I've been looking over it, but I may not have all of your latest work.

13
00:00:48,880 --> 00:00:57,360
So if you want to share any of that while we go through it, then that would be awesome.

14
00:00:57,360 --> 00:01:11,280
So essentially, just to give you some background here, Sela, you can do Freedom of Information

15
00:01:11,280 --> 00:01:20,640
requests here in Washington from the Liquor and Cannabis Board, and you'll

16
00:01:20,640 --> 00:01:30,640
get a data dump that looks like this, where they'll... if you're requesting traceability

17
00:01:30,640 --> 00:01:31,640
data.

18
00:01:31,640 --> 00:01:39,720
So they'll return essentially zipped folders of the different endpoints.

19
00:01:39,720 --> 00:01:45,280
So these are the various endpoints of the traceability API.

20
00:01:45,280 --> 00:01:55,320
So for example, licensees record their sales, they record how many plants they have, they

21
00:01:55,320 --> 00:02:04,280
record their lab results, their inventory, and tucked away down here in batches, we've

22
00:02:04,280 --> 00:02:12,640
found that they actually record how much waste they generate.

23
00:02:12,640 --> 00:02:28,560
And there's a company, Better Carbon Solutions, and they're... as far as I know, the only...

24
00:02:28,560 --> 00:02:36,120
well, actually, there's maybe a second one up in Bellingham, but these are

25
00:02:36,120 --> 00:02:44,240
the ones that I personally know that are working on a solution for hemp and cannabis waste.

26
00:02:44,240 --> 00:02:51,360
And they have no idea how much waste there is.

27
00:02:51,360 --> 00:02:57,960
So we're going to try to make some estimates.

28
00:02:57,960 --> 00:03:10,200
So really just to find the right...

29
00:03:10,200 --> 00:03:18,920
So we'll need this data.

30
00:03:18,920 --> 00:03:30,440
Let me actually pause for one second while I load in the data here.

31
00:03:30,440 --> 00:03:31,920
One second here.

32
00:03:31,920 --> 00:03:42,040
I forgot to get the data.

33
00:03:42,040 --> 00:03:56,680
One second.

34
00:03:56,680 --> 00:04:17,160
Okay, so...

35
00:04:17,160 --> 00:04:21,600
Downloaded some data sets here.

36
00:04:21,600 --> 00:04:31,760
The batches: although it looks like about one gigabyte of data in the zip folder,

37
00:04:31,760 --> 00:04:40,640
when you unzip it, you'll see that the data file here, the batches, is 20 gigabytes.

38
00:04:40,640 --> 00:04:43,080
So that's a big...

39
00:04:43,080 --> 00:04:56,120
So there's a large amount of data, and my laptop has 16 gigabytes of RAM, so I cannot

40
00:04:56,120 --> 00:05:02,400
read in this entire data file in one go.

41
00:05:02,400 --> 00:05:11,120
So there's a couple tools we can use to approach this.

42
00:05:11,120 --> 00:05:18,640
So Charles showed me Dask, which is...

43
00:05:18,640 --> 00:05:24,960
It implements the Pandas data frame, so it's similar to working with Pandas.

44
00:05:24,960 --> 00:05:34,360
And so you can think about it as instead of just reading in everything all at once, you're

45
00:05:34,360 --> 00:05:40,000
sort of chunking this.

46
00:05:40,000 --> 00:05:44,320
I...

47
00:05:44,320 --> 00:05:52,240
It may be that Dask hasn't implemented everything, but I actually didn't have the greatest luck

48
00:05:52,240 --> 00:05:57,040
reading in this data with Dask.

49
00:05:57,040 --> 00:06:00,840
Did you run into similar issues, Charles?

50
00:06:00,840 --> 00:06:01,840
Yes.

51
00:06:01,840 --> 00:06:11,280
I had to read it in with Pandas, save it as a parquet file, and then Dask would read it

52
00:06:11,280 --> 00:06:14,440
in.

53
00:06:14,440 --> 00:06:21,560
But I was lucky enough that my Mac has an infinite swap file.

54
00:06:21,560 --> 00:06:26,400
It will just keep expanding it until apparently it uses up the entire disk.

55
00:06:26,400 --> 00:06:28,400
Where on Windows...

56
00:06:28,400 --> 00:06:29,960
You're using Windows, right?

57
00:06:29,960 --> 00:06:30,960
Yes.

58
00:06:30,960 --> 00:06:37,320
So maybe your swap file isn't big enough or Windows has some limit.

59
00:06:37,320 --> 00:06:38,320
I don't know.

60
00:06:38,320 --> 00:06:43,760
Yeah, on my Linux machine, I had to change the size of the swap file in order to get

61
00:06:43,760 --> 00:06:46,160
it to work.

62
00:06:46,160 --> 00:06:48,480
But yeah, it does not like that file.

63
00:06:48,480 --> 00:06:53,240
It will not read it as a CSV.

64
00:06:53,240 --> 00:06:55,280
I don't know if I can...

65
00:06:55,280 --> 00:07:00,320
I'm not sure how big those parquet files are, if I can share them easily.

66
00:07:00,320 --> 00:07:12,280
So it's not the end of the world because that's why we've got Python here, is to sort of make

67
00:07:12,280 --> 00:07:18,440
the unmanageable manageable.

68
00:07:18,440 --> 00:07:22,240
So this is the code that Charles has written here.

69
00:07:22,240 --> 00:07:33,760
So I've just snagged Charles's code from here.

70
00:07:33,760 --> 00:07:37,040
Charles has actually done an awesome analysis here.

71
00:07:37,040 --> 00:07:43,200
And so, Sela, you may want to go through here, because Charles has really

72
00:07:43,200 --> 00:07:48,580
just walked through the steps you need to take as you're going through the data.

73
00:07:48,580 --> 00:07:56,640
So for example, you've read in the data and you started to look at it.

74
00:07:56,640 --> 00:07:58,720
And I love what you did here.

75
00:07:58,720 --> 00:08:05,580
One of the very first things you did was simply describe the data.

76
00:08:05,580 --> 00:08:12,260
So just to see what are we working with here.

77
00:08:12,260 --> 00:08:22,680
So we're working with, what's this, 37 million plants.

78
00:08:22,680 --> 00:08:25,440
That's a lot of plants.

79
00:08:25,440 --> 00:08:30,920
And so that's real cool data that...

80
00:08:30,920 --> 00:08:36,320
They're not really recording 37 million tomato plants out there.

81
00:08:36,320 --> 00:08:46,080
And so it's so cool that we've got such granular data on the cannabis industry.

82
00:08:46,080 --> 00:08:54,800
So enough fawning over Charles's analysis here because we can revisit this here in a

83
00:08:54,800 --> 00:08:55,800
second.

84
00:08:55,800 --> 00:09:06,240
So just to make this manageable here just for the time being, just for the presentation

85
00:09:06,240 --> 00:09:07,240
essentially.

86
00:09:07,240 --> 00:09:15,520
What you can do with pandas is essentially just read in a number of rows

87
00:09:15,520 --> 00:09:18,200
at a time.

88
00:09:18,200 --> 00:09:27,960
And so the way I was going to approach this problem, since I've got a limited bandwidth

89
00:09:27,960 --> 00:09:34,280
here, limited memory, and I'm having problems with Dask,

90
00:09:34,280 --> 00:09:38,720
I was essentially just going to chunk it.

91
00:09:38,720 --> 00:09:48,800
So just read in 10,000 observations at a time, or you can play with that number.

92
00:09:48,800 --> 00:10:05,320
And essentially the data point we're after is what Charles has calculated here as the

93
00:10:05,320 --> 00:10:11,760
amount of waste by producer by date.

94
00:10:11,760 --> 00:10:24,040
And I was thinking if we group them in 10,000, we can calculate the waste by producer by

95
00:10:24,040 --> 00:10:29,960
date and then save that.

96
00:10:29,960 --> 00:10:39,360
And then we can basically then read in all those files and then essentially combine that

97
00:10:39,360 --> 00:10:40,360
data.

98
00:10:40,360 --> 00:10:46,640
Because there won't be any data loss.

99
00:10:46,640 --> 00:10:54,000
We're just looking for the total so we can break the total up.
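That chunk-aggregate-combine idea works precisely because a sum of sums equals the overall sum. A minimal sketch, with made-up column names (`mme_id` for producer, `date`, `waste`) standing in for the real schema:

```python
import io

import pandas as pd

# Tiny in-memory stand-in for the 20 GB batches CSV.
csv = io.StringIO(
    "mme_id,date,waste\n"
    "A,2020-01-01,10\n"
    "A,2020-01-01,5\n"
    "B,2020-01-01,7\n"
    "A,2020-01-02,3\n"
)

# Read in chunks (2 rows here; 10,000+ for the real file),
# aggregate each chunk, and keep only the small per-chunk totals.
partials = []
for chunk in pd.read_csv(csv, chunksize=2):
    partials.append(chunk.groupby(["mme_id", "date"])["waste"].sum())

# Combine the partial aggregates; summing sums loses no information,
# so it doesn't matter how rows were split across chunks.
totals = pd.concat(partials).groupby(level=["mme_id", "date"]).sum()
```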

100
00:10:54,000 --> 00:10:59,240
So it doesn't really even necessarily matter how the data is sorted.

101
00:10:59,240 --> 00:11:08,640
It'll be nice if the data is kind of nicely in order, but the chunks can split things

102
00:11:08,640 --> 00:11:17,160
up, because at the end of the day we're just looking for the total.

103
00:11:17,160 --> 00:11:26,040
So the long story short, just for today's demonstration, we'll just do one with a chunk

104
00:11:26,040 --> 00:11:36,600
of 10,000 and then what I'm going to be working with, unless Charles has a more elegant solution,

105
00:11:36,600 --> 00:11:40,920
but Charles does have a more elegant solution.

106
00:11:40,920 --> 00:11:46,080
I just don't have quite enough memory for it.

107
00:11:46,080 --> 00:12:04,000
So basically I'm just going to kind of walk through this Stack Overflow post.

108
00:12:04,000 --> 00:12:06,960
So how can I read in a huge CSV file?

109
00:12:06,960 --> 00:12:13,720
Well basically we're just going to have to iterate over the data frame, read in chunk

110
00:12:13,720 --> 00:12:25,280
by chunk, calculate the waste by producer by date, save it, aggregate it, and then hopefully

111
00:12:25,280 --> 00:12:29,280
at that point we'll have just a nice series.

112
00:12:29,280 --> 00:12:38,200
In fact, we may even aggregate it further and just do total waste by date.

113
00:12:38,200 --> 00:12:43,640
That's, I think, an informative number, so we'll do that.

114
00:12:43,640 --> 00:12:47,560
We'll revisit the producer by date as well.

115
00:12:47,560 --> 00:12:54,280
But long story short, let's start here with this chunk.

116
00:12:54,280 --> 00:13:06,400
Just for expedience's sake, I've gone ahead and read it in here.

117
00:13:06,400 --> 00:13:23,400
So basically, this will just save us a tidbit of time, but what Charles

118
00:13:23,400 --> 00:13:30,320
has done here is he's defined the columns that we actually need to read in.

119
00:13:30,320 --> 00:13:37,480
This is sort of a little bit of advance work; you know, thanks to

120
00:13:37,480 --> 00:13:44,480
Charles's foresight, we can kind of leapfrog a little bit.

121
00:13:44,480 --> 00:13:48,040
So we already know like what columns are in the data.

122
00:13:48,040 --> 00:13:53,320
But if you were reading this in for the first time, you may not even know what

123
00:13:53,320 --> 00:13:54,320
columns you have.

124
00:13:54,320 --> 00:13:58,160
But long story short, we're going to read in 10,000.

125
00:13:58,160 --> 00:14:10,240
I'm skipping two million observations because for whatever reason, a lot of the observations

126
00:14:10,240 --> 00:14:13,720
at the beginning of the data had no waste.

127
00:14:13,720 --> 00:14:23,200
So just for today, I wanted to get a chunk that actually had waste.
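That selective read, only the needed columns, skipping ahead past the waste-free rows, then taking a fixed chunk, looks roughly like this; the column names and tiny CSV are illustrative, not the actual LCB schema:

```python
import io

import pandas as pd

# Stand-in CSV; the real file has far more columns and rows.
csv = io.StringIO(
    "global_id,plant_stage,waste,extra\n"
    "1,growing,0,x\n"
    "2,growing,0,x\n"
    "3,harvested,12,x\n"
    "4,harvested,8,x\n"
)

cols = ["global_id", "plant_stage", "waste"]  # only what we need
df = pd.read_csv(
    csv,
    usecols=cols,
    skiprows=range(1, 3),  # skip the first 2 data rows, keep the header
    nrows=10_000,          # cap how many rows are read into memory
)
```

For the real file, `skiprows=range(1, 2_000_000)` would jump past the two million waste-free observations mentioned above.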

128
00:14:23,200 --> 00:14:30,960
So we've got the waste data here.

129
00:14:30,960 --> 00:14:37,520
We can describe it as Charles did.

130
00:14:37,520 --> 00:14:50,320
Here we've just got 10,000 plants and we've got the amount of, you know, well let's actually

131
00:14:50,320 --> 00:15:00,720
see what columns we have here.

132
00:15:00,720 --> 00:15:06,320
For example, we've got the flower dry weight, which of course is what the producers care

133
00:15:06,320 --> 00:15:07,320
about.

134
00:15:07,320 --> 00:15:12,920
But the other half of that is the waste.

135
00:15:12,920 --> 00:15:16,720
And so that's what Better Carbon Solutions cares about.

136
00:15:16,720 --> 00:15:23,480
So they're trying to get that waste off of their hands and compost it, essentially, or use

137
00:15:23,480 --> 00:15:26,320
pyrolysis.

138
00:15:26,320 --> 00:15:36,120
So they can make products with it instead of it going into the landfill.

139
00:15:36,120 --> 00:15:41,360
And so Charles noted here that we've got some negative values.

140
00:15:41,360 --> 00:15:50,480
And so, Charles, I ran into an error when I ran this line for some reason.

141
00:15:50,480 --> 00:16:01,040
So I just approached this as just essentially taking the absolute value of the waste.

142
00:16:01,040 --> 00:16:08,880
So let me know if that achieves the same outcome.

143
00:16:08,880 --> 00:16:11,600
But basically there were negative values for waste.

144
00:16:11,600 --> 00:16:13,720
Yeah, actually that's a good solution.

145
00:16:13,720 --> 00:16:18,000
I didn't think about that.

146
00:16:18,000 --> 00:16:21,280
Well yours was perfectly fine.

147
00:16:21,280 --> 00:16:23,320
But I just ran into an error.

148
00:16:23,320 --> 00:16:24,320
We'll see here.

149
00:16:24,320 --> 00:16:27,320
I'll show you.

150
00:16:27,320 --> 00:16:33,120
Well that time it worked.

151
00:16:33,120 --> 00:16:39,880
For some reason the first time I ran into an error of some sort.

152
00:16:39,880 --> 00:16:56,360
Anyways, either way works.

153
00:16:56,360 --> 00:17:07,240
So I actually haven't really walked through this line here.

154
00:17:07,240 --> 00:17:08,240
So let's see here.

155
00:17:08,240 --> 00:17:09,240
So let's see.

156
00:17:09,240 --> 00:17:20,880
Okay, so here you're filling in the plant stage.

157
00:17:20,880 --> 00:17:22,800
Oh yeah.

158
00:17:22,800 --> 00:17:28,920
There were some places where it was NA and that will cause problems later on.

159
00:17:28,920 --> 00:17:38,200
So basically we're assuming if there's no plant stage given then it's harvested essentially.

160
00:17:38,200 --> 00:17:39,200
Right.

161
00:17:39,200 --> 00:17:45,640
Oh yeah, and I filled it in with the most common value.

162
00:17:45,640 --> 00:17:53,240
Which, you know, is a pretty common technique.
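Filling missing values with the most common value (the mode) is one line in pandas; a small sketch with placeholder stage labels:

```python
import pandas as pd

# Plant stages with some missing entries.
stage = pd.Series(["harvested", None, "growing", "harvested", None])

# Fill NAs with the most common value; mode() returns a Series,
# so [0] takes the single most frequent label.
stage = stage.fillna(stage.mode()[0])
```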

163
00:17:53,240 --> 00:17:56,840
Excellent.

164
00:17:56,840 --> 00:18:02,320
Well this is why it's nice to have you here while we walk through this.

165
00:18:02,320 --> 00:18:05,080
Find the amount of waste.

166
00:18:05,080 --> 00:18:10,120
Okay, so now we're just looking at waste for harvested products.

167
00:18:10,120 --> 00:18:11,800
Is that correct?

168
00:18:11,800 --> 00:18:12,800
Yeah.

169
00:18:12,800 --> 00:18:19,560
Because if you take all the different stages, you know, the analysis you get is kind of

170
00:18:19,560 --> 00:18:27,760
murky, so if you break it up by the stages you get a clearer picture.

171
00:18:27,760 --> 00:18:40,160
Can we describe just the waste?

172
00:18:40,160 --> 00:18:50,080
Okay so do we know what the unit of measure is by any chance?

173
00:18:50,080 --> 00:18:53,600
So at this point it should be grams.

174
00:18:53,600 --> 00:19:00,840
There's a point where I converted to kilograms but I don't think, yeah that's not here yet.

175
00:19:00,840 --> 00:19:07,800
Because some of the numbers are really huge and it just sort of made more sense to present

176
00:19:07,800 --> 00:19:12,560
it in kilograms.
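Those two steps, restricting to harvested plants and converting grams to kilograms for presentation, can be sketched like this (column names and numbers are made up):

```python
import pandas as pd

df = pd.DataFrame({
    "plant_stage": ["harvested", "growing", "harvested"],
    "waste": [1500.0, 200.0, 2500.0],  # grams
})

# Keep only harvested plants, per the discussion above.
harvested = df[df["plant_stage"] == "harvested"].copy()

# Convert grams to kilograms, since the raw numbers get huge.
harvested["waste_kg"] = harvested["waste"] / 1000

total_kg = harvested["waste_kg"].sum()  # 4.0
```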

177
00:19:12,560 --> 00:19:14,160
Here we are with your analysis.

178
00:19:14,160 --> 00:19:19,800
So we've sort of skipped over some things but like I said you've done some thorough

179
00:19:19,800 --> 00:19:29,040
analysis here.

180
00:19:29,040 --> 00:19:36,280
So here are some charts Charles has presented or prepared.

181
00:19:36,280 --> 00:19:38,480
You have the full data set here.

182
00:19:38,480 --> 00:19:47,160
So there's a lot of data that's, you know, where all the dates are 11, 1900 which of

183
00:19:47,160 --> 00:19:52,720
course doesn't make any sense, doesn't really help.

184
00:19:52,720 --> 00:20:02,520
There's some pre-2012 dates and there's some pre-2018 dates and this data is sort of supposed

185
00:20:02,520 --> 00:20:05,960
to be from 2018 forward.

186
00:20:05,960 --> 00:20:13,360
So I cleared that out and I just kind of went from 2018 forward.

187
00:20:13,360 --> 00:20:20,900
A question: did you eventually just basically abandon harvest date and decide to go

188
00:20:20,900 --> 00:20:24,840
with updated at date essentially?

189
00:20:24,840 --> 00:20:30,640
Yes because if you go up I explain why I do that.

190
00:20:30,640 --> 00:20:41,520
Yeah there's 87, almost 88 percent of the data has a harvest date before 2018.

191
00:20:41,520 --> 00:20:47,560
So it's not really, you know, it's not a good date and there's a lot of harvested dates

192
00:20:47,560 --> 00:20:50,560
that are 11, 1900.

193
00:20:50,560 --> 00:20:57,280
So basically, they don't do any data checking when they add this data

194
00:20:57,280 --> 00:21:00,280
to the database.

195
00:21:00,280 --> 00:21:10,440
And Sela, you may be able to speak to this, but from my experience with the traceability

196
00:21:10,440 --> 00:21:15,640
system, I would go exactly with the updated-at date, because this is essentially going to

197
00:21:15,640 --> 00:21:23,800
be an automatically generated timestamp from the traceability system.

198
00:21:23,800 --> 00:21:31,880
As you noted if they, you know, revisited their harvest at a later date, just say they

199
00:21:31,880 --> 00:21:39,160
made an error and then they fix their error then the updated at time would change.

200
00:21:39,160 --> 00:21:50,520
But I think that's a smaller error than whatever's going on here, right?

201
00:21:50,520 --> 00:21:59,360
Because at the end of the day we're trying to minimize our sort of our bias or, you know,

202
00:21:59,360 --> 00:22:01,800
our forecasting.

203
00:22:01,800 --> 00:22:13,480
So when you have entry errors like this that actually introduces statistical bias into

204
00:22:13,480 --> 00:22:15,240
your calculations.

205
00:22:15,240 --> 00:22:18,440
So you know the closer you are to the true value the better.

206
00:22:18,440 --> 00:22:28,240
And so yes, there may still be errors in the updated-at times, but I

207
00:22:28,240 --> 00:22:35,400
think, as you settled on, it's just going to be a much

208
00:22:35,400 --> 00:22:41,800
more reliable measure of time than the harvest date.

209
00:22:41,800 --> 00:22:45,680
Yeah.

210
00:22:45,680 --> 00:22:49,480
And that's why, you know, rule number one is look at the data.

211
00:22:49,480 --> 00:22:57,600
So fantastic work there, Charles, because all you really need to do is plot

212
00:22:57,600 --> 00:23:05,600
those and now we know how we can keep track of time.

213
00:23:05,600 --> 00:23:11,560
So I think this is a real cool chart here where this is just showing the amount of waste

214
00:23:11,560 --> 00:23:20,120
generated, essentially. Are these, like, totals?

215
00:23:20,120 --> 00:23:21,120
By day.

216
00:23:21,120 --> 00:23:25,760
Okay, so these are totals by day or by producer by day?

217
00:23:25,760 --> 00:23:26,760
By day.

218
00:23:26,760 --> 00:23:33,960
We get down to by producer later on.

219
00:23:33,960 --> 00:23:35,400
Interesting.

220
00:23:35,400 --> 00:23:42,720
So maybe some of these are when the larger producers harvest and there's a lot of waste.

221
00:23:42,720 --> 00:23:45,640
Oh, cool charts here.

222
00:23:45,640 --> 00:23:48,520
Are these new?

223
00:23:48,520 --> 00:23:53,960
Probably. I've been, you know, sort of thinking about it and updating

224
00:23:53,960 --> 00:23:59,320
it, you know, and like how can I improve it?

225
00:23:59,320 --> 00:24:00,320
This is fantastic.

226
00:24:00,320 --> 00:24:03,680
I haven't seen this chart yet.

227
00:24:03,680 --> 00:24:11,040
So I think you may have added this since I was looking at it, because

228
00:24:11,040 --> 00:24:14,000
this is exactly what needed to be plotted.

229
00:24:14,000 --> 00:24:16,160
So thank you Charles.

230
00:24:16,160 --> 00:24:19,720
Like awesome work here.

231
00:24:19,720 --> 00:24:23,720
This is essentially what everybody's been curious about.

232
00:24:23,720 --> 00:24:25,840
I don't think anybody's plotted this.

233
00:24:25,840 --> 00:24:35,920
In fact, I'll have to share an article with you, Charles.

234
00:24:35,920 --> 00:24:39,480
There's an article in The Stranger.

235
00:24:39,480 --> 00:24:46,440
Washington State has a, I forget what the exact title is, something like Washington

236
00:24:46,440 --> 00:24:50,680
State has a million ton waste problem or something like that.

237
00:24:50,680 --> 00:24:52,800
I'll share the article with you.

238
00:24:52,800 --> 00:25:02,120
And so essentially the author was calling around to like recycling places trying to

239
00:25:02,120 --> 00:25:10,560
figure out like what the actual weight of waste is.

240
00:25:10,560 --> 00:25:16,360
And he just sort of made estimates because, well, he actually wrote this

241
00:25:16,360 --> 00:25:20,200
article back in 2017.

242
00:25:20,200 --> 00:25:23,720
Especially before this data was there.

243
00:25:23,720 --> 00:25:28,520
As we found out the data is tough to wrangle.

244
00:25:28,520 --> 00:25:38,520
So although the data is there and as you showed we can calculate them I don't know if anybody

245
00:25:38,520 --> 00:25:39,520
has.

246
00:25:39,520 --> 00:25:46,800
And so this Charles you really may be the first person who's plotted harvest weight

247
00:25:46,800 --> 00:25:48,920
by month.

248
00:25:48,920 --> 00:25:55,840
So good work.

249
00:25:55,840 --> 00:26:00,200
And so this is it.

250
00:26:00,200 --> 00:26:04,080
I'll just give you my hot take my first look here.

251
00:26:04,080 --> 00:26:15,280
It's incredibly interesting to observe this large amount in 2018.

252
00:26:15,280 --> 00:26:23,360
And I want to say that that may be producers essentially entering their initial

253
00:26:23,360 --> 00:26:26,560
inventory into the traceability system.

254
00:26:26,560 --> 00:26:27,560
Okay.

255
00:26:27,560 --> 00:26:30,160
I wondered about that.

256
00:26:30,160 --> 00:26:35,080
Actually if you go down to the next chart I cut off February and March and then I also

257
00:26:35,080 --> 00:26:40,000
cut off January of 2021 because that was incomplete.

258
00:26:40,000 --> 00:26:43,280
And then I thought the other two were outliers.

259
00:26:43,280 --> 00:26:47,120
So I think this is more representative of the actual data.

260
00:26:47,120 --> 00:26:49,480
Yes.

261
00:26:49,480 --> 00:26:52,040
I think yeah I think I think you're right.

262
00:26:52,040 --> 00:26:59,440
This is and this way the scale you have a better idea of the scale.

263
00:26:59,440 --> 00:27:06,600
So just to explain: basically, the traceability system was introduced in January

264
00:27:06,600 --> 00:27:18,440
or February of 2018, and they had a contingency period where they had to essentially enter

265
00:27:18,440 --> 00:27:29,120
everything, because these were companies that were already operational with the BioTrack traceability

266
00:27:29,120 --> 00:27:30,320
system.

267
00:27:30,320 --> 00:27:35,200
So before Leaf Data Systems, there was BioTrack.

268
00:27:35,200 --> 00:27:43,360
They were already operational and were given a contingency period through roughly April

269
00:27:43,360 --> 00:27:45,120
of 2018.

270
00:27:45,120 --> 00:27:52,400
I'll actually want to double-check that, because if you look at the

271
00:27:52,400 --> 00:28:00,200
press releases they may have pushed back the contingency period but around April of 2018

272
00:28:00,200 --> 00:28:06,880
it was sort of the deadline to get all your inventory into the system.

273
00:28:06,880 --> 00:28:11,680
So that's real curious.

274
00:28:11,680 --> 00:28:19,160
So maybe people just had like a bunch of waste sitting around and they just had to get that

275
00:28:19,160 --> 00:28:21,520
entered.

276
00:28:21,520 --> 00:28:30,560
If you look there's definitely a lot of pre 2018 dates in some of this data.

277
00:28:30,560 --> 00:28:38,320
So yeah, I think they just must have entered data, maybe from the previous year.

278
00:28:38,320 --> 00:28:47,160
But anyway, just those couple of months sort of skewed things, so I just decided

279
00:28:47,160 --> 00:28:53,400
to drop them because if you drop them you sort of see kind of a trend.

280
00:28:53,400 --> 00:28:57,560
This is where you get to the art of data science.

281
00:28:57,560 --> 00:29:04,880
So one thing we're going to be getting to here is forecasting right.

282
00:29:04,880 --> 00:29:17,920
So we'll want to forecast this forward you know 12 months you know to kind of predict

283
00:29:17,920 --> 00:29:22,320
2021.

284
00:29:22,320 --> 00:29:29,000
My technique, well, not my technique, the technique of forecasting is, you know,

285
00:29:29,000 --> 00:29:41,400
ARIMA forecasting, Box-Jenkins, where you basically just take historic data and you

286
00:29:41,400 --> 00:29:47,520
use statistics to calculate sort of a moving average, and you just continue

287
00:29:47,520 --> 00:29:50,380
the moving average out into the future.
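As a dependency-light sketch of the Box-Jenkins idea, here is a hand-rolled AR(1), the simplest member of the ARIMA family, fit by least squares on made-up monthly totals and iterated 12 months forward (in practice you would likely reach for statsmodels' ARIMA instead):

```python
import numpy as np

# Hypothetical monthly waste totals (kg); the real series would come
# from the aggregated traceability data.
y = np.array([30.0, 32.0, 31.0, 34.0, 33.0, 35.0, 36.0, 34.0, 37.0, 38.0])

# Fit y[t] = c + phi * y[t-1] by ordinary least squares:
# the AR(1) special case of the Box-Jenkins family.
X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
(c, phi), *_ = np.linalg.lstsq(X, y[1:], rcond=None)

# Iterate the fitted recurrence 12 steps into the future.
forecast = []
last = y[-1]
for _ in range(12):
    last = c + phi * last
    forecast.append(last)
```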

288
00:29:50,380 --> 00:30:00,320
So it's atheoretical, and it forecasts time series surprisingly

289
00:30:00,320 --> 00:30:04,920
well.

290
00:30:04,920 --> 00:30:11,540
But one thing: if you were doing Box-Jenkins forecasting and you included all these observations

291
00:30:11,540 --> 00:30:18,040
in there, you may get wonky forecasts.

292
00:30:18,040 --> 00:30:25,080
If you include these observations, the Box-Jenkins may put this negative trend

293
00:30:25,080 --> 00:30:35,280
on waste, but if you exclude those prior years and you just look at the more recent

294
00:30:35,280 --> 00:30:44,400
data, you know, it's not really obvious there's a negative trend per se.

295
00:30:44,400 --> 00:30:54,480
So, you know, perhaps it's not as big a thing as it seemed at first glance.

296
00:30:54,480 --> 00:31:04,160
So just to, I guess, look through some more of your analysis here before

297
00:31:04,160 --> 00:31:08,960
we get to forecasting.

298
00:31:08,960 --> 00:31:16,840
So I guess let's just look at the totals here.

299
00:31:16,840 --> 00:31:24,600
This is 30,000 kilograms.

300
00:31:24,600 --> 00:31:27,720
It's a lot.

301
00:31:27,720 --> 00:31:32,360
It's hard to quantify.

302
00:31:32,360 --> 00:31:40,120
What would be an interesting analysis, actually, somebody at Better Carbon Solutions recommended

303
00:31:40,120 --> 00:31:42,120
this.

304
00:31:42,120 --> 00:31:48,960
I forget the amount of weight he said but basically.

305
00:31:48,960 --> 00:31:56,120
To help visualize this you may want to think about like barrels so I forget the amount

306
00:31:56,120 --> 00:32:03,440
of weight, but let's say you can fit,

307
00:32:03,440 --> 00:32:08,480
like, say, 50 pounds in a barrel or something like that. I think it would be

308
00:32:08,480 --> 00:32:18,960
easier to visualize if you said oh there were you know a thousand barrels of waste.
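That barrel conversion is just arithmetic; a sketch using the 50 pounds per barrel figure floated above, which is a placeholder, not a measured capacity:

```python
# Rough back-of-the-envelope: how many 50 lb barrels would
# 30,000 kg of waste fill? (50 lb/barrel is a placeholder figure.)
KG_PER_POUND = 0.453592
POUNDS_PER_BARREL = 50

total_kg = 30_000
total_pounds = total_kg / KG_PER_POUND   # ~66,139 lb
barrels = total_pounds / POUNDS_PER_BARREL  # ~1,323 barrels
```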

309
00:32:18,960 --> 00:32:26,120
In a given month because you know that way you know people could visualize sort of what

310
00:32:26,120 --> 00:32:31,960
you know a thousand barrels of waste would look like like that would be like you know

311
00:32:31,960 --> 00:32:34,400
fill up like a warehouse.

312
00:32:34,400 --> 00:32:40,480
But when you just say you know 30,000 kilograms.

313
00:32:40,480 --> 00:32:46,440
It's really hard to have any sense of it. I mean, maybe some people with a

314
00:32:46,440 --> 00:32:53,120
bit more metric mindset may have a bit more mental framework for that, but I can't really

315
00:32:53,120 --> 00:32:55,120
visualize that.

316
00:32:55,120 --> 00:33:02,000
But anywho.

317
00:33:02,000 --> 00:33:05,880
Oh, something else just occurred to me that I don't know if it would contribute

318
00:33:05,880 --> 00:33:11,320
to the outliers, but at least when I was working at Northwest Cannabis Solutions.

319
00:33:11,320 --> 00:33:16,640
The folks who finance and kind of like own everything were these Russian dudes and I

320
00:33:16,640 --> 00:33:19,840
remember that one year they lost like over half their crop because they didn't know what

321
00:33:19,840 --> 00:33:21,840
they were doing.

322
00:33:21,840 --> 00:33:27,000
And so I don't know if that would contribute at all to sort of like their waste production

323
00:33:27,000 --> 00:33:31,000
but that is something that just occurred to me is like I remember they had really serious

324
00:33:31,000 --> 00:33:35,680
issues with growing and a lot of their crop just like wouldn't make it and they would

325
00:33:35,680 --> 00:33:42,560
lose entire swaths of like plants at a time until they kind of dialed it in and got it

326
00:33:42,560 --> 00:33:44,560
figured out.

327
00:33:44,560 --> 00:33:48,120
And they're a pretty large producer as well from what I understand.

328
00:33:48,120 --> 00:33:53,800
So it's possible that kind of thing could contribute but definitely for sure 30,000

329
00:33:53,800 --> 00:34:00,240
kilograms is like a wild number even as somebody who has like worked in cannabis and seen literally

330
00:34:00,240 --> 00:34:09,080
like entire dumpster-sized containers of plant matter and stuff. It's pretty

331
00:34:09,080 --> 00:34:10,680
wild.

332
00:34:10,680 --> 00:34:18,480
Wow, that's interesting. That kind of helps make sense of some of this data, because there

333
00:34:18,480 --> 00:34:28,480
are some, like, huge losses. It's hard to picture, but that does help.

334
00:34:28,480 --> 00:34:39,960
That's a good anecdotal story from the ground floor because it helps.

335
00:34:39,960 --> 00:34:46,280
That's what I think we were talking about last week: part of data science is telling

336
00:34:46,280 --> 00:34:49,520
us the story of the data.

337
00:34:49,520 --> 00:34:50,520
Right.

338
00:34:50,520 --> 00:34:51,520
Right.

339
00:34:51,520 --> 00:35:01,720
So that's like a real good story where you kind of highlight just some of these individual

340
00:35:01,720 --> 00:35:09,280
days where oh like there were maybe like hundreds or thousands of grams of waste in a given

341
00:35:09,280 --> 00:35:16,280
day and you're like you're trying to tell the story of what went on there.

342
00:35:16,280 --> 00:35:23,760
And one thing, Charles and Sela, that would be an interesting analysis:

343
00:35:23,760 --> 00:35:32,380
you could essentially run a regression of this data on month.

344
00:35:32,380 --> 00:35:44,000
So you would basically do dummy variables for each month.

345
00:35:44,000 --> 00:35:48,880
And so then when you run the regression you basically exclude one of the variables so

346
00:35:48,880 --> 00:35:54,600
you'd exclude January.

347
00:35:54,600 --> 00:35:59,720
That way you would basically be comparing everything to January, but you can exclude

348
00:35:59,720 --> 00:36:10,720
whatever month you'd want; essentially, you see which months have the statistically

349
00:36:10,720 --> 00:36:17,360
larger amounts of waste.
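The monthly dummy regression with January as the excluded baseline can be sketched with plain numpy and synthetic data (the series here is simulated, and in practice statsmodels' OLS would also give you standard errors and t-statistics):

```python
import numpy as np
import pandas as pd

# Hypothetical daily waste totals over two years (simulated).
rng = np.random.default_rng(0)
dates = pd.date_range("2019-01-01", "2020-12-31", freq="D")
waste = pd.Series(100 + rng.normal(0, 10, len(dates)), index=dates)

# Dummy variables for each month, dropping January as the baseline,
# so each coefficient is that month's average difference from January.
months = pd.get_dummies(waste.index.month, prefix="m").drop(columns="m_1")
X = np.column_stack([np.ones(len(waste)), months.to_numpy(dtype=float)])
coef, *_ = np.linalg.lstsq(X, waste.to_numpy(), rcond=None)

# Intercept is the January mean; the rest are Feb..Dec effects.
intercept, month_effects = coef[0], coef[1:]
```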

350
00:36:17,360 --> 00:36:29,120
Because, I mean, my Bayesian prior would have been like October, November, but just naively

351
00:36:29,120 --> 00:36:40,600
looking at the data, I mean, these jump out at me here, and this is like late winter,

352
00:36:40,600 --> 00:36:53,120
early spring, January, February, March, and then again over here you see a spike, and this is

353
00:36:53,120 --> 00:37:02,640
like spring again, but later spring, like April, May.

354
00:37:02,640 --> 00:37:06,400
So I'm just curious if there's any seasonality.

355
00:37:06,400 --> 00:37:12,680
You think it's related to 420?

356
00:37:12,680 --> 00:37:21,920
Well, you see sales spike, but it would just be so strange for, like, waste

357
00:37:21,920 --> 00:37:23,520
to spike.

358
00:37:23,520 --> 00:37:31,200
Well, I mean, obviously they have to ramp up inventory, right? And it's not all outdoor, I mean,

359
00:37:31,200 --> 00:37:36,880
a lot of this is outdoor, but there's so much indoor that it's basically, you know,

360
00:37:36,880 --> 00:37:42,840
12 months a year, you know, every month somebody has some harvest,

361
00:37:42,840 --> 00:37:45,000
assuming they've timed everything out.

362
00:37:45,000 --> 00:37:50,840
So like I said, that's why the regression is interesting here,

363
00:37:50,840 --> 00:37:59,320
because when you run the regression, there may not

364
00:37:59,320 --> 00:38:04,080
even be a correlation for given months.

365
00:38:04,080 --> 00:38:07,640
It's worth revisiting.

366
00:38:07,640 --> 00:38:13,800
I couldn't really see much of a pattern. I think, like, September through December

367
00:38:13,800 --> 00:38:22,800
sort of were lower but like the peak months changed from year to year so I don't know

368
00:38:22,800 --> 00:38:26,240
why that is.

369
00:38:26,240 --> 00:38:29,240
It's worth revisiting.

370
00:38:29,240 --> 00:38:35,640
Seasonality is still something we're trying to wrap our heads around, but it's a tricky

371
00:38:35,640 --> 00:38:38,840
one.

372
00:38:38,840 --> 00:38:49,560
So you've got the top producers here. Once again, I think that was just what the

373
00:38:49,560 --> 00:38:55,480
person at Better Carbon Solutions suggested, and I think it was an

374
00:38:55,480 --> 00:39:02,480
interesting piece of advice, because it wasn't something I would have thought of. But, you know, stressing

375
00:39:02,480 --> 00:39:11,760
on the story, I think with the barrels you almost have to have, like,

376
00:39:11,760 --> 00:39:18,680
a legend that would say oh you know assuming one barrel is a hundred pounds or something

377
00:39:18,680 --> 00:39:25,440
like that. But that's how they do gasoline, right?

378
00:39:25,440 --> 00:39:44,080
So, but anyways, it looks like there's really this pack of... one, two, three...

379
00:39:44,080 --> 00:39:50,480
It's really like this pack of seven who were like out in the lead and then you know it

380
00:39:50,480 --> 00:39:59,840
looks like, well, this is almost a secondary pack here, you know, another

381
00:39:59,840 --> 00:40:08,060
pack of six. Then these may just be, you know, firms that look similar

382
00:40:08,060 --> 00:40:17,760
in size. You know, this is real interesting.

383
00:40:17,760 --> 00:40:21,760
And we'll just look at some of the rest of your analysis here while we've got a bit of time,

384
00:40:21,760 --> 00:40:31,160
but this is my first time seeing it, so pardon me for sort of my hot takes, but we'll

385
00:40:31,160 --> 00:40:37,280
see what you've done with the data you've wrangled up here.

386
00:40:37,280 --> 00:40:43,600
This is so cool Charles this is the first analysis I've ever seen.

387
00:40:43,600 --> 00:40:49,880
And even to, like, you know, a casual observer like me, this is... I can still understand

388
00:40:49,880 --> 00:40:51,440
and glean like what you're getting at.

389
00:40:51,440 --> 00:40:55,800
So I think that speaks to your ability.

390
00:40:55,800 --> 00:40:57,800
Thank you.

391
00:40:57,800 --> 00:41:15,280
Interesting so here you basically have the top five.

392
00:41:15,280 --> 00:41:22,400
So I don't understand this concept of growing waste like what does that mean?

393
00:41:22,400 --> 00:41:30,160
Oh, this is... that's basically, I mean, you may know better than I, but I think

394
00:41:30,160 --> 00:41:38,320
that's basically when they're just sweeping up like plant matter off of the floor.

395
00:41:38,320 --> 00:41:43,520
Yeah, when you grow cannabis, one of the great enemies of cannabis production

396
00:41:43,520 --> 00:41:49,560
is mold and so there's like a lot of defoliation that happens in the growing process to ensure

397
00:41:49,560 --> 00:41:54,640
the highest like exposure of bud sites to light because that's like you know where the

398
00:41:54,640 --> 00:41:56,520
money is.

399
00:41:56,520 --> 00:42:02,920
And sometimes, you know, they cull plants that just aren't making it, or... there is definitely

400
00:42:02,920 --> 00:42:08,480
like growing waste produced in the life cycle of the plant as well.

401
00:42:08,480 --> 00:42:13,580
Okay, thanks. Yeah, I mean, to me this is a bunch of data. I don't always know what these

402
00:42:13,580 --> 00:42:18,120
terms mean, and it's hard for me to wrap my head around: what am I

403
00:42:18,120 --> 00:42:23,080
actually looking at and what am I looking for?

404
00:42:23,080 --> 00:42:27,880
And like I said, I wasn't super involved in growing or anything like that, so it's possible

405
00:42:27,880 --> 00:42:33,280
that the growing waste is coming from, like, a specific industry practice or some

406
00:42:33,280 --> 00:42:37,480
kind of thing that they're doing but that's what I would assume that they're referring

407
00:42:37,480 --> 00:42:39,640
to anyway.

408
00:42:39,640 --> 00:42:52,080
It looks like the entry of it fell off, so it doesn't look like they're entering growing

409
00:42:52,080 --> 00:42:56,040
waste anymore, does it?

410
00:42:56,040 --> 00:43:01,960
Right, although it's not listed as deprecated in the manual. But it looks like it's being

411
00:43:01,960 --> 00:43:02,960
deprecated.

412
00:43:02,960 --> 00:43:18,520
Okay so it may not be required to record it anymore so we'll have to double check.

413
00:43:18,520 --> 00:43:25,400
In flower waste there were like three entries, and so again, what is that? And obviously they're

414
00:43:25,400 --> 00:43:37,640
not tracking it or those are mislabeled entries and then the seedling waste also falls off.

415
00:43:37,640 --> 00:43:45,800
It also falls off and then at one point there's like 30,000 kilograms of seedlings I mean

416
00:43:45,800 --> 00:43:48,720
how much does a seedling weigh?

417
00:43:48,720 --> 00:43:56,240
30,000 kilograms, I mean, that would be a lot of seedlings.

418
00:43:56,240 --> 00:43:59,640
Yeah for sure.

419
00:43:59,640 --> 00:44:14,280
It looks like... okay, this would make sense: in July of 2019 they updated the traceability

420
00:44:14,280 --> 00:44:26,080
system and so at that time I think is when they deprecated certain fields and it looks

421
00:44:26,080 --> 00:44:35,280
like they essentially just deprecated needing to record that waste and my guess is some

422
00:44:35,280 --> 00:44:44,280
people just kept on recording it even though they didn't have to but then they kind of

423
00:44:44,280 --> 00:44:49,960
fell off.

424
00:44:49,960 --> 00:44:55,120
Okay that's helpful that explains a lot of what I'm seeing.

425
00:44:55,120 --> 00:45:05,760
And similar thing here with growing waste: they're recording it up to July of 2019, and

426
00:45:05,760 --> 00:45:12,200
then that one trails off.

427
00:45:12,200 --> 00:45:27,560
So really what we just need to focus on here is this harvest waste.

428
00:45:27,560 --> 00:45:37,120
Out of curiosity do you know how strict they are about requiring producers to, sorry low

429
00:45:37,120 --> 00:45:41,440
key cat emergency.

430
00:45:41,440 --> 00:45:46,360
Maybe trying to catch a bird through the window. That was dumb.

431
00:45:46,360 --> 00:45:52,000
Sorry do you know how strict they are about requiring producers to report these numbers

432
00:45:52,000 --> 00:45:55,360
and like whether or not they audit them or if there's like requirements by the state

433
00:45:55,360 --> 00:45:59,560
for that?

434
00:45:59,560 --> 00:46:04,840
Off the top of my head, I don't know.

435
00:46:04,840 --> 00:46:11,160
That's an excellent question.

436
00:46:11,160 --> 00:46:14,080
Yeah I was just wondering it's not that I expect you to know that necessarily but I

437
00:46:14,080 --> 00:46:19,080
feel like, you know, with people, when homework is assigned and required and part

438
00:46:19,080 --> 00:46:21,160
of your grade it's a little different.

439
00:46:21,160 --> 00:46:26,680
So I know for sure that, like, at least when I was working for Northwest Cannabis Solutions,

440
00:46:26,680 --> 00:46:32,160
they didn't always do it by the book, you know, and so I'm wondering how reliable some

441
00:46:32,160 --> 00:46:38,280
of the numbers could be, or whether they, like, penalize any of the growers or incentivize

442
00:46:38,280 --> 00:46:43,720
their reporting for certain numbers like how that would affect the overall outcome of the

443
00:46:43,720 --> 00:46:45,760
numbers.

444
00:46:45,760 --> 00:46:54,120
I think that's an excellent point, and that sort of drives

445
00:46:54,120 --> 00:47:03,400
home a recurring theme of the meetup group. So we haven't actually prepared any

446
00:47:03,400 --> 00:47:09,800
forecasts yet, but that's one thing I always hedge my forecasts on, because as

447
00:47:09,800 --> 00:47:23,680
you noted, you know, junk data in, junk data out. It's sort of... it's true.

448
00:47:23,680 --> 00:47:30,280
You can you know statistically show that if there is measurement error then that does

449
00:47:30,280 --> 00:47:37,560
bias your results. It depends on the amount of measurement error, so, you know, a small amount

450
00:47:37,560 --> 00:47:46,600
of measurement error doesn't really lead to too much bias but as you noted I mean if you

451
00:47:46,600 --> 00:47:51,880
know if people are just eyeballing this or if you have entire licensees who may not even

452
00:47:51,880 --> 00:48:01,520
be entering it I mean these numbers could be unreliable.

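One textbook version of that point is classical measurement error in a predictor: the estimated slope gets attenuated toward zero, and the bias grows with the amount of error, which is the "it depends on the amount" caveat above. A small simulation with made-up numbers (not the waste data):

```python
import numpy as np

rng = np.random.default_rng(2)

# True relationship: y = 2 * x + noise. We then observe x with
# increasing amounts of classical measurement error.
n = 5000
x_true = rng.normal(0, 1, n)
y = 2.0 * x_true + rng.normal(0, 1, n)

slopes = []
for noise_sd in (0.0, 0.5, 1.0):
    x_obs = x_true + rng.normal(0, noise_sd, n)   # mismeasured predictor
    # OLS slope = cov(x, y) / var(x); error inflates var(x), shrinking it.
    slope = np.cov(x_obs, y)[0, 1] / np.var(x_obs, ddof=1)
    slopes.append(slope)
    print(f"measurement-error sd {noise_sd}: estimated slope {slope:.2f}")
```

A little error barely moves the estimate; a lot of error can cut the true slope roughly in half here.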
453
00:48:01,520 --> 00:48:06,600
So I think that's an excellent observation, and it should always be taken into consideration.

454
00:46:06,600 --> 00:46:12,040
So if you were doing analysis of this, or if you were presenting forecasts, I think you

455
00:48:12,040 --> 00:48:20,120
should definitely hedge this. You should say, okay, producers, you know,

456
00:48:20,120 --> 00:48:23,960
they principally care about selling their cannabis so they're probably going to record

457
00:48:23,960 --> 00:48:36,600
their you know price, labor, weight.

458
00:48:36,600 --> 00:48:53,200
So I think any good analysis should bring that up. Just to counter, I always feel

459
00:48:53,200 --> 00:48:57,480
like well I heard somewhere that you know if you're not measuring it you're not managing

460
00:48:57,480 --> 00:49:10,960
it. So, you know, sometimes any measure is better than no measure. So I think

461
00:49:10,960 --> 00:49:19,920
you should take it at face value. These numbers could be wild, but at least we have some

462
00:49:19,920 --> 00:49:26,080
numbers. That's where you get your confidence intervals. So at least

463
00:49:26,080 --> 00:49:34,560
we have a measure of how much waste may be being produced and we can kind of see where

464
00:49:34,560 --> 00:49:44,240
it fluctuates and so like if you were doing a monthly analysis perhaps the entry error

465
00:49:44,240 --> 00:49:51,800
is the same month to month, and then you could just say, oh well, we can still estimate

466
00:49:51,800 --> 00:49:58,000
the month effect but we just don't know the magnitude right or we're biased in the magnitude

467
00:49:58,000 --> 00:50:06,240
but our month effect, you know, that wouldn't be biased. But that takes a little

468
00:50:06,240 --> 00:50:14,960
hand-waving. So, you know, I think there's two sides to the coin. It's basically...

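That hand-wave can actually be checked: if the reporting error is a constant shift (every licensee under-reports by the same amount every month), only the level is biased and the month contrast survives exactly. A quick sketch with made-up numbers, reusing the April-effect setup:

```python
import numpy as np

rng = np.random.default_rng(1)

# True daily waste: 100 g baseline plus a 25 g April effect.
n = 1000
months = rng.integers(1, 13, size=n)
true_waste = 100 + 25 * (months == 4) + rng.normal(0, 10, n)

# Suppose every licensee under-reports by a constant 30 g.
reported = true_waste - 30

# Regress both series on an intercept and an April dummy.
X = np.column_stack([np.ones(n), (months == 4).astype(float)])
beta_true, *_ = np.linalg.lstsq(X, true_waste, rcond=None)
beta_rep, *_ = np.linalg.lstsq(X, reported, rcond=None)

print(f"April effect, true data:     {beta_true[1]:.1f} g")
print(f"April effect, reported data: {beta_rep[1]:.1f} g")  # identical
print(f"Baseline, reported data:     {beta_rep[0]:.1f} g")  # off by 30 g
```

If the error is proportional instead (say everyone reports 70% of true waste), the month effect gets scaled too, so the sign and ranking of months survive but not the magnitude.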
469
00:50:14,960 --> 00:50:25,800
yeah, I think you need to acknowledge that there's noise in the data, incredibly imperfect

470
00:50:25,800 --> 00:50:36,520
data and the producers don't even really have the incentive to enter it in accurately but

471
00:50:36,520 --> 00:50:46,080
you know, we need some measure of the waste. Having spent a lot of time... are you going to

472
00:50:46,080 --> 00:50:55,800
chime in, Charles? There's no way that anybody could audit anybody. I mean, it is really, really

473
00:50:55,800 --> 00:51:02,520
messy. I need to publish the amount of time it takes for, like, lab results,

474
00:51:02,520 --> 00:51:07,560
because I got to a point with it where I was like you know I'm just making stuff up because

475
00:51:07,560 --> 00:51:14,880
I'm just having to fiddle with this data so much to make something out of it but it's

476
00:51:14,880 --> 00:51:21,840
just, you know, a made-up story. Yeah, there's no way. This... the data is

477
00:51:21,840 --> 00:51:28,120
really inconsistent. There doesn't seem to be

478
00:51:28,120 --> 00:51:33,720
any requirements for, like, checking the data when it goes into the database. I

479
00:51:33,720 --> 00:51:38,120
think people enter whatever, and they don't enter it

480
00:51:38,120 --> 00:51:42,240
daily. Some people might enter it daily, some people weekly, some people

481
00:51:42,240 --> 00:51:49,000
might enter it monthly which I think maybe explains some of those you know where the

482
00:51:49,000 --> 00:51:56,040
spikes in the amount of waste change, like from April to May, because it's just

483
00:51:56,040 --> 00:52:02,760
depending on when they entered the data. Totally, I feel like, if anything, it's an argument to

484
00:52:02,760 --> 00:52:09,200
you know encourage or advocate for that sort of like I guess incentivization of like reporting

485
00:52:09,200 --> 00:52:14,520
that information. Just because, like... I totally get that. I think it's really clear even

486
00:52:14,520 --> 00:52:19,560
to me based on like the really helpful charts and graphs that you've made that like yeah

487
00:52:19,560 --> 00:52:25,520
there is you know as Keegan said like a lot of noise in the data and it would probably

488
00:52:25,520 --> 00:52:31,800
be beneficial on a lot of different levels to require and incentivize producers to report

489
00:52:31,800 --> 00:52:35,800
that more accurately and like for sure based on my personal experience like I said just

490
00:52:35,800 --> 00:52:41,000
working for that one producer, I think you can bet your bottom dollar that

491
00:52:41,000 --> 00:52:46,480
your mileage will vary. Like, I really can't imagine that they were reporting a lot of

492
00:52:46,480 --> 00:52:52,160
the stuff accurately or consistently based on kind of like how they were running things

493
00:52:52,160 --> 00:52:57,280
when I was there and how they have been running things before I was there so I think you're

494
00:52:57,280 --> 00:53:01,680
totally right and I feel like if anything it's an argument to kind of like advocate

495
00:53:01,680 --> 00:53:06,040
for that stuff to have better data in the future.

496
00:53:06,040 --> 00:53:12,120
Yeah, so Oregon is supposed to be really strict, but I've never seen their actual raw

497
00:53:12,120 --> 00:53:13,120
data.

498
00:53:13,120 --> 00:53:26,640
So those are excellent points across the board, and yeah, like you said, show us the data. Because

499
00:53:26,640 --> 00:53:34,800
you're right, for Oregon it'd be interesting, because it does seem like they

500
00:53:34,800 --> 00:53:48,440
have a good handle on their data, but I'm curious, yeah, about the amount of noise. But, you know, it

501
00:53:48,440 --> 00:53:52,680
just takes good data wranglers. And so that's why, Charles, when you're doing your analysis,

502
00:53:52,680 --> 00:54:01,600
it's so good for you to explain like you know why things are so messy and the effects of

503
00:54:01,600 --> 00:54:07,320
it. Like you said, it's going to be tough to audit anything, because, you know,

504
00:54:07,320 --> 00:54:15,680
it doesn't seem like there's much validation of the data.

505
00:54:15,680 --> 00:54:26,480
You know, for example, like a valid harvest date.

506
00:54:26,480 --> 00:54:35,280
So I think your analysis here has been fruitful and I think there's you know a lot of lessons

507
00:54:35,280 --> 00:54:44,720
learned here with data analysis. You know, you spend 90-plus percent

508
00:54:44,720 --> 00:54:52,000
of your time just working with the data and you know you really need to be upfront with

509
00:54:52,000 --> 00:55:01,880
your audience about what steps you took. You know, like Charles was saying,

510
00:55:01,880 --> 00:55:13,920
with the lab result data, you can't just squeeze

511
00:55:13,920 --> 00:55:21,760
things into a box and, like, just wave your hands about the things that

512
00:55:21,760 --> 00:55:26,920
don't make sense. You have to be upfront with your audience: okay, we've got maybe

513
00:55:26,920 --> 00:55:35,400
not the best data entry here or you know not the best incentives to enter the right data

514
00:55:35,400 --> 00:55:41,880
and the data entry may be delayed. But I think I'm just reiterating what you

515
00:55:41,880 --> 00:55:48,160
two have already said here.

516
00:55:48,160 --> 00:56:02,600
But anyways, I think you two actually captured the meat of things, so I

517
00:56:02,600 --> 00:56:08,920
think I'll go ahead and wrap it up there for the day and just you know let people get on

518
00:56:08,920 --> 00:56:16,960
and start exploring the data on their own but for next week we can work on some forecasting

519
00:56:16,960 --> 00:56:24,160
and I'm going to have a look at you know wrangling this data myself because I want to sort of

520
00:56:24,160 --> 00:56:29,000
replicate Charles's analysis because you've done an excellent job.

521
00:56:29,000 --> 00:56:30,000
Thanks.

522
00:56:30,000 --> 00:56:36,440
Yeah, thanks for sharing, Charles, this is super duper cool. I learned a lot today, y'all, thanks

523
00:56:36,440 --> 00:56:42,520
so much for walking me through what y'all do on the regular.

524
00:56:42,520 --> 00:56:51,160
Oh yeah, thanks for joining, Sela. Sometimes we get a bit more hands-on, but, like, this data

525
00:56:51,160 --> 00:56:57,080
set is giant, so we're a little limited in scope. But it's awesome that Charles

526
00:56:57,080 --> 00:57:03,600
put together these charts, because rule number one: look at the data. So that's what we did

527
00:57:03,600 --> 00:57:05,800
today.

528
00:57:05,800 --> 00:57:09,760
And they are beautiful charts but anyways in the future you know if things are more

529
00:57:09,760 --> 00:57:15,160
hands on I definitely don't mind observing you and Charles kind of in your like natural

530
00:57:15,160 --> 00:57:20,640
I guess environment working on stuff and collaborating because I would also find that helpful so

531
00:57:20,640 --> 00:57:26,440
please don't feel like you have to teach to the back of the class, so to speak.

532
00:57:26,440 --> 00:57:32,840
I'm just really excited to kind of have exposure just because in my personal like tech journey

533
00:57:32,840 --> 00:57:38,400
a lot of it is just, like, studying JavaScript or studying whatever it is, and I'm

534
00:57:38,400 --> 00:57:43,520
still trying to find a passion for something, so I think this is definitely a step in the

535
00:57:43,520 --> 00:57:47,520
right direction especially in terms of like exposure for specialization in the future

536
00:57:47,520 --> 00:57:49,680
so I really appreciate it.

537
00:57:49,680 --> 00:57:56,080
Well it's entirely open so if we want to look at charts and talk about Charles's analysis

538
00:57:56,080 --> 00:58:00,360
by all means. Are you going to chime in, Charles?

539
00:58:00,360 --> 00:58:06,840
Yeah and really your insights from working you know in the industry that was really helpful

540
00:58:06,840 --> 00:58:11,600
I mean, because when I look at this data, again, I don't

541
00:58:11,600 --> 00:58:17,960
have anybody to go to and ask, you know, like, why is this? But you gave me some

542
00:58:17,960 --> 00:58:23,880
good insight, and now I can have a better understanding of what's going on.

543
00:58:23,880 --> 00:58:28,280
Well I'm glad I could help in my own way so you know let me know if there's anything else

544
00:58:28,280 --> 00:58:33,360
I can do in the off time, in terms of, like, getting a little more familiar. I'm

545
00:58:33,360 --> 00:58:38,560
definitely like kind of rubbing my hands together and looking at Python as like my next language

546
00:58:38,560 --> 00:58:44,960
I'd like to learn, but given how JavaScript is going, that may be some time yet, so we'll

547
00:58:44,960 --> 00:58:45,960
just have to see.

548
00:58:45,960 --> 00:58:49,760
I think Python is easier.

549
00:58:49,760 --> 00:58:58,200
I don't know, but it looks super... I honestly love optimizing stuff, and I love automating

550
00:58:58,200 --> 00:59:05,480
tasks as much as possible, and so Python is looking pretty delectable these days, so

551
00:59:05,480 --> 00:59:10,240
it's really cool to see the kind of heavy lifting that y'all are accomplishing with

552
00:59:10,240 --> 00:59:15,000
it, so I definitely am starting to, like, feel those kinds of connections happening,

553
00:59:15,000 --> 00:59:19,560
and that's super exciting just because I'm not gonna lie just learning JavaScript in

554
00:59:19,560 --> 00:59:24,720
a vacuum is kind of a slog so getting to see actual practical applications for various

555
00:59:24,720 --> 00:59:32,720
languages is like super exciting and very energizing so I really really appreciate it.

556
00:59:32,720 --> 00:59:39,480
There's room in the world for both. You know, you've got a hammer, you've got

557
00:59:39,480 --> 00:59:44,000
a screwdriver, you know, and they each have their own purposes.

558
00:59:44,000 --> 00:59:51,120
Definitely, yeah. Very, very cool, y'all, thank you so much again for your time. Awesome, great

559
00:59:51,120 --> 00:59:53,080
job Charles this thing was so cool.

560
00:59:53,080 --> 00:59:55,080
Yeah thank you.

561
00:59:55,080 --> 00:59:58,280
All right y'all catch you next week.

562
00:59:58,280 --> 01:00:00,280
Have an awesome day until next week.

563
01:00:00,280 --> 01:00:01,280
Thanks you too.

564
01:00:01,280 --> 01:00:23,280
Bye.

