1
00:00:00,000 --> 00:00:07,280
My name is Keegan.

2
00:00:07,280 --> 00:00:14,500
Got into the cannabis space in 2018 at a laboratory and have now been working with cannabis data

3
00:00:14,500 --> 00:00:16,820
going on four-plus years now.

4
00:00:16,820 --> 00:00:22,200
So I'd love to hear about what you like in the industry, what you're doing and what you hope

5
00:00:22,200 --> 00:00:23,520
to get out of the group.

6
00:00:23,520 --> 00:00:26,600
So Daniel, would you mind kicking it off?

7
00:00:26,600 --> 00:00:27,600
Sure.

8
00:00:27,600 --> 00:00:35,360
I just completed a data analytics program over the past few months and was looking at

9
00:00:35,360 --> 00:00:38,840
different types of data events and this popped up at the top.

10
00:00:38,840 --> 00:00:40,480
So I figured I'd come check it out.

11
00:00:40,480 --> 00:00:41,480
Certainly.

12
00:00:41,480 --> 00:00:42,480
My name is Dean Pangelinan.

13
00:00:42,480 --> 00:00:46,000
I'm a senior business intelligence strategist for a company called Merge World.

14
00:00:46,000 --> 00:00:51,280
We're in the digital marketing space and some of our clients are actually in the cannabis

15
00:00:51,280 --> 00:00:52,280
industry.

16
00:00:52,280 --> 00:00:59,960
So I'm wanting to learn more about the scientific analysis of cannabis data because I am a statistician

17
00:00:59,960 --> 00:01:05,100
by training and work primarily in the visualization and representation of the data.

18
00:01:05,100 --> 00:01:11,040
So I want to be able to find out how I can best help our clients and candidates.

19
00:01:11,040 --> 00:01:12,960
You're definitely in the right place.

20
00:01:12,960 --> 00:01:16,560
Data visualization is a key part of what we do.

21
00:01:16,560 --> 00:01:19,880
So would love to have your expertise.

22
00:01:19,880 --> 00:01:22,400
I'm thrilled to pick your brain.

23
00:01:22,400 --> 00:01:26,120
TJ, would you be interested in introducing yourself?

24
00:01:26,120 --> 00:01:27,120
Sure.

25
00:01:27,120 --> 00:01:28,720
Hi, I'm TJ Goff.

26
00:01:28,720 --> 00:01:34,560
I'm a back end engineer working at Lantern, which is a cannabis delivery startup based

27
00:01:34,560 --> 00:01:36,960
out of Boston.

28
00:01:36,960 --> 00:01:37,960
Too cool.

29
00:01:37,960 --> 00:01:40,360
Cannabis delivery is a big, big thing.

30
00:01:40,360 --> 00:01:43,720
So you're well positioned for a profitable future.

31
00:01:43,720 --> 00:01:44,800
So awesome.

32
00:01:44,800 --> 00:01:45,800
That's the idea.

33
00:01:45,800 --> 00:01:50,560
We'll have to pick your brain on transportation because we've got a lot of transportation

34
00:01:50,560 --> 00:01:51,560
data.

35
00:01:51,560 --> 00:01:55,760
Ramni, would you be interested in introducing yourself?

36
00:01:55,760 --> 00:01:56,760
Yes.

37
00:01:56,760 --> 00:01:59,840
Hi, thank you for organizing this meetup.

38
00:01:59,840 --> 00:02:05,280
So I'm taking this program in data analytics and I'm learning all the machine learning

39
00:02:05,280 --> 00:02:10,120
and statistical modeling and data visualization and all the different techniques.

40
00:02:10,120 --> 00:02:14,400
So when I saw this, I found the data is interesting, like cannabis data.

41
00:02:14,400 --> 00:02:15,400
Wow.

42
00:02:15,400 --> 00:02:16,880
I've never seen that data.

43
00:02:16,880 --> 00:02:20,960
So hence I'm joining and let's see what I learned from here.

44
00:02:20,960 --> 00:02:21,960
Thank you.

45
00:02:21,960 --> 00:02:22,960
Awesome.

46
00:02:22,960 --> 00:02:25,200
You're also in the right place.

47
00:02:25,200 --> 00:02:26,200
Thanks for joining.

48
00:02:26,200 --> 00:02:30,320
Trav, would you be interested in introducing yourself?

49
00:02:30,320 --> 00:02:31,640
Yes, indeed.

50
00:02:31,640 --> 00:02:32,640
I'm Trav.

51
00:02:32,640 --> 00:02:36,880
I'm just interested in the industry and I want to learn more about data analytics.

52
00:02:36,880 --> 00:02:37,880
Too cool.

53
00:02:37,880 --> 00:02:39,960
It's going to be a big field.

54
00:02:39,960 --> 00:02:44,240
So once again, there's a high demand for people with data analytics skills.

55
00:02:44,240 --> 00:02:47,640
So you're well positioned too.

56
00:02:47,640 --> 00:02:53,320
Well, without further ado, I'll go ahead and share with you some of the work I've been

57
00:02:53,320 --> 00:03:01,340
doing because really it's just to guide a discussion.

58
00:03:01,340 --> 00:03:07,600
So don't let me steal the show, but hopefully it'll spark a good conversation here.

59
00:03:07,600 --> 00:03:09,840
Well, welcome to the meetup.

60
00:03:09,840 --> 00:03:12,800
Last week we looked at laboratories.

61
00:03:12,800 --> 00:03:18,040
And then this week I figured let's get a bit more to the production side.

62
00:03:18,040 --> 00:03:24,400
We'll get to cultivation next week, but we can start with processing.

63
00:03:24,400 --> 00:03:29,600
So I'm sure all of you are acquainted with processed cannabis.

64
00:03:29,600 --> 00:03:38,300
Up in the top left corner, we've got a product known as shatter, testing at 74% THC.

65
00:03:38,300 --> 00:03:45,000
And so this is a new and an old process.

66
00:03:45,000 --> 00:03:50,080
So I was going to dive in a bit to the history before we get to the data because it's so

67
00:03:50,080 --> 00:03:56,240
fascinating to know where all these processes came from.

68
00:03:56,240 --> 00:04:05,200
So I said, you know, humans are basically just remarkably good at using plants as factories

69
00:04:05,200 --> 00:04:08,320
to produce complex chemicals.

70
00:04:08,320 --> 00:04:14,240
And people have been doing this with cannabis hemp for a long time now.

71
00:04:14,240 --> 00:04:22,240
So people think that cannabis was being processed in India as early as 1000 BC.

72
00:04:22,240 --> 00:04:30,200
So we're going on potentially 3,000 years of people processing and using cannabis.

73
00:04:30,200 --> 00:04:35,880
And they may not have had the high THC that we have today, so that may be why it was important

74
00:04:35,880 --> 00:04:39,340
to produce concentrates.

75
00:04:39,340 --> 00:04:43,080
So just fascinating history here.

76
00:04:43,080 --> 00:04:46,960
And we'll dive into some of the more modern history.

77
00:04:46,960 --> 00:04:55,600
So what we all know now as cannabis brownies, and sorry it got cut off here, we've got

78
00:04:55,600 --> 00:05:01,320
Alice B. Toklas, and I'll send out a reformatted presentation.

79
00:05:01,320 --> 00:05:08,200
But she was the first one to really write a cookbook with cannabis in chocolate.

80
00:05:08,200 --> 00:05:12,480
So you know, you can consider that processing.

81
00:05:12,480 --> 00:05:22,120
And the recipe was by Brion Gysin, who was a restaurateur.

82
00:05:22,120 --> 00:05:31,240
So interesting history there.

83
00:05:31,240 --> 00:05:34,600
So they were cooking with cannabis, concentrating it.

84
00:05:34,600 --> 00:05:41,520
And then beginning in 1940 and really accelerating into the 1960s is when people really started

85
00:05:41,520 --> 00:05:44,440
isolating the various cannabinoids.

86
00:05:44,440 --> 00:05:51,800
So you know, when you look at the grand scheme of things, we've only been, you know, concentrating

87
00:05:51,800 --> 00:06:01,680
these products and really, you know, analyzing them on a compound by compound basis for 60

88
00:06:01,680 --> 00:06:08,880
or so years, maybe 80, but about 60, you know, rigorously, scientifically.

89
00:06:08,880 --> 00:06:14,000
So that's pretty exciting that, you know, cannabis has been around, you know, 3000 years

90
00:06:14,000 --> 00:06:15,420
or so.

91
00:06:15,420 --> 00:06:21,240
And now we get to really get scientific at a real molecular level.

92
00:06:21,240 --> 00:06:30,120
So it's real exciting work, you know, still to be done.

93
00:06:30,120 --> 00:06:35,460
I'll kind of skim over this, but you know, these days just wanted to talk about how cannabis

94
00:06:35,460 --> 00:06:37,500
extraction is done.

95
00:06:37,500 --> 00:06:40,820
A lot of times it's done with organic solvents.

96
00:06:40,820 --> 00:06:47,320
You can do solventless, which is anecdotally becoming more popular.

97
00:06:47,320 --> 00:06:53,200
So some of the solvents that you may use are hydrocarbons.

98
00:06:53,200 --> 00:06:59,560
You also can use butane and carbon dioxide.

99
00:06:59,560 --> 00:07:06,720
But you know, you've got to be careful, right, because not only are these chemicals flammable,

100
00:07:06,720 --> 00:07:10,600
but you don't necessarily want to be ingesting them either.

101
00:07:10,600 --> 00:07:17,760
So there's, you know, a purging process where, you know, you try to get rid of all these

102
00:07:17,760 --> 00:07:18,760
solvents.

103
00:07:18,760 --> 00:07:24,540
And then, you know, we see the data at the laboratory where not only are you testing

104
00:07:24,540 --> 00:07:30,840
for CBD, THC, you also want to make sure you don't have these residual solvents left in

105
00:07:30,840 --> 00:07:34,240
your product.

106
00:07:34,240 --> 00:07:40,980
And in particular, what we'll be talking about today is processing variability.

107
00:07:40,980 --> 00:07:51,240
So we'll be looking at the various compounds that producers may be trying to isolate.

108
00:07:51,240 --> 00:07:56,260
And we'll look at essentially how good they are at doing that.

109
00:07:56,260 --> 00:08:05,280
So that was our original question is, you know, how to run a profitable processor.

110
00:08:05,280 --> 00:08:11,380
And one can argue that focusing on variability could be key.

111
00:08:11,380 --> 00:08:17,100
So for financial reasons, you're always trying to minimize risk.

112
00:08:17,100 --> 00:08:26,480
And in finance, you often just use risk as a synonym for, you know, variability, you

113
00:08:26,480 --> 00:08:29,440
know, your standard deviations in returns.
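
A minimal sketch of that idea of risk as variability, using made-up THC percentages for two hypothetical processors. Both have the same average potency, but the standard deviation and coefficient of variation show which one is more consistent.

```python
import statistics

# Hypothetical THC percentages for two processors (made-up numbers).
processor_a = [74.0, 73.5, 74.8, 73.9, 74.3]
processor_b = [74.0, 65.2, 81.7, 70.1, 79.5]


def summarize(values):
    """Return the mean, sample standard deviation, and coefficient of variation."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)
    return mean, std, std / mean


for name, values in [("A", processor_a), ("B", processor_b)]:
    mean, std, cv = summarize(values)
    print(f"Processor {name}: mean={mean:.1f}%, std={std:.2f}, cv={cv:.3f}")
```

The coefficient of variation divides by the mean, so it can be compared across products with different average potency.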

114
00:08:29,440 --> 00:08:36,160
So from a financial point of view, which drives a lot of the action here in the cannabis industry.

115
00:08:36,160 --> 00:08:39,780
So also for economic reasons, people are risk averse.

116
00:08:39,780 --> 00:08:48,280
So people are just predisposed to processes that are predictable.

117
00:08:48,280 --> 00:08:54,280
From an engineering point of view, you want to minimize your variance because it's going

118
00:08:54,280 --> 00:08:56,320
to be easier to design your system.

119
00:08:56,320 --> 00:09:05,320
It's going to be much harder to design your process if you've got variation in your extraction

120
00:09:05,320 --> 00:09:07,280
techniques.

121
00:09:07,280 --> 00:09:12,560
And then, of course, from a business marketing and public health perspective, you know, you

122
00:09:12,560 --> 00:09:16,080
want to maximize your consistency of your product.

123
00:09:16,080 --> 00:09:22,960
So marketers are anecdotally talking about how they want a consistent product for their

124
00:09:22,960 --> 00:09:24,640
customers.

125
00:09:24,640 --> 00:09:31,840
And then from a public health point of view, if people are using these concentrates for

126
00:09:31,840 --> 00:09:41,040
medicinal purposes or what have you, you know, you want to make sure that people are, you

127
00:09:41,040 --> 00:09:44,400
know, consuming what they think they're going to be consuming.

128
00:09:44,400 --> 00:09:49,600
So that's where, you know, consistency, you know, matters from, you know, a public health

129
00:09:49,600 --> 00:09:52,480
point of view.

130
00:09:52,480 --> 00:10:00,680
And for the newcomers, I'll let you introduce yourselves here in just a minute.

131
00:10:00,680 --> 00:10:08,280
I'll talk about the data and the code here, and then we can reintroduce ourselves for

132
00:10:08,280 --> 00:10:15,960
the newcomers and then just have open discussions.

133
00:10:15,960 --> 00:10:21,240
And please interrupt me if I'm, you know, moving a little fast here.

134
00:10:21,240 --> 00:10:23,720
Awesome.

135
00:10:23,720 --> 00:10:28,560
Well, on to the data.

136
00:10:28,560 --> 00:10:35,800
So always working with open source data here.

137
00:10:35,800 --> 00:10:40,160
And I'll share these links with you afterwards.

138
00:10:40,160 --> 00:10:47,720
But Washington State has a very liberal Freedom of Information Act.

139
00:10:47,720 --> 00:10:57,080
And so entities like the Cannabis Observer will do periodic Freedom of Information Act

140
00:10:57,080 --> 00:11:02,760
requests to get access to this publicly available data.

141
00:11:02,760 --> 00:11:09,880
And so here is a release from November of 2021.

142
00:11:09,880 --> 00:11:12,160
So still quite recent data.

143
00:11:12,160 --> 00:11:19,520
And this data spans from 2018 till November of 2021.

144
00:11:19,520 --> 00:11:21,640
Caveat.

145
00:11:21,640 --> 00:11:25,200
This data can be notoriously hard to work with.

146
00:11:25,200 --> 00:11:33,360
So for example, we're looking at the lab results and, you know, unzipped, you know, the lab

147
00:11:33,360 --> 00:11:36,820
results can be quite sizable.

148
00:11:36,820 --> 00:11:53,120
So I've downloaded these data sets here.

149
00:11:53,120 --> 00:11:59,720
We have clocking in at 31 gigabytes.

150
00:11:59,720 --> 00:12:04,920
We have the inventories data weighing in at 14 gigabytes.

151
00:12:04,920 --> 00:12:08,700
We've got the inventory types data.

152
00:12:08,700 --> 00:12:17,140
And then we have the various lab results data, which is about two and a half gigabytes total.

153
00:12:17,140 --> 00:12:26,100
So we have almost 50 gigabytes of data here just pertaining to lab results.

154
00:12:26,100 --> 00:12:33,920
So that can be a daunting task to get your mind around.

155
00:12:33,920 --> 00:12:44,000
But we've written a useful script here, which I'll actually go ahead and share with the

156
00:12:44,000 --> 00:12:50,440
group here.

157
00:12:50,440 --> 00:12:52,360
Probably should have committed this earlier.

158
00:12:52,360 --> 00:13:01,080
But here's the latest code for today's meetup, which you can find on the cannabis data science

159
00:13:01,080 --> 00:13:03,440
GitHub repository.

160
00:13:03,440 --> 00:13:13,820
So for those of you who are new, you can find all the source code here in the cannabis

161
00:13:13,820 --> 00:13:15,980
data science repository.

162
00:13:15,980 --> 00:13:20,760
So that way you can follow along from the code for today.

163
00:13:20,760 --> 00:13:25,640
And so today the code was quite long.

164
00:13:25,640 --> 00:13:29,940
So I'm not going to necessarily go through it line by line.

165
00:13:29,940 --> 00:13:35,760
But if you need Python snippets for how to parse data, then this is a useful script.

166
00:13:35,760 --> 00:13:41,880
But basically what I do is I read in all of these lab results.

167
00:13:41,880 --> 00:13:56,360
And just to give you an idea of some of the data points we're working with here.

168
00:13:56,360 --> 00:14:06,240
For each lab result, we have a handful of cannabinoids, solvents, and then of course,

169
00:14:06,240 --> 00:14:15,840
some of your other screenings, such as for foreign matter, microbes, moisture, and mycotoxins,

170
00:14:15,840 --> 00:14:19,040
as well as some other IDs.

171
00:14:19,040 --> 00:14:24,440
We can then match this with the licensees data.

172
00:14:24,440 --> 00:14:29,320
So we can get fields such as the license number.

173
00:14:29,320 --> 00:14:34,360
We can even get things like the city and address.

174
00:14:34,360 --> 00:14:39,840
So zip code of where the licensee may be.

175
00:14:39,840 --> 00:14:47,160
So that way you can even do geographic analysis.

176
00:14:47,160 --> 00:14:51,320
We also have rich inventories fields.

177
00:14:51,320 --> 00:14:55,360
So here we can get just a couple things of interest.

178
00:14:55,360 --> 00:14:59,560
So in future work, I want to look at quantities.

179
00:14:59,560 --> 00:15:02,600
So the quantities being produced.

180
00:15:02,600 --> 00:15:06,400
Of course, we're looking at inventory types.

181
00:15:06,400 --> 00:15:11,440
And this is where we get a lot of our IDs.

182
00:15:11,440 --> 00:15:18,920
Then we have to match this with the inventory type to find out what type of product it is.

183
00:15:18,920 --> 00:15:28,720
And then finally, we can get the name of the strain that people gave their product.

184
00:15:28,720 --> 00:15:33,040
So this script essentially does that.

185
00:15:33,040 --> 00:15:44,600
So reads in your lab data, combines it with inventory data, combines it with inventory

186
00:15:44,600 --> 00:15:51,480
type data, combines the lab results with strain data.

187
00:15:51,480 --> 00:15:57,240
Then you combine the lab results with licensee data.

188
00:15:57,240 --> 00:16:03,360
And then you get all of the lab result fields that we're interested in here.
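
A rough sketch of that chain of joins in pandas, not the actual script from the repository. The tiny tables and every column name below are hypothetical stand-ins for the real files.

```python
import pandas as pd

# Tiny stand-in tables; all column names here are hypothetical.
lab_results = pd.DataFrame({
    "lab_result_id": ["LR1", "LR2"],
    "inventory_id": ["INV1", "INV2"],
    "delta_9_thc": [74.0, 68.5],
})
inventories = pd.DataFrame({
    "inventory_id": ["INV1", "INV2"],
    "inventory_type_id": ["T1", "T2"],
    "strain_id": ["S1", "S2"],
    "licensee_id": ["LIC1", "LIC1"],
})
inventory_types = pd.DataFrame({
    "inventory_type_id": ["T1", "T2"],
    "intermediate_type": ["hydrocarbon_concentrate", "co2_concentrate"],
})
strains = pd.DataFrame({
    "strain_id": ["S1", "S2"],
    "strain_name": ["Strain One", "Strain Two"],
})
licensees = pd.DataFrame({"licensee_id": ["LIC1"], "city": ["Seattle"]})

# Left joins keep every lab result even when a lookup table has no match.
data = (
    lab_results
    .merge(inventories, on="inventory_id", how="left")
    .merge(inventory_types, on="inventory_type_id", how="left")
    .merge(strains, on="strain_id", how="left")
    .merge(licensees, on="licensee_id", how="left")
)
print(data[["lab_result_id", "intermediate_type", "strain_name", "city"]])
```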

189
00:16:03,360 --> 00:16:06,960
So here are all of our fields we've collected to this point.

190
00:16:06,960 --> 00:16:09,800
A lot of identification fields.

191
00:16:09,800 --> 00:16:17,080
And then we add on all the cannabinoids and solvents and everything else that we would

192
00:16:17,080 --> 00:16:19,520
like.

193
00:16:19,520 --> 00:16:22,420
And I'll let you play around with this script.

194
00:16:22,420 --> 00:16:26,280
But the exciting thing is the data.

195
00:16:26,280 --> 00:16:33,240
So here we started with almost 50 gigabytes of data.

196
00:16:33,240 --> 00:16:48,120
On my first pass, I just shaved it down to the bare minimum with 30 megabytes of IDs

197
00:16:48,120 --> 00:16:52,680
and two gigabytes of names.

198
00:16:52,680 --> 00:16:56,180
And then you can start to clean that out.

199
00:16:56,180 --> 00:17:02,880
So if you add the IDs with the licensee data, actually first you add it with the strain

200
00:17:02,880 --> 00:17:04,160
name data.

201
00:17:04,160 --> 00:17:07,280
And then you're around 40 megabytes.

202
00:17:07,280 --> 00:17:12,440
You add that with the licensees data and you're around 90 megabytes.

203
00:17:12,440 --> 00:17:16,200
And then you add back the lab results data.

204
00:17:16,200 --> 00:17:26,160
And you're sitting less than 200 megabytes, which is something to be happy with when you're

205
00:17:26,160 --> 00:17:31,360
starting out with 50 gigabytes of data.

206
00:17:31,360 --> 00:17:40,020
So that just shows you how much cleaning can reduce the footprint of your data.

207
00:17:40,020 --> 00:17:48,100
So here we just stripped out all the extraneous fields and we removed all the null values.
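
One way to do that kind of trimming on a file too large for memory is to read it in chunks, keeping only the needed columns and dropping rows with missing values. This is a sketch under assumptions: the in-memory text stands in for a large tab-delimited file, and the field names are hypothetical.

```python
import io

import pandas as pd

# Stand-in for a large tab-delimited export; the fields are hypothetical.
raw = io.StringIO(
    "global_id\tnotes\tdelta_9_thc\n"
    "LR1\tfoo\t74.0\n"
    "LR2\tbar\t\n"
    "LR3\tbaz\t68.5\n"
)

keep = ["global_id", "delta_9_thc"]
chunks = []
# Read a few rows at a time so the whole file never sits in memory.
for chunk in pd.read_csv(raw, sep="\t", usecols=keep, chunksize=2):
    # Drop rows where the value of interest is missing.
    chunks.append(chunk.dropna(subset=["delta_9_thc"]))

slim = pd.concat(chunks, ignore_index=True)
print(slim)
```

With a real file, the chunk size would be much larger, for example a million rows at a time.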

208
00:17:48,100 --> 00:17:58,600
And what we're left with is a large data set, but smaller in size than some apps that are

209
00:17:58,600 --> 00:18:01,260
out there.

210
00:18:01,260 --> 00:18:07,320
So it's still large, but it's something that we can work with here.

211
00:18:07,320 --> 00:18:17,160
So just to give you a bit of a tutorial of the data, you've got your IDs.

212
00:18:17,160 --> 00:18:23,540
You know what lab tested the cannabis.

213
00:18:23,540 --> 00:18:32,800
You've got the name that people attach to the inventory.

214
00:18:32,800 --> 00:18:36,680
As well as we now know what strain it is.

215
00:18:36,680 --> 00:18:45,060
And so you could do an incredible analysis of strains.

216
00:18:45,060 --> 00:18:54,280
So people have been really curious about what's in a strain, what's in Pangea, what's in

217
00:18:54,280 --> 00:18:55,280
Trainwreck.

218
00:18:55,280 --> 00:19:02,660
So you can do an analysis here and try to find out what strains are people growing and

219
00:19:02,660 --> 00:19:05,640
which ones may look the most profitable.

220
00:19:05,640 --> 00:19:09,580
So that's a whole nother can of worms.

221
00:19:09,580 --> 00:19:13,360
But you can add a geographic layer.

222
00:19:13,360 --> 00:19:18,320
So maybe people are growing certain strains in certain parts of the state.

223
00:19:18,320 --> 00:19:24,600
So certain strains may be better in certain climates.

224
00:19:24,600 --> 00:19:27,880
You know what type of product it is.

225
00:19:27,880 --> 00:19:33,480
And so today we'll be specifically looking at concentrates.

226
00:19:33,480 --> 00:19:41,220
And then of course the gold, you have all of the cannabinoid data.

227
00:19:41,220 --> 00:19:44,820
So it would be awesome to have terpene data too.

228
00:19:44,820 --> 00:19:47,160
But we'll take what we're given.

229
00:19:47,160 --> 00:19:53,880
And in this case, we are given an awesome data set of cannabinoids.

230
00:19:53,880 --> 00:20:04,520
And you can also utilize data on residual solvents, microbes, mycotoxins.

231
00:20:04,520 --> 00:20:11,920
You know, just so that you can maybe see what other factors play into this, right?

232
00:20:11,920 --> 00:20:24,000
Because you also want to be growing strains that, you know, handle stressors in your

233
00:20:24,000 --> 00:20:25,000
environment.

234
00:20:25,000 --> 00:20:30,320
So perhaps certain strains may be more likely to fail for micro or myco.

235
00:20:30,320 --> 00:20:31,320
I don't know.

236
00:20:31,320 --> 00:20:33,440
I'm just kind of conjecturing.

237
00:20:33,440 --> 00:20:40,080
But there could be many interesting aspects in the data.

238
00:20:40,080 --> 00:20:42,560
So there you have it.

239
00:20:42,560 --> 00:20:55,040
We've taken a 50 gigabyte data set and flattened it down into 200 megabytes.

240
00:20:55,040 --> 00:21:01,600
So without further ado, I'll go ahead and jump into a bit of exploratory analysis.

241
00:21:01,600 --> 00:21:05,520
Are there any questions so far?

242
00:21:05,520 --> 00:21:08,960
Just checking real quick.

243
00:21:08,960 --> 00:21:10,520
I'm still with you all.

244
00:21:10,520 --> 00:21:11,520
All right.

245
00:21:11,520 --> 00:21:14,320
Well, must have you all entertained.

246
00:21:14,320 --> 00:21:19,160
So we'll get into some exploratory data analysis here, right?

247
00:21:19,160 --> 00:21:22,560
So we've got an awesome data set here.

248
00:21:22,560 --> 00:21:34,440
When all's said and done, you know, we're looking at between 100,000 to, you know, 420,000 or

249
00:21:34,440 --> 00:21:37,200
so, 418,000.

250
00:21:37,200 --> 00:21:43,400
So observations, a lot of these may have missing values.

251
00:21:43,400 --> 00:21:46,960
And different people, so here's a good example.

252
00:21:46,960 --> 00:21:50,140
Different people have different naming conventions.

253
00:21:50,140 --> 00:21:58,800
So for example, this cultivator is really good about putting their strain name in, but

254
00:21:58,800 --> 00:22:03,960
they don't have any informative inventory name.

255
00:22:03,960 --> 00:22:17,200
Whereas other producers may have informative inventory names such as this producer.

256
00:22:17,200 --> 00:22:20,480
However, they've got less informative.

257
00:22:20,480 --> 00:22:24,760
Well, it's a give or take.

258
00:22:24,760 --> 00:22:35,520
So long story short, you need to look at both of these inventory names and strain names.

259
00:22:35,520 --> 00:22:37,920
But that's a whole other can of worms.

260
00:22:37,920 --> 00:22:40,640
We're going to get into cultivation next week.

261
00:22:40,640 --> 00:22:45,320
And this is definitely going to be something we look at.

262
00:22:45,320 --> 00:22:51,240
Because you know, the strain you grow, that's a critical choice.

263
00:22:51,240 --> 00:22:54,480
So we've got awesome lab result data there.

264
00:22:54,480 --> 00:22:59,600
And I'll make this available for the meetup group since we're all about open source, open

265
00:22:59,600 --> 00:23:02,600
data here.

266
00:23:02,600 --> 00:23:04,720
Cool.

267
00:23:04,720 --> 00:23:10,260
Well, let's get off to the races.

268
00:23:10,260 --> 00:23:13,920
So we don't need many tools for today.

269
00:23:13,920 --> 00:23:16,840
Probably just some of the basic Python functions.

270
00:23:16,840 --> 00:23:22,360
You know, well, just to recap, I was just saying, we're just going to start with some of these

271
00:23:22,360 --> 00:23:26,600
core packages, Matplotlib, pandas.

272
00:23:26,600 --> 00:23:28,840
You can go a long way with them.

273
00:23:28,840 --> 00:23:36,360
And I was just going to say, you can think of Seaborn as an extension of Matplotlib.

274
00:23:36,360 --> 00:23:40,680
So for me, it's easy to conceptualize it that way.

275
00:23:40,680 --> 00:23:44,360
That way, you just use Seaborn where Matplotlib falls short.

276
00:23:44,360 --> 00:23:47,160
But you know, that's sort of the nitty gritty.

277
00:23:47,160 --> 00:23:50,480
I use it for things like colors.

278
00:23:50,480 --> 00:24:00,160
And then quick tip is, if you just want nice looking plots, just throw everything in FiveThirtyEight

279
00:24:00,160 --> 00:24:06,040
style and things look nice right out of the box.
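
As a small sketch of that tip, Matplotlib ships a built-in style sheet named "fivethirtyeight". The histogram values here are made up for illustration.

```python
import matplotlib

matplotlib.use("Agg")  # Non-interactive backend so the sketch runs without a display.
import matplotlib.pyplot as plt

# Apply the built-in style sheet before creating any figures.
plt.style.use("fivethirtyeight")

fig, ax = plt.subplots()
ax.hist([74.0, 68.5, 81.2, 70.3, 77.9], bins=5)  # Made-up THC percentages.
ax.set_xlabel("THC (%)")
ax.set_ylabel("Number of samples")
```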

280
00:24:06,040 --> 00:24:08,920
So I already have the data read in here.

281
00:24:08,920 --> 00:24:19,240
And here is a snippet that you'll be able to use once I share this data with you, where

282
00:24:19,240 --> 00:24:22,680
you can pick and choose which variables you want.

283
00:24:22,680 --> 00:24:30,280
So for example, if you don't need these variables, you can just comment them out and then read

284
00:24:30,280 --> 00:24:33,800
in the data that you do need.

285
00:24:33,800 --> 00:24:47,200
For example, the data I've read in, if you look at an observation here, we've got a big

286
00:24:47,200 --> 00:24:50,560
rich JSON data set here.

287
00:24:50,560 --> 00:25:01,080
So we have all the variables that I have already showed you in a nice JSON format.

288
00:25:01,080 --> 00:25:04,000
So awesome.

289
00:25:04,000 --> 00:25:08,120
Awesome, awesome, awesome.

290
00:25:08,120 --> 00:25:13,000
Well let's get to some of the interesting bits.

291
00:25:13,000 --> 00:25:23,240
So I realize now that we're also going to need to include, to say we're looking at processed

292
00:25:23,240 --> 00:25:25,040
products.

293
00:25:25,040 --> 00:25:35,040
You know, we'll also need to include products where they may have used solventless extractions.

294
00:25:35,040 --> 00:25:50,320
So let's see if we can't identify some of these lab results where they use solventless.

295
00:25:50,320 --> 00:25:58,520
So the way we can do this is first we can remember, let's look at some of these fields

296
00:25:58,520 --> 00:26:01,000
that we have here.

297
00:26:01,000 --> 00:26:06,160
So we've got inventory name, not quite what we need.

298
00:26:06,160 --> 00:26:08,160
We need the inventory type.

299
00:26:08,160 --> 00:26:12,440
I think we may even want sample.

300
00:26:12,440 --> 00:26:17,160
Here we are, intermediate type.

301
00:26:17,160 --> 00:26:27,360
Let's just list out here all of the different sample types that we have.
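
Listing the sample types could look something like this in pandas. The column name and the labels are assumptions for illustration, not the actual values in the data.

```python
import pandas as pd

# Hypothetical labels; the real data set has its own type names.
data = pd.DataFrame({"intermediate_type": [
    "flower", "hydrocarbon_concentrate", "co2_concentrate",
    "hydrocarbon_concentrate", None,
]})

# dropna=False also counts the rows where the type is missing.
sample_types = data["intermediate_type"].value_counts(dropna=False)
print(sample_types)
```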

302
00:26:27,360 --> 00:26:32,640
So here they are.

303
00:26:32,640 --> 00:26:35,120
Today not going to be looking at flower.

304
00:26:35,120 --> 00:26:39,460
So next week we'll look exclusively at flower.

305
00:26:39,460 --> 00:26:44,600
For today, just kind of wanted to do a deep dive into concentrates.

306
00:26:44,600 --> 00:26:47,720
That way we're just comparing apples to apples.

307
00:26:47,720 --> 00:26:53,840
So let's see if we can't just say, okay, let's get everything where they either did a solvent

308
00:26:53,840 --> 00:27:10,560
test, or we've got these intermediate types that don't require

309
00:27:10,560 --> 00:27:15,960
solvent screening.

310
00:27:15,960 --> 00:27:23,880
So I believe if you're using a food grade solvent, then you don't need to get residual

311
00:27:23,880 --> 00:27:25,440
solvent screening.

312
00:27:25,440 --> 00:27:29,920
It should be included in the or statement nonetheless.

313
00:27:29,920 --> 00:27:36,920
And then of course you don't need residual solvent screening if you're using a solventless

314
00:27:36,920 --> 00:27:39,840
extraction method.
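
A sketch of that "or" filter in pandas: keep a row if it either has a solvent test or belongs to a type that doesn't require solvent screening. The column names and type labels are hypothetical.

```python
import pandas as pd

# Hypothetical columns and labels, for illustration only.
data = pd.DataFrame({
    "intermediate_type": [
        "flower",
        "hydrocarbon_concentrate",
        "food_grade_solvent_concentrate",
        "non-solvent_based_concentrate",
    ],
    "solvent_status": [None, "completed", None, None],
})

# Types assumed here not to require residual solvent screening.
no_screening_types = [
    "food_grade_solvent_concentrate",
    "non-solvent_based_concentrate",
]

had_solvent_test = data["solvent_status"].notna()
exempt_type = data["intermediate_type"].isin(no_screening_types)

# Element-wise "or" on boolean Series uses `|`, not the Python keyword `or`.
processed = data.loc[had_solvent_test | exempt_type]
print(processed)
```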

315
00:27:39,840 --> 00:27:43,120
So I may have missed some.

316
00:27:43,120 --> 00:27:48,600
So this may exclude products such as edibles.

317
00:27:48,600 --> 00:27:58,320
We may want to repeat this analysis specifically for edibles to look at the average concentration

318
00:27:58,320 --> 00:27:59,720
in edibles.

319
00:27:59,720 --> 00:28:07,160
However, just going to say, I don't believe there's as much variability in edibles due

320
00:28:07,160 --> 00:28:09,840
to the regulations on edibles.

321
00:28:09,840 --> 00:28:17,920
I do believe there's a limit on the milligrams of THC and CBD that can be in an individual

322
00:28:17,920 --> 00:28:18,920
edible.

323
00:28:18,920 --> 00:28:28,040
So you see, for example, if the limit is just picking a number here, 10 milligrams per gram,

324
00:28:28,040 --> 00:28:33,680
then you're just going to see a lot of people trying to produce exactly 10.

325
00:28:33,680 --> 00:28:36,560
So it's worth a deep dive.

326
00:28:36,560 --> 00:28:43,280
And today just going to try to work here on processed products.

327
00:28:43,280 --> 00:28:48,840
So we basically are just going to start doing a bunch of conditional averages.

328
00:28:48,840 --> 00:28:53,220
I say you can go so far with that.

329
00:28:53,220 --> 00:28:54,620
It's extraordinary.

330
00:28:54,620 --> 00:29:01,840
It's almost like you just kind of keep layering on conditions and the data just becomes more

331
00:29:01,840 --> 00:29:04,280
and more interesting.

332
00:29:04,280 --> 00:29:12,480
So let's just once again, let's just look at our intermediate types here just to ensure

333
00:29:12,480 --> 00:29:16,600
that we didn't get anything odd in the mix.

334
00:29:16,600 --> 00:29:17,840
And we did.

335
00:29:17,840 --> 00:29:25,680
We've got some flowers and some products we couldn't identify here.

336
00:29:25,680 --> 00:29:30,200
So let's just see if we can't just exclude those real quick.

337
00:29:30,200 --> 00:29:49,280
So let's just say we want all the products where we don't have, let's just say everything

338
00:29:49,280 --> 00:29:56,840
where the intermediate type is there.

339
00:29:56,840 --> 00:30:00,600
We don't want to look at flowers for today.

340
00:30:00,600 --> 00:30:03,520
That's probably just a miscoding there.

341
00:30:03,520 --> 00:30:09,840
So let's see if we can refine our products a little better.

342
00:30:09,840 --> 00:30:14,400
OK, so we can't negate that.

343
00:30:14,400 --> 00:30:16,480
So we'll negate there.

344
00:30:16,480 --> 00:30:19,480
That's better.

345
00:30:19,480 --> 00:30:22,920
Awesome.

346
00:30:22,920 --> 00:30:32,560
So let's just double check that we just have.

347
00:30:32,560 --> 00:30:41,920
Well, somehow it looks like you may just have to proceed as is.

348
00:30:41,920 --> 00:30:45,760
That's for expediency's sake here.

349
00:30:45,760 --> 00:30:48,480
OK, so not getting those.

350
00:30:48,480 --> 00:31:08,800
Aha, just need to fix this to an and condition.

351
00:31:08,800 --> 00:31:17,040
We may not be able to get rid of the NaNs for now, but that's OK.

352
00:31:17,040 --> 00:31:23,080
We're able to at least exclude the flowers.
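
A sketch of why the exclusion needs an "and", and one way the NaNs could be dropped as well. The labels are hypothetical, and this is not necessarily how the notebook ended up doing it.

```python
import pandas as pd

# Hypothetical labels, including one row with a missing type.
data = pd.DataFrame({
    "intermediate_type": ["flower", "flower_lots", "co2_concentrate", None],
})
types = data["intermediate_type"]

# "Not flower AND not flower lots". Joining the two tests with `|` would keep
# every row, since no value can equal both labels at once.
not_flower = (types != "flower") & (types != "flower_lots")
print(data.loc[not_flower])  # Still includes the row with a missing type.

# Missing values compare as "not equal", so drop them with an explicit check.
known_concentrates = data.loc[not_flower & types.notna()]
print(known_concentrates)
```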

353
00:31:23,080 --> 00:31:24,080
Question?

354
00:31:24,080 --> 00:31:29,560
OK, OK, Dean, apologize though.

355
00:31:29,560 --> 00:31:36,000
It is a bit, but good to have you nonetheless.

356
00:31:36,000 --> 00:31:37,560
And sorry for all that.

357
00:31:37,560 --> 00:31:39,520
It may be a little scattered today.

358
00:31:39,520 --> 00:31:46,000
I spent most of my time just writing this big script to compile all the data.

359
00:31:46,000 --> 00:31:49,980
So next week, I'll try to have things a bit more organized.

360
00:31:49,980 --> 00:31:53,760
So that's why it's sort of just been from the top today.

361
00:31:53,760 --> 00:31:59,640
So my apologies, but we'll hone things in through the future.

362
00:31:59,640 --> 00:32:04,200
But without further ado, let's get to some interesting numbers and some figures here.

363
00:32:04,200 --> 00:32:09,500
We should have gotten to the figures before Dean had to go.

364
00:32:09,500 --> 00:32:16,080
So we'll go ahead and get the total cannabinoids here.
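
One way to total the cannabinoids while also flagging analytes that never made it into the data set is sketched below. The analyte field names and the values are hypothetical, and the total is a plain sum of whatever was reported, which is an assumption about how the total is defined.

```python
# Hypothetical analyte names; the real data set may label them differently.
EXPECTED = ["thca", "delta_9_thc", "cbda", "cbd", "cbg", "cbn", "cbga", "thcv"]

# One made-up lab result with some analytes absent.
sample = {"thca": 70.1, "delta_9_thc": 8.4, "cbda": 0.3, "cbd": 0.2, "cbg": 1.0}


def total_cannabinoids(result):
    """Sum the reported analytes and list the expected ones that are missing."""
    present = [name for name in EXPECTED if result.get(name) is not None]
    missing = [name for name in EXPECTED if name not in present]
    total = round(sum(result[name] for name in present), 2)
    return total, missing


total, missing = total_cannabinoids(sample)
print(total, missing)
```

Reporting the missing analytes next to the total makes gaps like the ones found here visible before any averages are computed.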

365
00:32:16,080 --> 00:32:18,520
Interesting.

366
00:32:18,520 --> 00:32:42,640
It looks like we may have missed some of these cannabinoids.

367
00:32:42,640 --> 00:32:45,880
Interesting.

368
00:32:45,880 --> 00:32:49,860
So it looks like we may have missed some of these cannabinoids.

369
00:32:49,860 --> 00:32:57,960
So I think I may have missed CBN.

370
00:32:57,960 --> 00:33:00,480
You've missed some of these.

371
00:33:00,480 --> 00:33:01,480
Interesting.

372
00:33:01,480 --> 00:33:10,920
Oh, well, we'll proceed with the data that we do have.

373
00:33:10,920 --> 00:33:17,000
OK, also don't have CBGA and THCV.

374
00:33:17,000 --> 00:33:18,000
So my apologies.

375
00:33:18,000 --> 00:33:23,800
I'll have to add those cannabinoids back to the data set.

376
00:33:23,800 --> 00:33:26,800
Awesome.

377
00:33:26,800 --> 00:33:41,560
So we now are down to a small data set here, only have about 800 observations.

378
00:33:41,560 --> 00:33:49,600
So I want to take another look at why that may be.

379
00:33:49,600 --> 00:33:55,160
So like I said, this is sort of the first go at analyzing this data.

380
00:33:55,160 --> 00:33:59,480
The different types of concentrates you see in the market.

381
00:33:59,480 --> 00:34:03,360
So this is where things actually get interesting.

382
00:34:03,360 --> 00:34:12,680
So we can see, wow, a bulk of the market is done with ethanol concentrates, followed by

383
00:34:12,680 --> 00:34:23,560
hydrocarbon, followed by CO2, with food grade solvent and non-solvent concentrates making up just above

384
00:34:23,560 --> 00:34:27,200
10% of the whole market.
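
A breakdown like this comes down to counting samples per concentrate type and dividing by the total. The sketch below uses a tiny made-up list of labels, not the Washington figures, and the label names are assumptions.

```python
from collections import Counter

# Toy labels for illustration only; these are not the real market counts.
types = (
    ["ethanol"] * 5
    + ["hydrocarbon"] * 3
    + ["co2"] * 1
    + ["food_grade_solvent"] * 1
)

counts = Counter(types)
total = sum(counts.values())

# Share of each type as a percentage of all samples, largest first.
shares = {kind: 100 * n / total for kind, n in counts.most_common()}
for kind, pct in shares.items():
    print(f"{kind}: {pct:.1f}%")
```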

385
00:34:27,200 --> 00:34:31,360
So any thoughts or comments about this breakdown?

386
00:34:31,360 --> 00:34:32,360
Question?

387
00:34:32,360 --> 00:34:38,360
Yeah, for the solvent versus non-solvent based, would the non-solvent be a lot more expensive?

388
00:34:38,360 --> 00:34:40,160
I'm not too familiar with the processes.

389
00:34:40,160 --> 00:34:41,720
Is that why it would be lower?

390
00:34:41,720 --> 00:34:43,880
At least one of the reasons.

391
00:34:43,880 --> 00:34:46,840
Well, you raise a real interesting point.

392
00:34:46,840 --> 00:34:51,440
And so this is something that depends.

393
00:34:51,440 --> 00:35:00,200
So at the retail level, do consumers have a demand for non-solvent extracts?

394
00:35:00,200 --> 00:35:02,880
So that's an awesome question.

395
00:35:02,880 --> 00:35:11,600
So we should combine this with retail data and see, do products that are made with

396
00:35:11,600 --> 00:35:19,120
solventless methods, do they retail at a higher price than other concentrates?

397
00:35:19,120 --> 00:35:24,320
So is there a premium placed on non-solvent extracts?

398
00:35:24,320 --> 00:35:27,440
And it also may depend on the concentration.

399
00:35:27,440 --> 00:35:35,120
So I've been wanting to do an analysis on, does concentration of cannabinoids affect

400
00:35:35,120 --> 00:35:38,160
the retail price?

401
00:35:38,160 --> 00:35:41,200
I don't know if anybody's looked at that yet.

402
00:35:41,200 --> 00:35:46,080
And I think it would be so easy to do with this data that we have here in Washington.

403
00:35:46,080 --> 00:35:54,160
All we would do is run a regression of price, which we have.

404
00:35:54,160 --> 00:35:55,160
We have retail sales.

405
00:35:55,160 --> 00:36:03,400
So we'd regress price on the concentration of that product and see if there's a positive or negative

406
00:36:03,400 --> 00:36:04,400
correlation.
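
The regression of price on concentration described here can be sketched as a one-variable least-squares fit. The numbers below are invented to show the mechanics and say nothing about the real market; the function name is hypothetical.

```python
def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up data: total cannabinoids (%) and retail price (dollars per gram).
concentration = [50.0, 60.0, 70.0, 80.0, 90.0]
price = [20.0, 25.0, 30.0, 35.0, 40.0]

intercept, slope = ols_fit(concentration, price)
print(f"price = {intercept:.2f} + {slope:.2f} * concentration")
```

A positive slope would suggest a price premium for higher concentration, a negative slope the opposite.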

407
00:36:04,400 --> 00:36:11,680
All we're going to do is add a time component to our analysis.

408
00:36:11,680 --> 00:36:20,200
And then we're essentially going to track the breakdown of analyses used over time.

409
00:36:20,200 --> 00:36:29,620
So is solventless becoming a greater or a smaller proportion of the market?

410
00:36:29,620 --> 00:36:32,320
We don't know, but we can easily answer that.

411
00:36:32,320 --> 00:36:35,960
Did that answer your question?

412
00:36:35,960 --> 00:36:39,120
Yeah, for the most part.

413
00:36:39,120 --> 00:36:43,200
I'm wondering about price and also if it's a lot more difficult to do, which I'm assuming

414
00:36:43,200 --> 00:36:44,760
it would be.

415
00:36:44,760 --> 00:36:45,760
Exactly.

416
00:36:45,760 --> 00:36:54,200
If you're doing a high-pressure non-solvent extraction, that may be challenging.

417
00:36:54,200 --> 00:36:55,760
There are people out there who are doing it.

418
00:36:55,760 --> 00:36:59,000
And then yes, the price point is critical.

419
00:36:59,000 --> 00:37:01,240
So let's look at it.

420
00:37:01,240 --> 00:37:06,360
Let's essentially combine this data with the price data.

421
00:37:06,360 --> 00:37:10,440
And like I said, it would just be interesting to see if there's any correlation between

422
00:37:10,440 --> 00:37:14,280
price and any of the other factors.

423
00:37:14,280 --> 00:37:20,960
Because it doesn't necessarily matter.

424
00:37:20,960 --> 00:37:25,000
It all depends on consumers' preferences.

425
00:37:25,000 --> 00:37:31,480
Too bad one of our Classic members isn't here today because she could talk about her preferences

426
00:37:31,480 --> 00:37:38,360
because consumers are hard to predict.

427
00:37:38,360 --> 00:37:41,320
So it's really hard to predict what they'd like.

428
00:37:41,320 --> 00:37:44,360
But we can look at the data.

429
00:37:44,360 --> 00:37:47,040
Awesome.

430
00:37:47,040 --> 00:37:52,520
Just to round it out here, we can start looking at some of the concentrations.

431
00:37:52,520 --> 00:37:57,960
So this is just sort of pseudo code.

432
00:37:57,960 --> 00:38:04,960
So still need to make sure that everything works correctly.

433
00:38:04,960 --> 00:38:15,720
But here we have the average concentration and the standard deviation found in various

434
00:38:15,720 --> 00:38:16,960
concentrates.
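
The average and standard deviation by concentrate type can be computed by grouping the results and summarizing each group. This is a small sketch on made-up values; the type labels are assumptions and the sample standard deviation is used.

```python
from collections import defaultdict
from statistics import mean, stdev

# Made-up (type, total cannabinoids %) pairs for illustration.
rows = [
    ("ethanol", 80.0),
    ("ethanol", 82.0),
    ("hydrocarbon", 70.0),
    ("hydrocarbon", 90.0),
]

groups = defaultdict(list)
for kind, value in rows:
    groups[kind].append(value)

# Map each type to (average, sample standard deviation).
stats = {kind: (mean(values), stdev(values)) for kind, values in groups.items()}
for kind, (avg, sd) in stats.items():
    print(f"{kind}: mean={avg:.1f}, sd={sd:.2f}")
```

In this toy data the two types have nearly the same average but very different spread, which is the kind of contrast the comparison relies on.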

435
00:38:16,960 --> 00:38:24,280
So let's just look at the average concentration first.

436
00:38:24,280 --> 00:38:31,780
So here you have, topping the charts, this actually kind of surprises me.

437
00:38:31,780 --> 00:38:41,920
Topping the charts, you have ethanol coming in at an average of 81% total cannabinoids

438
00:38:41,920 --> 00:38:51,400
with hydrocarbon concentrates just behind at an average of 80%.

439
00:38:51,400 --> 00:39:03,160
And then quite a lower average, you've got CO2 concentrates and food grade solvents around

440
00:39:03,160 --> 00:39:08,200
60-61% for CO2.

441
00:39:08,200 --> 00:39:10,320
I find that interesting.

442
00:39:10,320 --> 00:39:19,360
And then here you see the non-solvent category actually has the lowest average cannabinoids.

443
00:39:19,360 --> 00:39:25,120
And I would conjecture that that's because I wouldn't be surprised if people who are

444
00:39:25,120 --> 00:39:34,840
doing non-solvent extracts may be doing things like bubble hash, which I don't think

445
00:39:34,840 --> 00:39:44,120
necessarily will have the extraction.

446
00:39:44,120 --> 00:39:51,200
I'm not sure what the word is for it, efficiency perhaps, that some of the other extraction

447
00:39:51,200 --> 00:39:52,200
may-

448
00:39:52,200 --> 00:39:53,200
No, that makes sense.

449
00:39:53,200 --> 00:39:55,960
Do you think more product would have to be used because it's less efficient and that

450
00:39:55,960 --> 00:40:00,400
might be part of the higher price, I guess?

451
00:40:00,400 --> 00:40:07,000
Well, what I have as my prior is that it does.

452
00:40:07,000 --> 00:40:16,680
So my prior is that consumers have a preference for higher cannabinoid concentrations.

453
00:40:16,680 --> 00:40:19,280
So you'd get a price premium, right?

454
00:40:19,280 --> 00:40:27,560
They're basically paying per milligram of THC.

455
00:40:27,560 --> 00:40:37,960
So you'd have to sell something that's 53% at a lower price than something that's 80%

456
00:40:37,960 --> 00:40:47,240
THC for the consumer to purchase the same milligrams of THC.

457
00:40:47,240 --> 00:40:54,920
So if you abstracted that, oh, the consumer's in the market for milligrams of THC, then

458
00:40:54,920 --> 00:41:01,600
you would expect that something with 53% would have a lower price point.
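
The per-milligram argument can be worked through with a little arithmetic. The $40 price is a made-up figure; the 80% and 53% concentrations are the ones quoted here, and a one-gram package is assumed.

```python
MG_PER_GRAM = 1000


def price_per_mg(price_per_gram, concentration):
    """Price paid per milligram of THC for a product of the given concentration."""
    return price_per_gram / (MG_PER_GRAM * concentration)


def parity_price(reference_price, reference_conc, other_conc):
    """Per-gram price at which another product costs the same per milligram of THC."""
    return price_per_mg(reference_price, reference_conc) * MG_PER_GRAM * other_conc


# Hypothetical: an 80% concentrate retails at $40 per gram.
reference = price_per_mg(40.0, 0.80)
matching = parity_price(40.0, 0.80, 0.53)
print(f"${reference:.3f} per mg; the 53% product matches at ${matching:.2f} per gram")
```

Under these assumptions, a consumer buying purely on milligrams of THC would not pay more than the parity price for the lower-concentration product.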

459
00:41:01,600 --> 00:41:09,080
If it's got a lower price point, there may not be as many people in the market for it.

460
00:41:09,080 --> 00:41:22,960
But I think this is critical to look at the strain name on this slide because I think

461
00:41:22,960 --> 00:41:28,880
the strain and inventory name could be really informative because you may want to try to

462
00:41:28,880 --> 00:41:35,320
separate out if there are different types of these non-solvent extracts.

463
00:41:35,320 --> 00:41:42,160
So for example, some may specifically say they're bubble hash and others may not.

464
00:41:42,160 --> 00:41:48,520
And it would be interesting to see if there's any groupings that you can make within these

465
00:41:48,520 --> 00:41:53,600
product categories.
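
Subgrouping by strain or inventory name could start with simple keyword matching, as sketched below. The keywords, labels, and product names are hypothetical; the real names would need to be inspected before settling on a list.

```python
# Hypothetical keyword-to-label map; adjust after looking at the real names.
KEYWORDS = {
    "bubble hash": "bubble_hash",
    "rosin": "rosin",
    "kief": "kief",
}


def classify(inventory_name):
    """Return a subgroup label for the first keyword found, else 'unspecified'."""
    lowered = inventory_name.lower()
    for keyword, label in KEYWORDS.items():
        if keyword in lowered:
            return label
    return "unspecified"


names = ["Blue Dream Bubble Hash 1g", "GG4 Rosin", "House Blend Concentrate"]
labels = [classify(name) for name in names]
print(labels)
```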

466
00:41:53,600 --> 00:42:00,880
And then basically, just to finish out what we began with, essentially I was saying, OK,

467
00:42:00,880 --> 00:42:05,880
you've got your different extraction mechanisms and they're going to have their different

468
00:42:05,880 --> 00:42:08,400
efficacy rates.

469
00:42:08,400 --> 00:42:16,880
If you have two extraction mechanisms that are about the same, so that would be hydrocarbon

470
00:42:16,880 --> 00:42:24,080
concentrates and ethanol concentrates, and as well CO2 concentrates and food grade solvent

471
00:42:24,080 --> 00:42:33,000
concentrates, then from these reasons that we stated in the presentation, from financial

472
00:42:33,000 --> 00:42:39,200
reasons, economic reasons, engineering, business marketing, public health, you would gravitate

473
00:42:39,200 --> 00:42:48,200
towards the one that has a lower variance, lower standard deviation.

474
00:42:48,200 --> 00:42:52,040
So here, let's see if we can't pick the ones that would win.

475
00:42:52,040 --> 00:43:04,160
So if we're comparing hydrocarbons to ethanol extraction, ethanol just takes the cake.

476
00:43:04,160 --> 00:43:10,600
So hydrocarbon concentrates, and like I said, we want to look at this like a processor by

477
00:43:10,600 --> 00:43:19,760
processor basis and even see if we can't disentangle these into more subgroupings.

478
00:43:19,760 --> 00:43:27,920
But right off the bat, our first impression is, oh, hydrocarbon concentrates have a standard

479
00:43:27,920 --> 00:43:41,640
deviation of almost 10%, whereas ethanol concentrates by far have the lowest variability, the

480
00:43:41,640 --> 00:43:47,980
lowest standard deviation, with their standard deviation being less than 4%.

481
00:43:47,980 --> 00:44:01,480
So that means you would expect around 95% of your products would be within 8% of 81%.

482
00:44:01,480 --> 00:44:14,880
So you would expect your products, they could be varying from around 73 to around 90% concentrate.
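
The range quoted here is the usual two-standard-deviation rule of thumb. The sketch below reproduces it with a mean of 81% and a standard deviation of 4%, taking 4% as the upper bound of "less than 4%" and assuming the concentrations are roughly normally distributed.

```python
from statistics import NormalDist

mean_pct = 81.0  # average total cannabinoids quoted for ethanol concentrates
sd_pct = 4.0     # upper bound on the quoted standard deviation

# Two standard deviations either side of the mean.
low = mean_pct - 2 * sd_pct
high = mean_pct + 2 * sd_pct

# Share of a normal distribution that falls inside that range.
dist = NormalDist(mean_pct, sd_pct)
coverage = dist.cdf(high) - dist.cdf(low)
print(f"{low:.0f}% to {high:.0f}% covers about {coverage:.1%} of products")
```

If the real distribution is skewed or capped near 100%, the coverage will differ from the normal approximation.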

483
00:44:14,880 --> 00:44:20,360
And this is why you could also look at it by a processor by processor basis, which is

484
00:44:20,360 --> 00:44:27,680
exactly what we'll be doing on Saturday, because we can do this exact same analysis processor

485
00:44:27,680 --> 00:44:31,400
by processor and over time.

486
00:44:31,400 --> 00:44:39,320
And what that would show would be, okay, are the processors who are testing the most samples,

487
00:44:39,320 --> 00:44:49,560
do they have a high concentration and do they have a low standard deviation in their concentration?

488
00:44:49,560 --> 00:45:00,280
And I would naively predict that there's a positive correlation between average concentration

489
00:45:00,280 --> 00:45:07,880
and the total number of samples you test, as well as a negative correlation between

490
00:45:07,880 --> 00:45:12,560
your standard deviation and the number of samples you test.

491
00:45:12,560 --> 00:45:20,000
And so I would just basically say the people who are able to operate with the lower variability,

492
00:45:20,000 --> 00:45:24,480
I conjecture they're going to be more profitable.

493
00:45:24,480 --> 00:45:31,320
We'll let the data show on Saturday.

494
00:45:31,320 --> 00:45:37,880
And also, you know, we can look at this over time to see, okay, are people honing in their

495
00:45:37,880 --> 00:45:41,880
processes over time?

496
00:45:41,880 --> 00:45:47,120
Are people's standard deviations, is this decreasing over time?

497
00:45:47,120 --> 00:45:49,120
Is it increasing?

498
00:45:49,120 --> 00:45:51,200
How about concentration?

499
00:45:51,200 --> 00:45:57,360
Is concentration increasing, decreasing, staying constant?

500
00:45:57,360 --> 00:46:07,000
For example, if concentration starts to become pretty stable, everybody sort of, they've

501
00:46:07,000 --> 00:46:08,480
got their extraction techniques.

502
00:46:08,480 --> 00:46:10,440
That may not be the case.

503
00:46:10,440 --> 00:46:13,600
People may be getting better and better extraction techniques over time.

505
00:46:16,560 --> 00:46:23,040
I was just going to say rounding out for today, if you were going to compare food grade solvents

506
00:46:23,040 --> 00:46:29,760
with CO2 concentrates, you would look at the standard deviation since their average concentration

507
00:46:29,760 --> 00:46:31,840
is about the same.

508
00:46:31,840 --> 00:46:38,640
And you would see people who are producing food grade solvent concentrates on average

509
00:46:38,640 --> 00:46:46,160
have a standard deviation of almost 18%, which is when you think about it, quite extraordinary.

510
00:46:46,160 --> 00:46:54,040
I would suggest, you know, it wouldn't be implausible to have your food grade concentrate,

511
00:46:54,040 --> 00:47:04,520
you know, almost 100% as well as almost 10% or 15%.

512
00:47:04,520 --> 00:47:07,120
So would that actually shake out?

513
00:47:07,120 --> 00:47:08,120
May or may not.

514
00:47:08,120 --> 00:47:15,440
But long story short, that's quite a high variation.

515
00:47:15,440 --> 00:47:22,680
Same with non-solvent: the people who are doing non-solvent extraction have a big variation.

516
00:47:22,680 --> 00:47:31,560
So if, you know, if I were to suggest an extraction method, you know, well, I may not suggest

517
00:47:31,560 --> 00:47:32,560
this one.

518
00:47:32,560 --> 00:47:38,280
So, you know, so if you were going to do food grade solvent versus CO2 concentrate, you

519
00:47:38,280 --> 00:47:44,600
know, it does look like, you know, CO2 concentrates have about the same, if not higher average

520
00:47:44,600 --> 00:47:48,520
concentration.

521
00:47:48,520 --> 00:47:53,160
And they also have a lower standard deviation.

522
00:47:53,160 --> 00:47:56,800
Keep in mind, it matters what you measure.

523
00:47:56,800 --> 00:48:04,120
So remember here, we still have a whole lot of other factors to look at.

524
00:48:04,120 --> 00:48:08,520
So in particular, the solvents.

525
00:48:08,520 --> 00:48:16,680
So I would want to know, okay, for CO2, it's got around the same concentration, it's

526
00:48:16,680 --> 00:48:19,280
got lower standard deviation.

527
00:48:19,280 --> 00:48:21,520
What's its failure rate?

528
00:48:21,520 --> 00:48:28,200
Because you know, if we're failing for solvents, left and right, that's not going to necessarily

529
00:48:28,200 --> 00:48:30,020
be profitable.

530
00:48:30,020 --> 00:48:38,200
So this is why, you know, running a profitable business, it just matters on so many metrics,

531
00:48:38,200 --> 00:48:42,480
on so many fronts, right, you've got to worry about your concentration, you've got to worry

532
00:48:42,480 --> 00:48:48,480
about your standard deviation, you've got to worry about your failure rate.

533
00:48:48,480 --> 00:48:52,760
You know, and for example, what are consumers' preferences?

534
00:48:52,760 --> 00:48:59,560
So that's why it'd be interesting to see what the shakedown is in the market over time.

535
00:48:59,560 --> 00:49:07,640
So are consumers gravitating towards one type of product or another?

536
00:49:07,640 --> 00:49:18,200
You know, there are just so many factors here that play out in running a profitable producer,

537
00:49:18,200 --> 00:49:20,840
processor.

538
00:49:20,840 --> 00:49:26,680
And I think the lesson of the day is, you know, variability matters.

539
00:49:26,680 --> 00:49:34,320
So you know, just if you're looking at variability, you can start to make educated decisions of,

540
00:49:34,320 --> 00:49:39,920
okay, do I choose hydrocarbon concentrate, do I choose ethanol?

541
00:49:39,920 --> 00:49:42,880
And like I said, this isn't the end of the story.

542
00:49:42,880 --> 00:49:46,760
You'll want to see if you can't make subgroups.

543
00:49:46,760 --> 00:49:57,560
So see, okay, can we use the strain name and inventory name to even further classify these

544
00:49:57,560 --> 00:49:59,240
samples?

545
00:49:59,240 --> 00:50:06,080
And then also, of course, you'll want to, you know, take a look at some of the variables

546
00:50:06,080 --> 00:50:11,920
that you're screening for just to make sure that, you know, everything's passing at the

547
00:50:11,920 --> 00:50:15,880
rate that you want it to.

548
00:50:15,880 --> 00:50:22,200
And for that, I'm going to go ahead and conclude the presentation just so I'm respectful of

549
00:50:22,200 --> 00:50:23,680
everybody's time.

550
00:50:23,680 --> 00:50:26,040
However, were there any questions?

551
00:50:26,040 --> 00:50:31,760
I know it was a bit of a rocky start, but hopefully we got into gear there towards the

552
00:50:31,760 --> 00:50:32,760
end.

553
00:50:32,760 --> 00:50:34,760
So any questions?

554
00:50:34,760 --> 00:50:36,040
Comment?

555
00:50:36,040 --> 00:50:42,000
I guess just for the forecasting you're talking about on Saturday, you guys usually just run

556
00:50:42,000 --> 00:50:45,880
like simple regressions, pretty much.

557
00:50:45,880 --> 00:50:53,600
We heavily lean on Box-Jenkins, or ARIMA, forecasting, if you may have heard of it.

558
00:50:53,600 --> 00:50:59,520
So it's essentially a time series technique.

559
00:50:59,520 --> 00:51:03,120
You just use an observation that you can track over time.

560
00:51:03,120 --> 00:51:08,040
In this case, we'll just be tracking concentrations over time month by month.

561
00:51:08,040 --> 00:51:10,040
You could potentially do day by day.

562
00:51:10,040 --> 00:51:15,900
I like month by month or even week by week, week by week is quite powerful.

563
00:51:15,900 --> 00:51:23,720
And then you just use a series of observations and you forecast them into the future.

564
00:51:23,720 --> 00:51:29,320
So you basically use things like, oh, does this series have a trend?

565
00:51:29,320 --> 00:51:33,280
Is there a moving average component?

566
00:51:33,280 --> 00:51:36,840
And then you basically, it is a regression model.

567
00:51:36,840 --> 00:51:40,600
It is a regression model at the end of the day.
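
To make the "it is a regression model" point concrete, here is a deliberately simplified stand-in: an AR(1) model, which is ARIMA with one autoregressive term and no differencing or moving-average part, fit by ordinary least squares. The monthly values are made up, and a full Box-Jenkins workflow would normally rely on a statistics library instead.

```python
def fit_ar1(series):
    """Fit y[t] = c + phi * y[t-1] by least squares and return (c, phi)."""
    x = series[:-1]  # lagged values
    y = series[1:]   # current values
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi


def forecast(series, steps):
    """Roll the fitted AR(1) equation forward from the last observation."""
    c, phi = fit_ar1(series)
    last = series[-1]
    predictions = []
    for _ in range(steps):
        last = c + phi * last
        predictions.append(last)
    return predictions


# Made-up monthly average concentrations (%), for illustration only.
monthly_avg = [60.0, 64.0, 67.2, 69.76, 71.808]
c, phi = fit_ar1(monthly_avg)
next_three = forecast(monthly_avg, 3)
print(round(c, 2), round(phi, 2), [round(v, 2) for v in next_three])
```

The same idea extends to forecasting a variability series, such as a monthly standard deviation, by feeding that series in instead.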

568
00:51:40,600 --> 00:51:47,720
And yes, you can get quite accurate, surprisingly accurate forecasts for the future.

569
00:51:47,720 --> 00:51:56,360
So I always like to say it's a heck of a lot better having a guesstimate of the future

570
00:51:56,360 --> 00:52:00,040
than no estimate of the future.

571
00:52:00,040 --> 00:52:08,400
So definitely join us if you want, because not only will we be forecasting the average

572
00:52:08,400 --> 00:52:17,200
concentration in aggregate and by individual processor, we'll also get to forecast their

573
00:52:17,200 --> 00:52:25,640
variability, which is a more advanced statistical technique, but we'll do it simply and easily

574
00:52:25,640 --> 00:52:28,360
because that's the way things should be done.

575
00:52:28,360 --> 00:52:32,060
And we can get a nice forecast for people's variability.

576
00:52:32,060 --> 00:52:39,440
And that way we can rank order all of the processors and see, OK, who's got the highest

577
00:52:39,440 --> 00:52:41,680
average concentration week by week?

578
00:52:41,680 --> 00:52:46,280
Who's got the lowest standard deviation week by week?

579
00:52:46,280 --> 00:52:51,140
And then see how the industry is moving.

580
00:52:51,140 --> 00:52:56,180
So is there any trend in average concentrations?

581
00:52:56,180 --> 00:52:58,920
Are people getting more efficient over time?

582
00:52:58,920 --> 00:53:00,840
We'll be able to answer all those questions.

583
00:53:00,840 --> 00:53:02,440
It will be exciting.

584
00:53:02,440 --> 00:53:04,480
OK, thank you.

585
00:53:04,480 --> 00:53:05,480
Definitely.

586
00:53:05,480 --> 00:53:09,360
And if you can't make it on Saturday, just feel free to register.

587
00:53:09,360 --> 00:53:13,560
It's $1 and I'll send you all the source code and material.

588
00:53:13,560 --> 00:53:18,240
So I hope your value will be tenfold, if not more.

589
00:53:18,240 --> 00:53:22,360
So check it out, Saturday morning statistics.

590
00:53:22,360 --> 00:53:28,320
If not, then next Wednesday we'll pick up with producer cultivators.

591
00:53:28,320 --> 00:53:33,280
So we'll look more at the cultivation side next time, maybe kind of recap on what we

592
00:53:33,280 --> 00:53:34,760
did today.

593
00:53:34,760 --> 00:53:37,160
So it's extraordinary work.

594
00:53:37,160 --> 00:53:44,600
So if you have any breakthrough discoveries, questions along the way, ideas, feel free

595
00:53:44,600 --> 00:53:50,320
to share them because it would be awesome to have a collaborative effort to really dig

596
00:53:50,320 --> 00:53:57,840
deep into this data because there are so many questions to answer and not enough time.

597
00:53:57,840 --> 00:53:58,840
We'll do it.

598
00:53:58,840 --> 00:54:03,880
We'll do a lot though.

599
00:54:03,880 --> 00:54:05,100
Thank you all for coming.

600
00:54:05,100 --> 00:54:07,760
So I can't thank you enough for coming.

601
00:54:07,760 --> 00:54:10,960
So it's been fun.

602
00:54:10,960 --> 00:54:16,960
And hopefully we were able to get through some of the internet lags there and maybe

603
00:54:16,960 --> 00:54:18,560
you found it worthwhile.

604
00:54:18,560 --> 00:54:20,680
So definitely feel free to share some of your feedback.

605
00:54:20,680 --> 00:54:24,320
I'm always here if you want to send a message.

606
00:54:24,320 --> 00:54:25,880
Awesome.

607
00:54:25,880 --> 00:54:26,880
Thank you, Keegan.

608
00:54:26,880 --> 00:54:27,880
Awesome.

609
00:54:27,880 --> 00:54:28,880
Thank you all.

610
00:54:28,880 --> 00:54:56,960
Let's get in stone and have an awesome week!