1
00:00:00,000 --> 00:00:07,880
Welcome to the Cannabis Data Science Meetup Group.

2
00:00:07,880 --> 00:00:10,480
As always, you're in for a treat today.

3
00:00:10,480 --> 00:00:14,600
We're going to crunch a lot of cool new cannabis data.

4
00:00:14,600 --> 00:00:20,380
Finally got our paws on the Washington State data for 2022.

5
00:00:20,380 --> 00:00:29,960
This spans, I believe, January 22nd or so of 2022 through the end of November,

6
00:00:29,960 --> 00:00:33,200
the beginning of December of 2022.

7
00:00:33,200 --> 00:00:36,600
We have almost a year's worth of lab results.

8
00:00:36,600 --> 00:00:38,440
I'll share these with you.

9
00:00:38,440 --> 00:00:40,240
I've been curating them.

10
00:00:40,240 --> 00:00:48,680
There's maybe between 48 and 52,000 lab tests, lab samples.

11
00:00:48,680 --> 00:00:52,760
That's almost 2 million analytes that were tested.

12
00:00:52,760 --> 00:01:00,720
Really exciting, one of the largest sets of lab results, and it's a complete population,

13
00:01:00,720 --> 00:01:03,200
which makes it real fruitful for analysis.

14
00:01:03,200 --> 00:01:10,480
Instead of having a potentially non-random sample, we have each and every lab test that

15
00:01:10,480 --> 00:01:15,280
was conducted in Washington State, which gives us a lot of statistical power.

16
00:01:15,280 --> 00:01:20,680
Without further ado, I'll go ahead and just give you a brief presentation of what I'm

17
00:01:20,680 --> 00:01:23,320
interested in researching.

18
00:01:23,320 --> 00:01:29,440
As we may know, survey data may not be the most reliable data in the world because it

19
00:01:29,440 --> 00:01:32,960
relies on people to self-report.

20
00:01:32,960 --> 00:01:41,240
There's a lot of literature in psychology and in economics that suggests that people

21
00:01:41,240 --> 00:01:47,960
may not necessarily have the incentive to be honest with what they tell you.

22
00:01:47,960 --> 00:01:54,040
What they teach you in economics is look at people's actions rather than what they may

23
00:01:54,040 --> 00:01:55,040
say.

24
00:01:55,040 --> 00:01:58,320
People's actions hopefully will reveal their preferences.

25
00:01:58,320 --> 00:02:04,960
In this case, we're interested in many things.

26
00:02:04,960 --> 00:02:08,880
One of the things we started talking about in the past few weeks was contaminants.

27
00:02:08,880 --> 00:02:11,560
We started looking at microbes.

28
00:02:11,560 --> 00:02:15,600
In the past, we've looked at residual solvents.

29
00:02:15,600 --> 00:02:23,920
Now that we have 2022 data, pesticide testing was mandated in Washington State in April

30
00:02:23,920 --> 00:02:25,600
of 2022.

31
00:02:25,600 --> 00:02:33,680
Before that, other states have pesticide testing mandated, California, Oklahoma, perhaps I

32
00:02:33,680 --> 00:02:36,120
want to say Massachusetts.

33
00:02:36,120 --> 00:02:40,920
Most states already had pesticide testing mandated, but as we know, we don't have the

34
00:02:40,920 --> 00:02:44,120
best access to data in these states.

35
00:02:44,120 --> 00:02:51,680
For example, in California, the best we know about which pesticides cultivators are using is

36
00:02:51,680 --> 00:02:53,680
this self-reported survey.

37
00:02:53,680 --> 00:03:01,200
As we can see here, there's only 59 growers, so this leaves us wanting for more information.

38
00:03:01,200 --> 00:03:03,720
This was four years ago.

39
00:03:03,720 --> 00:03:05,120
It's been some time now.

40
00:03:05,120 --> 00:03:08,000
We've got some new data and some new techniques.

41
00:03:08,000 --> 00:03:13,720
We may not replicate this per se, but let's see if we can come close to

42
00:03:13,720 --> 00:03:19,880
replicating an analysis of pesticides being used in cannabis.

43
00:03:19,880 --> 00:03:25,320
As always, at the Cannabis Data Science Meetup Group, we're here to get our hands on the

44
00:03:25,320 --> 00:03:26,320
actual data.

45
00:03:26,320 --> 00:03:33,200
As I said, I was curating these data points for a cannabis data project.

46
00:03:33,200 --> 00:03:37,280
This is open source, and you can get your hands on this too.

47
00:03:37,280 --> 00:03:43,960
There's one thing that I want to iron out, and then I'll get this sent to you.

48
00:03:43,960 --> 00:03:52,520
Essentially, what I want to iron out, when I first did the first pass, my first pass

49
00:03:52,520 --> 00:03:57,360
just curated data from January to the end of November.

50
00:03:57,360 --> 00:04:04,800
It only curated around 48,000 unique inventory items that were tested.

51
00:04:04,800 --> 00:04:13,360
In the second pass, I collected all the lab results from January to December 7th, and

52
00:04:13,360 --> 00:04:18,240
that puts us around 52,000 lab results.

53
00:04:18,240 --> 00:04:19,920
It's not 100%.

54
00:04:19,920 --> 00:04:27,220
The samples don't overlap perfectly, so I want to rerun this and get a better merge

55
00:04:27,220 --> 00:04:29,080
of this data for you.

56
00:04:29,080 --> 00:04:38,040
However, right now, we do have more than 48,000 inventory items that were tested, so this

57
00:04:38,040 --> 00:04:43,640
can get us a good start at the very least.

58
00:04:43,640 --> 00:04:46,800
First things first, let's look at this data.

59
00:04:46,800 --> 00:04:50,500
What do we have here?

60
00:04:50,500 --> 00:04:57,480
These are the different types of products that were tested in Washington State in 2022.

61
00:04:57,480 --> 00:05:09,440
As we can see, flower lots were the bulk of testing, with just shy of 25,000 lots of cannabis

62
00:05:09,440 --> 00:05:11,080
flower being tested.

63
00:05:11,080 --> 00:05:13,900
Second, hydrocarbon concentrates.

64
00:05:13,900 --> 00:05:20,480
This appears to be the most popular type of concentrates, with more than 6,000 inventory

65
00:05:20,480 --> 00:05:22,680
items tested.

66
00:05:22,680 --> 00:05:31,520
Surprisingly, edibles were higher on this list than I anticipated.

67
00:05:31,520 --> 00:05:35,840
As you can see, there's different types of products for sure.

68
00:05:35,840 --> 00:05:43,880
However, for today, I thought maybe we could just look at the bulk, the flower lots, and

69
00:05:43,880 --> 00:05:50,800
the hydrocarbon concentrates, just to pick two categories, just to keep things relatively

70
00:05:50,800 --> 00:05:52,300
simple.

71
00:05:52,300 --> 00:05:56,960
This can be generalized to more sample types.

72
00:05:56,960 --> 00:05:58,640
This is a nice starting point.

73
00:05:58,640 --> 00:06:00,760
What are we after?

74
00:06:00,760 --> 00:06:03,440
What does this data even look like?

75
00:06:03,440 --> 00:06:12,440
I'll just show you a random sample here, just to show you all the cool data points we have.

76
00:06:12,440 --> 00:06:14,440
There's a lot of data for each of these.

77
00:06:14,440 --> 00:06:17,480
I'll just show you some of the cool things that we know.

78
00:06:17,480 --> 00:06:21,420
Also, I may have mislabeled this as the retailer.

79
00:06:21,420 --> 00:06:24,680
This may actually be the producer.

80
00:06:24,680 --> 00:06:27,500
I want to go through and double-check this.

81
00:06:27,500 --> 00:06:29,560
We know the name.

82
00:06:29,560 --> 00:06:33,600
This was cherry sorbet.

83
00:06:33,600 --> 00:06:35,200
We know the inventory type.

84
00:06:35,200 --> 00:06:37,680
This is cherry sorbet flower.

85
00:06:37,680 --> 00:06:42,160
This one coincidentally failed lab testing.

86
00:06:42,160 --> 00:06:46,520
We've got the results here.

87
00:06:46,520 --> 00:06:53,680
We should be able to dig in to the results to see why it failed testing.

88
00:06:53,680 --> 00:07:02,280
This one may have failed for microbes because there were no pesticides or residual solvents

89
00:07:02,280 --> 00:07:03,280
detected.

90
00:07:03,280 --> 00:07:08,160
Long story short, we've got some nicely formatted data.

91
00:07:08,160 --> 00:07:10,440
We'll just keep it visual.

92
00:07:10,440 --> 00:07:16,560
I'll just show you real quick the various pesticides that were detected in cannabis

93
00:07:16,560 --> 00:07:20,960
flower in Washington state in 2022.

94
00:07:20,960 --> 00:07:29,800
As you can see, piperonyl butoxide takes the lead with more than 600 detections.

95
00:07:29,800 --> 00:07:34,240
Pesticides are notoriously hard to pronounce.

96
00:07:34,240 --> 00:07:44,240
As I mentioned in prior weeks, peculiar names, there may be more magic to the show than one

97
00:07:44,240 --> 00:07:46,200
may think.

98
00:07:46,200 --> 00:07:54,600
For example, I've heard a chemist make the point that one of the reasons why these chemicals

99
00:07:54,600 --> 00:08:02,320
may have odd names is for comprehension and memory retention.

100
00:08:02,320 --> 00:08:12,520
It's going to be hard for you to recognize, say, and remember imidacloprid.

101
00:08:12,520 --> 00:08:16,160
I probably mispronounced that.

102
00:08:16,160 --> 00:08:29,160
Just to show you a bit about these pesticides here, I thought I had one of them pulled up.

103
00:08:29,160 --> 00:08:33,400
Here's piperonyl butoxide.

104
00:08:33,400 --> 00:08:37,400
This was what I was searching for in the prior week.

105
00:08:37,400 --> 00:08:40,400
This I think is, what's the name of this?

106
00:08:40,400 --> 00:08:43,080
A molecular structure.

107
00:08:43,080 --> 00:08:47,900
This is a way that you can depict chemical compounds.

108
00:08:47,900 --> 00:08:56,440
As I was telling you, I'm doing my best to read up on organic chemistry and probably

109
00:08:56,440 --> 00:09:04,360
know less than your average high school chemistry student.

110
00:09:04,360 --> 00:09:10,080
From what I'm gathering, this is just a simplified diagram.

111
00:09:10,080 --> 00:09:12,440
This is organic chemistry.

112
00:09:12,440 --> 00:09:18,700
The fundamental building block of organic chemistry is carbon.

113
00:09:18,700 --> 00:09:22,260
This is what life forms are made out of.

114
00:09:22,260 --> 00:09:30,800
It's real interesting how life works because you just have these fundamental building blocks.

115
00:09:30,800 --> 00:09:41,280
You've got carbon, which are essentially depicted from my understanding by these various lines.

116
00:09:41,280 --> 00:09:47,360
I'm going to butcher the interpretation of this, but from my understanding, these lines

117
00:09:47,360 --> 00:09:53,800
represent carbon bonding to oxygen.

118
00:09:53,800 --> 00:10:02,640
Instead of having carbon bonding to hydrogen or oxygen.

119
00:10:02,640 --> 00:10:04,640
Like I said, I'm butchering this.

120
00:10:04,640 --> 00:10:13,080
I'm not going to be able to think of this on the fly, but here you can see the white

121
00:10:13,080 --> 00:10:17,480
atoms, I forget if it's hydrogen or oxygen, aren't depicted.

122
00:10:17,480 --> 00:10:20,600
It must be hydrogen, I think.

123
00:10:20,600 --> 00:10:26,440
Yes, because see here at the end, this is H3.

124
00:10:26,440 --> 00:10:30,080
So three white balls and one gray ball.

125
00:10:30,080 --> 00:10:33,600
So I think that's three hydrogens and a carbon.

126
00:10:33,600 --> 00:10:37,840
And then the red must be the oxygen.

127
00:10:37,840 --> 00:10:41,000
So here you can see the red depicted as O's.

128
00:10:41,000 --> 00:10:50,240
So long story short, you've got these organic compounds and they come in various degrees

129
00:10:50,240 --> 00:10:57,900
of complexity, and add a few more hydrogen atoms or carbon atoms and these can

130
00:10:57,900 --> 00:11:00,760
turn into two different substances.

131
00:11:00,760 --> 00:11:08,720
So I think that's also part of the magic that's happening behind the curtain in that people

132
00:11:08,720 --> 00:11:14,200
may develop slightly different pesticides.

133
00:11:14,200 --> 00:11:23,520
So they may develop a pesticide that's similar to trifloxystrobin, but it's slightly different.

134
00:11:23,520 --> 00:11:26,680
They'll give it a different chemical name.

135
00:11:26,680 --> 00:11:32,960
Say trifloxystrobin's regulated, the new molecule is not.

136
00:11:32,960 --> 00:11:39,720
It can be a way to still use pesticides while avoiding detection.

137
00:11:39,720 --> 00:11:49,680
So I've heard pesticide screening be described as a cat and mouse game where they, and it's

138
00:11:49,680 --> 00:11:52,240
not just cannabis cultivators.

139
00:11:52,240 --> 00:11:56,600
This is, I think, could be an issue in agriculture in general.

140
00:11:56,600 --> 00:12:01,700
People engaged in agriculture, we've talked about this in the past, contaminants are of

141
00:12:01,700 --> 00:12:03,120
top concern.

142
00:12:03,120 --> 00:12:12,400
So they're always chasing new innovative ways to fight off pests and disease and molds and

143
00:12:12,400 --> 00:12:13,400
fungi.

144
00:12:13,400 --> 00:12:16,760
And so they're coming up with new chemicals.

145
00:12:16,760 --> 00:12:25,600
These chemicals are then being studied by toxicologists and then regulators will put

146
00:12:25,600 --> 00:12:33,120
limits on these and then say they'll be screened for in cannabis and then people will start

147
00:12:33,120 --> 00:12:37,240
using a new substance if they're failing a lot.

148
00:12:37,240 --> 00:12:40,280
So this is just something to watch for.

149
00:12:40,280 --> 00:12:46,960
Okay, so we haven't done too much yet except just describe the data.

150
00:12:46,960 --> 00:12:49,440
And that's often a useful first step.

151
00:12:49,440 --> 00:12:55,380
For example, this data, it's all publicly available and sitting right there in front

152
00:12:55,380 --> 00:12:56,380
of us.

153
00:12:56,380 --> 00:13:05,440
However, it took a good bit of curation simply to be able to count the number of pesticides

154
00:13:05,440 --> 00:13:08,240
that were detected.

155
00:13:08,240 --> 00:13:17,160
Well, not only can we count them, but we also know what type of product.

156
00:13:17,160 --> 00:13:25,720
So right now this isn't differentiating between, say, flower and hydrocarbon concentrates.

157
00:13:25,720 --> 00:13:34,680
Well, I'm going to show you a nifty way that you can model this data.

158
00:13:34,680 --> 00:13:44,600
Okay, so we've talked about how cultivators may want to view the risk, the chance of failing

159
00:13:44,600 --> 00:13:48,600
as a business risk and to factor that in.

160
00:13:48,600 --> 00:14:02,240
So for example, if you're cultivating flower and you value your flower at, I forget what

161
00:14:02,240 --> 00:14:08,560
the going rate is, but it's $1,000 a pound or $2,000 a pound.

162
00:14:08,560 --> 00:14:13,600
So that's your value of a pound of flower.

163
00:14:13,600 --> 00:14:17,040
I think they changed the flower lot sizes.

164
00:14:17,040 --> 00:14:23,360
And in fact, I saved them, but the old ones were a five pound lot.

165
00:14:23,360 --> 00:14:25,000
And so I think they changed them.

166
00:14:25,000 --> 00:14:27,920
I think some lots are up to 50 pounds now.

167
00:14:27,920 --> 00:14:33,080
But let's say you've got a five pound lot.

168
00:14:33,080 --> 00:14:41,320
We'll be parsimonious and say each pound is valued to you at $1,000.

169
00:14:41,320 --> 00:14:49,640
So a five pound lot has a value to you, the cultivator, of $5,000.

170
00:14:49,640 --> 00:14:51,600
Okay.

171
00:14:51,600 --> 00:15:01,280
Well, if we were just going to do a simple count here, we know, let's just do a back

172
00:15:01,280 --> 00:15:03,880
of the envelope here real quick, right?

173
00:15:03,880 --> 00:15:07,200
So there's 600 give or take.

174
00:15:07,200 --> 00:15:20,000
Actually, let's find the exact number, just for pesticides and piperonyl butoxide.

175
00:15:20,000 --> 00:15:21,000
Okay.

176
00:15:21,000 --> 00:15:22,000
644.

177
00:15:22,000 --> 00:15:23,000
Okay.

178
00:15:23,000 --> 00:15:32,960
Well, there are 48,000 observations, and 644 of them failed

179
00:15:32,960 --> 00:15:35,400
for piperonyl butoxide.

180
00:15:35,400 --> 00:15:41,760
So if you were just going to do a simple back of the envelope calculation, you could just

181
00:15:41,760 --> 00:15:46,240
say, oh, well, I know how many failed for piperonyl butoxide.

182
00:15:46,240 --> 00:15:48,560
I know the total amount tested.

183
00:15:48,560 --> 00:15:57,220
Well, that's around 1.3% that are failing for piperonyl butoxide.

184
00:15:57,220 --> 00:16:08,360
So if you have a flower lot valued at $5,000, your risk is $66.

185
00:16:08,360 --> 00:16:16,960
So that's a $66 cost, expected cost of pesticide screening.

186
00:16:16,960 --> 00:16:26,120
And so $66 out of 5,000, right?

187
00:16:26,120 --> 00:16:29,020
We backed out the failure rate there.

188
00:16:29,020 --> 00:16:36,760
So pesticide screening imposes around a 1.3% cost.

189
00:16:36,760 --> 00:16:38,240
This is an unconditional cost.

190
00:16:38,240 --> 00:16:40,400
We haven't conditioned it on anything.

191
00:16:40,400 --> 00:16:43,620
We haven't conditioned on the product type.

192
00:16:43,620 --> 00:16:48,360
We haven't conditioned on where the grower may be located.

193
00:16:48,360 --> 00:16:51,460
So this is completely unconditional.

194
00:16:51,460 --> 00:17:00,280
The cultivator has an unconditional risk of 1.3% if they're growing a $5,000 lot.

195
00:17:00,280 --> 00:17:03,200
That's a $66 cost.

196
00:17:03,200 --> 00:17:10,240
It may not seem large, but margins are tight in the cannabis industry.

197
00:17:10,240 --> 00:17:12,320
And this adds up.

198
00:17:12,320 --> 00:17:21,200
So for example, if all of a sudden maybe you're selling not just $5,000 worth of product,

199
00:17:21,200 --> 00:17:30,960
but maybe you're selling $500,000 worth of product here, well, all of a sudden, now that's

200
00:17:30,960 --> 00:17:34,840
a $6,500 cost.

201
00:17:34,840 --> 00:17:36,880
$6,500.

202
00:17:36,880 --> 00:17:39,200
That's non-negligible.
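
The back-of-the-envelope calculation above can be sketched in a few lines. This is only an illustration: it uses the rounded total of 48,000 samples and the 644 detections quoted in the talk, so the results land near, not exactly on, the spoken figures. The function name `expected_cost` is made up for this example.

```python
def expected_cost(detections: int, samples: int, value_at_risk: float) -> float:
    """Unconditional detection rate multiplied by the value at risk."""
    if samples <= 0:
        raise ValueError("samples must be positive")
    return detections / samples * value_at_risk


# Figures quoted in the talk; 48,000 is a rounded total, not an exact count.
DETECTIONS = 644
SAMPLES = 48_000

rate = DETECTIONS / SAMPLES
lot_cost = expected_cost(DETECTIONS, SAMPLES, 5_000)
revenue_cost = expected_cost(DETECTIONS, SAMPLES, 500_000)

print(f"unconditional rate: {rate:.2%}")
print(f"expected cost on a $5,000 lot: ${lot_cost:,.0f}")
print(f"expected cost on $500,000 of product: ${revenue_cost:,.0f}")
```

The cost scales linearly with the value at risk, which is why a figure that looks small on one lot becomes noticeable at the level of annual revenue.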

203
00:17:39,200 --> 00:17:51,320
So if, say, your revenue is $500,000, it's a decent sized cost.

204
00:17:51,320 --> 00:17:56,720
And if all of a sudden you're faced with this cost and it's going to be lumpy, so that's

205
00:17:56,720 --> 00:18:00,240
assuming this is all evened out.

206
00:18:00,240 --> 00:18:07,520
Well, this is going to be a lumpy cost in that you may not fail pesticide screening

207
00:18:07,520 --> 00:18:17,160
for a long time, and then one month you fail and you have to now bear that entire cost

208
00:18:17,160 --> 00:18:19,160
in that one month.

209
00:18:19,160 --> 00:18:26,120
So this is what I'm trying to drive home in that just want to bear in mind that there

210
00:18:26,120 --> 00:18:27,120
is a cost.

211
00:18:27,120 --> 00:18:30,960
And you can minimize this cost.

212
00:18:30,960 --> 00:18:33,880
So for example, this is unconditional.

213
00:18:33,880 --> 00:18:41,600
Maybe we can find some conditions that may lower your probability of failing pesticide

214
00:18:41,600 --> 00:18:42,600
testing.

215
00:18:42,600 --> 00:18:46,000
And maybe you can manage those conditions.

216
00:18:46,000 --> 00:18:55,440
In economics, we learn that profit maximization is equivalent to cost minimization.

217
00:18:55,440 --> 00:18:59,820
And that's what I often try to stress to people in business.

218
00:18:59,820 --> 00:19:08,240
So many times people are worried about maximizing their profit, just trying to get as much money,

219
00:19:08,240 --> 00:19:10,720
as much revenue as possible.

220
00:19:10,720 --> 00:19:14,040
And they don't take into consideration their costs.

221
00:19:14,040 --> 00:19:24,360
And I can see people run awry with this, run astray with this.

222
00:19:24,360 --> 00:19:32,800
And an effective way to achieve the same result would just be to have a laser focus on keeping

223
00:19:32,800 --> 00:19:38,480
your costs to a minimum while engaging in as much business as you can.

224
00:19:38,480 --> 00:19:39,480
Cool.

225
00:19:39,480 --> 00:19:41,600
So I have droned on enough about that.

226
00:19:41,600 --> 00:19:45,520
Let's get back to some charts and actually some cool statistics here.

227
00:19:45,520 --> 00:19:50,640
So now I'm going to introduce conditional statistics.

228
00:19:50,640 --> 00:19:54,440
And these are the things I tell you to start with, right?

229
00:19:54,440 --> 00:19:58,360
Start with simple statistics.

230
00:19:58,360 --> 00:20:00,400
So these are counts.

231
00:20:00,400 --> 00:20:03,960
These are actually conditional counts.

232
00:20:03,960 --> 00:20:07,880
These are counts by analyte.

233
00:20:07,880 --> 00:20:10,280
So those are conditional statistics.

234
00:20:10,280 --> 00:20:13,840
A count, a total is a statistic.

235
00:20:13,840 --> 00:20:19,440
A count by an analyte is a conditional statistic.
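
As a small illustration of the difference between a total and a conditional count, here is a sketch using only the Python standard library. The records are hypothetical and stand in for the curated lab results; they are not the Washington State data.

```python
from collections import Counter

# Hypothetical detection records: (product type, analyte), one per detection.
records = [
    ("flower", "piperonyl butoxide"),
    ("flower", "imidacloprid"),
    ("hydrocarbon concentrate", "piperonyl butoxide"),
    ("hydrocarbon concentrate", "piperonyl butoxide"),
    ("flower", "trifloxystrobin"),
]

# A total is an unconditional statistic.
total = len(records)

# A count by analyte is a conditional statistic.
by_analyte = Counter(analyte for _, analyte in records)

# Adding a condition: count by product type and analyte together.
by_type_and_analyte = Counter(records)

print(total)
print(by_analyte.most_common())
print(by_type_and_analyte.most_common())
```

Each extra condition is just one more key in the grouping, which is what "you can just keep adding conditions" amounts to in code.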

236
00:20:19,440 --> 00:20:25,000
They're amazingly powerful, interesting, and simple.

237
00:20:25,000 --> 00:20:28,440
And then you can just keep adding conditions.

238
00:20:28,440 --> 00:20:35,680
Well, as I alluded to earlier, perhaps product type matters.

239
00:20:35,680 --> 00:20:39,080
Well, how are we going to model this?

240
00:20:39,080 --> 00:20:49,360
The way I've encouraged people to go about modeling simple data in the past is binary.

241
00:20:49,360 --> 00:20:51,120
Zero, one.

242
00:20:51,120 --> 00:20:56,960
This is really going back to the basics in statistics.

243
00:20:56,960 --> 00:21:01,720
I'll be uploading all the binary statistics episodes shortly.

244
00:21:01,720 --> 00:21:03,120
So those will be coming.

245
00:21:03,120 --> 00:21:10,880
This is just a clever way of modeling the data that opens it up for clever statistical

246
00:21:10,880 --> 00:21:11,880
exercises.

247
00:21:11,880 --> 00:21:12,880
OK.

248
00:21:12,880 --> 00:21:14,480
So what are we looking at here?

249
00:21:14,480 --> 00:21:24,840
So we're seeing that a little less than 48,000 of these samples had no piperonyl

250
00:21:24,840 --> 00:21:27,920
butoxide detected.

251
00:21:27,920 --> 00:21:32,120
And a little more than... something went awry in my counts here.

252
00:21:32,120 --> 00:21:34,720
But I'm just going to power on.

253
00:21:34,720 --> 00:21:40,760
And I want to encourage you to come back through and double check my work like a good

254
00:21:40,760 --> 00:21:41,760
scientist would.

255
00:21:41,760 --> 00:21:51,040
And so 600 of these are coded as one, meaning that piperonyl butoxide was detected.

256
00:21:51,040 --> 00:21:52,840
Fantastic.

257
00:21:52,840 --> 00:22:03,480
Well, when you have simple binary data, zero or one, you can estimate a probit model.

258
00:22:03,480 --> 00:22:12,680
And this would basically predict the probability of being zero versus the probability of being

259
00:22:12,680 --> 00:22:20,080
one conditional on your explanatory variables.
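
For reference, the probit model described here is usually written as follows. This is standard textbook notation rather than anything shown in the talk: $y$ is the 0/1 detection indicator, $x$ the vector of explanatory variables (including a constant), $\beta$ the coefficients, and $\Phi$ the standard normal cumulative distribution function.

```latex
\Pr(y = 1 \mid x) = \Phi\!\left(x^{\top}\beta\right),
\qquad
\Pr(y = 0 \mid x) = 1 - \Phi\!\left(x^{\top}\beta\right)
```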

260
00:22:20,080 --> 00:22:22,240
This is exactly what we want.

261
00:22:22,240 --> 00:22:32,720
We want to know the probability that piperonyl butoxide will be detected conditional on the

262
00:22:32,720 --> 00:22:34,600
sample type.

263
00:22:34,600 --> 00:22:43,360
We already know that the unconditional probability of piperonyl butoxide being detected is around

264
00:22:43,360 --> 00:22:44,360
1.3%.

265
00:22:44,360 --> 00:22:50,280
Well, what's the probability in flower?

266
00:22:50,280 --> 00:22:54,400
What's the probability in hydrocarbon concentrates?

267
00:22:54,400 --> 00:23:00,080
You could do this again by just taking another conditional count.

268
00:23:00,080 --> 00:23:04,800
I'm simply introducing the probit model to you as a new...

269
00:23:04,800 --> 00:23:10,920
Well, we've already discovered this tool, but it's another tool that you can use to

270
00:23:10,920 --> 00:23:14,040
calculate conditional probabilities.

271
00:23:14,040 --> 00:23:20,120
And it's a tool that we can load up with explanatory variables.

272
00:23:20,120 --> 00:23:26,760
And I will save the big takeaway for the end of the day.

273
00:23:26,760 --> 00:23:31,600
I'll tease that the magic is going to happen with those explanatory variables.

274
00:23:31,600 --> 00:23:36,200
Now, I'm isolating a subsample here.

275
00:23:36,200 --> 00:23:43,720
So essentially, just want to look at flower lots and hydrocarbon concentrates.

276
00:23:43,720 --> 00:23:44,920
Why?

277
00:23:44,920 --> 00:23:46,880
Just to keep things simple.

278
00:23:46,880 --> 00:23:51,960
There's no reason why we couldn't look at more inventory types.

279
00:23:51,960 --> 00:23:57,440
These were just the two most frequent inventory types.

280
00:23:57,440 --> 00:24:00,480
And so just wanted to use them for a demonstration.

281
00:24:00,480 --> 00:24:02,840
Okay, enough of me rambling on.

282
00:24:02,840 --> 00:24:04,720
Let's estimate this model.

283
00:24:04,720 --> 00:24:07,800
So it's real simple to do.

284
00:24:07,800 --> 00:24:13,960
Well, and this is actually why I encourage you, if you really want to cut your teeth

285
00:24:13,960 --> 00:24:23,040
programming and learn statistics as intimately as possible, then I would challenge you to

286
00:24:23,040 --> 00:24:26,080
code a probit model.

287
00:24:26,080 --> 00:24:31,760
Because here we're just using one from the statsmodels package.

288
00:24:31,760 --> 00:24:34,640
And we can do it in one line of code.

289
00:24:34,640 --> 00:24:41,040
That's phenomenal, because now we can add this to our favorite programming algorithms

290
00:24:41,040 --> 00:24:48,320
and we could calculate thousands upon thousands of probit models in a short fashion.

291
00:24:48,320 --> 00:24:52,080
But this is a useful exercise.

292
00:24:52,080 --> 00:24:58,360
You'll have to hit, say Wikipedia or your favorite statistics textbook.

293
00:24:58,360 --> 00:25:08,560
And it's a non-trivial task, but it's completely in the realm of possible.

294
00:25:08,560 --> 00:25:13,600
And if you're up for the challenge, it's more of an exercise, right?

295
00:25:13,600 --> 00:25:21,640
Because at the end of the day, I would recommend using the statsmodels probit model, because

296
00:25:21,640 --> 00:25:27,920
one thing I love about open source is when it's used by many, many thousands and thousands,

297
00:25:27,920 --> 00:25:33,600
perhaps millions of people, then it becomes robust.

298
00:25:33,600 --> 00:25:40,160
And perhaps little quirks get discovered that you couldn't figure out by yourself.

299
00:25:40,160 --> 00:25:46,160
So the statsmodels probit model would probably be more robust than any probit

300
00:25:46,160 --> 00:25:48,480
model that you may be able to write.

301
00:25:48,480 --> 00:25:51,120
You could probably write a better one, I'm sure.

302
00:25:51,120 --> 00:25:55,120
But as I said, it's a good exercise in both programming and statistics.

303
00:25:55,120 --> 00:25:56,840
And you can then double check it.

304
00:25:56,840 --> 00:26:01,040
You can basically double check that the probit model you wrote is equivalent.
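
For anyone taking up the challenge, here is a minimal sketch of a hand-coded probit for the simplest case discussed here: a constant plus one dummy variable. It uses Fisher scoring on grouped counts and only the standard library. The counts are hypothetical, and `fit_probit_dummy` is a made-up name; this is not the code from the talk. With a single dummy the fitted probabilities should match the raw group rates, which gives a built-in check before comparing against a library implementation.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def fit_probit_dummy(groups, iterations=50):
    """Fit P(y = 1 | x) = Phi(b0 + b1 * x) by Fisher scoring.

    `groups` is a list of (x, n, k) tuples: dummy value x in {0, 1},
    n observations in the group, k of them coded 1. Both dummy values
    must be present and each group needs 0 < k < n.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        s0 = s1 = 0.0          # score: gradient of the log-likelihood
        i00 = i01 = i11 = 0.0  # expected information matrix entries
        for x, n, k in groups:
            eta = b0 + b1 * x
            cdf = STD_NORMAL.cdf(eta)
            pdf = STD_NORMAL.pdf(eta)
            grad = k * pdf / cdf - (n - k) * pdf / (1.0 - cdf)
            weight = n * pdf * pdf / (cdf * (1.0 - cdf))
            s0 += grad
            s1 += grad * x
            i00 += weight
            i01 += weight * x
            i11 += weight * x * x
        # Solve the 2x2 system (information) * step = score.
        det = i00 * i11 - i01 * i01
        step0 = (i11 * s0 - i01 * s1) / det
        step1 = (i00 * s1 - i01 * s0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < 1e-12:
            break
    return b0, b1


# Hypothetical counts, not the talk's data: (dummy, samples, detections).
groups = [(0, 25_000, 250), (1, 6_000, 180)]
b0, b1 = fit_probit_dummy(groups)

print(f"const = {b0:.4f}, hydrocarbon dummy = {b1:.4f}")
print(f"fitted P(detect | flower)      = {STD_NORMAL.cdf(b0):.4f}")
print(f"fitted P(detect | concentrate) = {STD_NORMAL.cdf(b0 + b1):.4f}")
```

A model with continuous regressors would need a general matrix solve rather than the hard-coded 2x2 inverse used here.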

305
00:26:01,040 --> 00:26:03,720
So that's just a tangent on programming.

306
00:26:03,720 --> 00:26:04,840
Here it is.

307
00:26:04,840 --> 00:26:10,520
And I'll just talk about the model and then show you the actual takeaway.

308
00:26:10,520 --> 00:26:18,720
So when you estimate a probit model, OK, so we've got our 30,000 observations.

309
00:26:18,720 --> 00:26:30,320
Our dependent variable, remember, that is going to be a binary, 0 or 1, mostly zeros.

310
00:26:30,320 --> 00:26:32,600
And in fact, that's actually a problem.

311
00:26:32,600 --> 00:26:35,920
And there may be better models out there.

312
00:26:35,920 --> 00:26:46,160
That's often a criticism in that if your zeros and ones aren't balanced, then that can lead

313
00:26:46,160 --> 00:26:47,160
to issues.

314
00:26:47,160 --> 00:26:51,160
But this can at least give us a good start.

315
00:26:51,160 --> 00:26:53,240
OK, so what do we have here?

316
00:26:53,240 --> 00:26:56,760
Well, we've got various coefficients here.

317
00:26:56,760 --> 00:27:04,000
And don't read too much into the coefficients alone, because the probit model, you actually

318
00:27:04,000 --> 00:27:06,240
have to evaluate it.

319
00:27:06,240 --> 00:27:11,200
The one thing that you can see is that this has a positive sign on it.

320
00:27:11,200 --> 00:27:17,980
So right out of the gate, we're thinking, OK, hydrocarbons may actually be detecting.

321
00:27:17,980 --> 00:27:23,000
We may be detecting piperonyl butoxide more often in hydrocarbon concentrates.

322
00:27:23,000 --> 00:27:25,920
OK, well, how much?

323
00:27:25,920 --> 00:27:27,480
Well, guess what?

324
00:27:27,480 --> 00:27:29,320
It's actually super easy.

325
00:27:29,320 --> 00:27:38,700
So the way you go about this is you find what's called the marginal probability of detecting for

326
00:27:38,700 --> 00:27:42,360
one category, and then you would do it for the other.

327
00:27:42,360 --> 00:27:47,680
And if these weren't categorical variables, then you would use the mean.

328
00:27:47,680 --> 00:27:55,840
So you would say, oh, what would be the probability of, say, your regressor was number of plants.

329
00:27:55,840 --> 00:28:03,960
What would be the probability of failing for piperonyl butoxide given the, say, average

330
00:28:03,960 --> 00:28:05,840
number of plants?

331
00:28:05,840 --> 00:28:08,880
And so you can find different marginal probabilities.

332
00:28:08,880 --> 00:28:11,760
OK, so enough of me talking.

333
00:28:11,760 --> 00:28:13,240
Let's look at the numbers here.

334
00:28:13,240 --> 00:28:16,720
And we've got, OK, what's the marginal probability of flower?

335
00:28:16,720 --> 00:28:23,840
Well, here, I'll just plot all of this out, because this way, we'll have a nice visualization

336
00:28:23,840 --> 00:28:25,200
while I'm talking about it.

337
00:28:25,200 --> 00:28:34,120
So all I did here was evaluate the model at the mean of each of the groups.

338
00:28:34,120 --> 00:28:39,880
So here, flower, we've just got a constant.

339
00:28:39,880 --> 00:28:43,560
And hydrocarbon concentrates is always 0.

340
00:28:43,560 --> 00:28:45,280
That's a dummy variable.

341
00:28:45,280 --> 00:28:50,160
And then, of course, hydrocarbon concentrates has a constant.

342
00:28:50,160 --> 00:28:52,840
And hydrocarbon concentrates is always 1.

343
00:28:52,840 --> 00:29:00,960
So really simple to calculate the marginal probabilities for dummy variables.
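
Evaluating the model for each group can be sketched like this. The coefficients below are hypothetical placeholders, not the estimates from the talk, and `predicted_probability` is a name invented for this example. The idea is only that the probability for each product type is the standard normal CDF applied to the constant plus the coefficient times the dummy.

```python
from statistics import NormalDist


def predicted_probability(const: float, coef: float, x: float) -> float:
    """Evaluate a fitted probit: P(y = 1 | x) = Phi(const + coef * x)."""
    return NormalDist().cdf(const + coef * x)


# Hypothetical coefficients for illustration only.
CONST = -2.3
HYDROCARBON_COEF = 0.4

# Flower: constant only, because the hydrocarbon dummy is 0.
p_flower = predicted_probability(CONST, HYDROCARBON_COEF, 0)

# Hydrocarbon concentrates: constant plus coefficient, because the dummy is 1.
p_concentrate = predicted_probability(CONST, HYDROCARBON_COEF, 1)

print(f"P(detect | flower)      = {p_flower:.4f}")
print(f"P(detect | concentrate) = {p_concentrate:.4f}")
```

For a continuous regressor, the same function could be evaluated at the regressor's mean instead of at 0 or 1.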

344
00:29:00,960 --> 00:29:07,360
As I said, it's a bit more complicated if, say, you want to find the average

345
00:29:07,360 --> 00:29:08,840
marginal probability.

346
00:29:08,840 --> 00:29:12,040
Or I may be butchering my language here.

347
00:29:12,040 --> 00:29:18,600
But if you want to find the marginal probability of a continuous variable, I want to say you'd

348
00:29:18,600 --> 00:29:20,000
use the mean.

349
00:29:20,000 --> 00:29:26,080
So let's just go ahead and interpret the results here, because I think this is an interesting

350
00:29:26,080 --> 00:29:30,960
takeaway that's non-obvious.

351
00:29:30,960 --> 00:29:39,960
If you talk to a chemist, kind of wanted to bounce this off of our good friends over at

352
00:29:39,960 --> 00:29:46,000
MCR Labs to see if this may be in line with what they've observed.

353
00:29:46,000 --> 00:29:52,920
So if any of you want to reach out to MCR Labs and see, hey, is this what you've observed

354
00:29:52,920 --> 00:29:58,840
or something similar with piperonyl butoxide in cannabis?

355
00:29:58,840 --> 00:30:07,480
Because I think this is an interesting insight that is non-obvious and requires a good bit

356
00:30:07,480 --> 00:30:15,280
of data curation as well as clever conditional statistics to get to.

357
00:30:15,280 --> 00:30:20,360
And I think these are the types of insights that people in the cannabis industry could

358
00:30:20,360 --> 00:30:27,440
really benefit from, and they may be completely oblivious to the fact that they can get their

359
00:30:27,440 --> 00:30:29,440
hands on these statistics.

360
00:30:29,440 --> 00:30:30,440
So what is this?

361
00:30:30,440 --> 00:30:40,740
So this is now the conditional probability of piperonyl butoxide being detected in your

362
00:30:40,740 --> 00:30:41,740
cannabis flower.

363
00:30:41,740 --> 00:30:49,680
Ooh, I also want to walk back that cost I was talking about earlier.

364
00:30:49,680 --> 00:30:57,120
This is just the probability of piperonyl butoxide being detected.

365
00:30:57,120 --> 00:31:03,840
You may not necessarily fail quality assurance just because it's detected.

366
00:31:03,840 --> 00:31:14,240
So depending on what the cultivator's internal quality control is and the state limits, they

367
00:31:14,240 --> 00:31:22,800
may not actually face the 1.3% risk of losing the batch outright.

368
00:31:22,800 --> 00:31:27,280
But it at least gets you thinking in this ballpark.

369
00:31:27,280 --> 00:31:35,680
So this is sort of my main insight for today that, as I said, this may be apparent to chemists

370
00:31:35,680 --> 00:31:39,680
in the laboratory but may not be obvious to other people.

371
00:31:39,680 --> 00:31:48,720
Piperonyl butoxide is detected at a higher rate in hydrocarbon concentrates than in flower

372
00:31:48,720 --> 00:31:49,720
lots.

373
00:31:49,720 --> 00:31:52,640
Well, why is this?

374
00:31:52,640 --> 00:32:03,640
One thing that's happening is, well, in hydrocarbon concentrates, yes, you are concentrating THC

375
00:32:03,640 --> 00:32:05,840
and other cannabinoids.

376
00:32:05,840 --> 00:32:15,920
Well, if you look at, and this is why the chemistry is mighty, mightily important.

377
00:32:15,920 --> 00:32:26,400
Look at this, the boiling point of piperonyl butoxide is 356 degrees Fahrenheit.

378
00:32:26,400 --> 00:32:34,320
What is the boiling point of tetrahydrocannabinolic acid?

379
00:32:34,320 --> 00:32:37,640
I want to say we can find the boiling point, 180 degrees C.

380
00:32:37,640 --> 00:32:39,640
Okay, yes.

381
00:32:39,640 --> 00:32:41,320
So check this out.

382
00:32:41,320 --> 00:32:50,520
So THC boils at 155 degrees Celsius.

383
00:32:50,520 --> 00:32:54,960
So that's the point where THC turns into vapor.

384
00:32:54,960 --> 00:33:05,360
Well, if you're trying to keep THC as a liquid, a hydrocarbon concentrate, then you'd want

385
00:33:05,360 --> 00:33:12,080
to keep the temperature less than that when you're going through extraction.

386
00:33:12,080 --> 00:33:23,200
Well, if the temperature is less than 160 degrees Celsius at all times, piperonyl butoxide

387
00:33:23,200 --> 00:33:26,760
does not boil off.
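To make the arithmetic concrete, here is a small sketch that converts the 356 degrees Fahrenheit figure to Celsius and lists which compounds sit above a given process temperature. The boiling points are the ones quoted in this talk; the function names are hypothetical, and the comparison is a deliberate simplification that ignores pressure, which also affects boiling points.

```python
def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


# Boiling points as quoted in the talk, in degrees Celsius.
BOILING_POINTS_C = {
    "THC": 155.0,
    "piperonyl butoxide": fahrenheit_to_celsius(356.0),
}


def remaining_compounds(process_temp_c: float, boiling_points: dict) -> list:
    """Names of compounds whose boiling point is above the process temperature."""
    return [name for name, bp in boiling_points.items() if bp > process_temp_c]


print(BOILING_POINTS_C["piperonyl butoxide"])
print(remaining_compounds(150.0, BOILING_POINTS_C))
```

Under these assumptions, anything held below the THC boiling point is also below the piperonyl butoxide boiling point, so both stay in the product.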

388
00:33:26,760 --> 00:33:35,000
So that means in this process, the idea is other compounds are being boiled off.

389
00:33:35,000 --> 00:33:44,440
So compounds you don't want, let's say you raise the temperature to 150 degrees Celsius.

390
00:33:44,440 --> 00:33:50,160
Well, a lot of the plant matter, I think, is going to start boiling off.

391
00:33:50,160 --> 00:33:52,920
You're going to have other solvents boil off.

392
00:33:52,920 --> 00:33:54,240
Well, guess what?

393
00:33:54,240 --> 00:33:57,760
Your piperonyl butoxide will remain.

394
00:33:57,760 --> 00:34:07,040
And so what ends up happening is not only are you concentrating THC into hydrocarbon

395
00:34:07,040 --> 00:34:17,400
concentrates, you're also concentrating any high boiling point pesticides that may be

396
00:34:17,400 --> 00:34:18,400
in the product.

397
00:34:18,400 --> 00:34:25,760
So this may be a reason that concentrates have a higher detection rate.

398
00:34:25,760 --> 00:34:34,840
Another reason is perhaps flower that is more contaminated is more likely to be processed

399
00:34:34,840 --> 00:34:42,240
into concentrates than flower that's not.

400
00:34:42,240 --> 00:34:44,120
There's a lot going on here.

401
00:34:44,120 --> 00:34:52,520
However, I just wanted to point this out that if you are, say, a processor, then you may

402
00:34:52,520 --> 00:35:02,240
want to be a bit more concerned about piperonyl butoxide detection than a cultivator.

403
00:35:02,240 --> 00:35:06,600
And just keep this in mind as you're factoring out your costs.

404
00:35:06,600 --> 00:35:13,800
As we pointed out, it's not the greatest cost in the world, but this is getting up there.

405
00:35:13,800 --> 00:35:21,320
This is now around, let's say you have to destroy your batch if it's detected, because

406
00:35:21,320 --> 00:35:23,440
maybe that's just your internal policy.

407
00:35:23,440 --> 00:35:27,600
Well, now you're facing that around 2% of the time.

408
00:35:27,600 --> 00:35:33,640
That's going to pose a non-negligible cost at that stage.

409
00:35:33,640 --> 00:35:36,680
And so just something to think about.

410
00:35:36,680 --> 00:35:42,000
And then basically, what's my big takeaway from this?

411
00:35:42,000 --> 00:35:49,360
This was really just an exercise, because as I said, you can reach out to MCR labs or

412
00:35:49,360 --> 00:35:55,960
your favorite chemists in the cannabis space, and chances are they'll be able to tell you,

413
00:35:55,960 --> 00:35:58,200
oh, yes, hydrocarbon concentrates.

414
00:35:58,200 --> 00:36:02,600
Yes, we detect pesticides at higher rates than in flower.

415
00:36:02,600 --> 00:36:04,760
Yes, piperonyl butoxide.

416
00:36:04,760 --> 00:36:10,880
It may or may not be the most detected in other states, but chances are they may have

417
00:36:10,880 --> 00:36:11,880
seen that.

418
00:36:11,880 --> 00:36:19,080
So this may not be anything new under the sun, but this is a new tool that you can use.

419
00:36:19,080 --> 00:36:28,320
It's called the Pesticide Study for predicting the probability of pesticide failure.

420
00:36:28,320 --> 00:36:33,200
And that is exactly what I've been wanting to study for years.

421
00:36:33,200 --> 00:36:42,600
So probably since about 2018 or so, the same time this study came out, and it may not be

422
00:36:42,600 --> 00:36:51,400
2018, I've been definitely interested in what pesticides are being used in cannabis,

423
00:36:51,400 --> 00:36:59,120
what's the detection rate, what are the amounts being detected, and what are the factors that

424
00:36:59,120 --> 00:37:02,640
are causing pesticide contamination.

425
00:37:02,640 --> 00:37:10,120
And a hypothesis of mine that I've wanted to explore since 2018, but we just haven't

426
00:37:10,120 --> 00:37:22,520
had the data, is proximity to say agriculture or industry a potential factor in failing

427
00:37:22,520 --> 00:37:25,760
for pesticide testing?

428
00:37:25,760 --> 00:37:32,040
We have heard about pesticide drift in the past, and something that I'm interested in

429
00:37:32,040 --> 00:37:39,720
is let's say, is your risk of failing pesticides location dependent?

430
00:37:39,720 --> 00:37:46,000
So depending on where cultivators are in the state, are some of them at higher risk than

431
00:37:46,000 --> 00:37:50,720
others for pesticide screening.

432
00:37:50,720 --> 00:37:58,860
And so some of the conditions I wanted to look at were say, proximity to a dairy farm,

433
00:37:58,860 --> 00:38:01,120
or various types of farms.

434
00:38:01,120 --> 00:38:08,080
So people farm cherries and apples and all sorts of different commodities in Washington

435
00:38:08,080 --> 00:38:09,080
State.

436
00:38:09,080 --> 00:38:13,840
And what you could do is you could calculate those data points.

437
00:38:13,840 --> 00:38:19,600
So we know where the cultivators are with latitude and longitude, and then you could

438
00:38:19,600 --> 00:38:27,640
calculate the distance to say, the nearest dairy farm, or the nearest road, the nearest

439
00:38:27,640 --> 00:38:35,480
metropolitan area, and just try to look at all the different geographical factors that

440
00:38:35,480 --> 00:38:37,120
may be at play.
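One way this distance calculation could look, as a minimal sketch: the haversine formula gives the great-circle distance between two latitude and longitude points, and the nearest farm is just the minimum over a list. The coordinates and function names below are hypothetical, and a spherical Earth with a 6371 km radius is assumed.

```python
import math

EARTH_RADIUS_KM = 6371.0  # Mean Earth radius; a spherical approximation.


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_distance_km(site: tuple, places: list) -> float:
    """Distance from a site to the closest place in a list of (lat, lon) pairs."""
    return min(haversine_km(site[0], site[1], lat, lon) for lat, lon in places)


# Hypothetical coordinates, for illustration only.
cultivator = (47.0, -120.0)
dairy_farms = [(47.5, -120.0), (46.0, -119.0), (48.0, -122.0)]

print(round(nearest_distance_km(cultivator, dairy_farms), 1))
```

The resulting distance could then be added as one more explanatory variable alongside sample type.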

441
00:38:37,120 --> 00:38:41,020
And much of that may be insignificant.

442
00:38:41,020 --> 00:38:46,880
There may be no significant relationship on failure rate.

443
00:38:46,880 --> 00:38:51,680
But as I said, we're looking for something obvious here.

444
00:38:51,680 --> 00:38:55,560
And this piggybacks on the lesson of last week.

445
00:38:55,560 --> 00:39:02,920
When you're working with statistics, differences should hopefully be readily apparent pretty

446
00:39:02,920 --> 00:39:05,040
quickly out of the gate.

447
00:39:05,040 --> 00:39:11,920
And so here, pretty quickly out of the gate, we see, OK, hydrocarbon concentrates, flower

448
00:39:11,920 --> 00:39:12,920
lots.

449
00:39:12,920 --> 00:39:18,360
It looks like there's something structurally different about pesticide detection in these

450
00:39:18,360 --> 00:39:19,360
two.

451
00:39:19,360 --> 00:39:28,520
Well, now is the fun, because now we get to hunt for other explanatory variables that

452
00:39:28,520 --> 00:39:30,440
may be significant.

453
00:39:30,440 --> 00:39:37,120
And this is how you could potentially get a paper published, which would be quite an

454
00:39:37,120 --> 00:39:39,840
interesting feat.

455
00:39:39,840 --> 00:39:45,640
So I think last year, there were thousands of papers written on cannabis.

456
00:39:45,640 --> 00:39:53,400
And that shows you there's, one, a lot of interest and demand for this type of study.

457
00:39:53,400 --> 00:39:58,780
So perhaps if you think of a novel topic, then this is something that you could write

458
00:39:58,780 --> 00:40:01,600
about and that people could study.

459
00:40:01,600 --> 00:40:07,520
Or if you don't want to research, maybe you can think of a business model or a business

460
00:40:07,520 --> 00:40:11,440
endeavor to profit off of this information.

461
00:40:11,440 --> 00:40:20,320
So say you do figure out some structural risk for failing for pesticides.

462
00:40:20,320 --> 00:40:25,560
Well, maybe you could reach out to all the cultivators who are at high risk and just

463
00:40:25,560 --> 00:40:26,560
let them know.

464
00:40:26,560 --> 00:40:32,400
So you say, oh, maybe being close to a dairy farm puts you at high risk of failing for

465
00:40:32,400 --> 00:40:33,400
pesticide X.

466
00:40:33,400 --> 00:40:37,920
Well, you could reach out to all the cultivators near the dairy farm and just let them know,

467
00:40:37,920 --> 00:40:42,520
like, hey, you may want to be on the lookout for this pesticide.

468
00:40:42,520 --> 00:40:48,520
That's maybe not the best business endeavor ever, but I'll let you brainstorm and think

469
00:40:48,520 --> 00:40:49,520
of them.

470
00:40:49,520 --> 00:40:56,360
But long story short, that's what I have been wanting to study, as I said, for almost four

471
00:40:56,360 --> 00:40:58,000
years now.

472
00:40:58,000 --> 00:41:09,560
What are the geographic factors that may affect pesticide detection or failing for pesticides?

473
00:41:09,560 --> 00:41:10,960
There may be none, right?

474
00:41:10,960 --> 00:41:14,160
That's really a good null hypothesis, right?

475
00:41:14,160 --> 00:41:23,480
So the null hypothesis would be no geographic forces matter, right?

476
00:41:23,480 --> 00:41:28,520
Being close to a dairy farm doesn't matter, being close to a metropolitan area doesn't

477
00:41:28,520 --> 00:41:29,520
matter.

478
00:41:29,520 --> 00:41:35,940
So we would start with the null hypothesis of no effect and then see if we can't find

479
00:41:35,940 --> 00:41:42,440
evidence that any of these places may have a statistical effect on, say, failing for

480
00:41:42,440 --> 00:41:43,440
pesticides.
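As a simple illustration of testing a null hypothesis of no effect, here is a sketch of a two-proportion z-test comparing failure rates for cultivators near and far from some feature. This is a stand-in for the idea, not the probit approach used in this session, and the counts are hypothetical.

```python
import math


def two_proportion_z_test(fail_a: int, n_a: int, fail_b: int, n_b: int):
    """Two-sided z-test of H0: both groups share the same failure rate."""
    p_a = fail_a / n_a
    p_b = fail_b / n_b
    pooled = (fail_a + fail_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("No variation in outcomes; the test is undefined.")
    z = (p_a - p_b) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: failures out of samples, near vs. far from a dairy farm.
z, p_value = two_proportion_z_test(30, 1000, 20, 1000)
print(f"z = {z:.3f}, p-value = {p_value:.3f}")
```

A large p-value means the data give little reason to reject the null of no effect; a small one suggests the location may matter and is worth a closer look.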

481
00:41:43,440 --> 00:41:48,000
So hopefully you found that interesting.

482
00:41:48,000 --> 00:41:54,080
I'm not sure if this is of widespread interest to people, but this was something that has

483
00:41:54,080 --> 00:41:58,920
been on my mind for a handful of years.

484
00:41:58,920 --> 00:42:04,220
And now we finally have the data that we can actually answer this question.

485
00:42:04,220 --> 00:42:06,360
So here's a start.

486
00:42:06,360 --> 00:42:11,000
We looked at one factor sample type.

487
00:42:11,000 --> 00:42:14,200
Well, now let's expand this, right?

488
00:42:14,200 --> 00:42:16,320
And that's the nice thing about the probit model.

489
00:42:16,320 --> 00:42:19,440
We can just pack in explanatory variables.

490
00:42:19,440 --> 00:42:21,920
So now is the fun time, right?

491
00:42:21,920 --> 00:42:30,440
And so this is, as I said, this is how statisticians have a fun conversation at lunch.

492
00:42:30,440 --> 00:42:38,600
You just start talking about, oh, you know, what explanatory variables may matter and

493
00:42:38,600 --> 00:42:40,120
do we have data on those?

494
00:42:40,120 --> 00:42:44,800
And if we don't have data on those, how can we proxy those or how can we measure that

495
00:42:44,800 --> 00:42:48,240
or how can we measure this?

496
00:42:48,240 --> 00:42:52,720
So now that you've got a good research question, you can start to brainstorm all the factors

497
00:42:52,720 --> 00:42:54,760
that are important.

498
00:42:54,760 --> 00:42:57,080
Go ahead and conclude it there.

499
00:42:57,080 --> 00:43:03,480
I feel like I'm rambling for the amount of material that we've covered.

500
00:43:03,480 --> 00:43:09,560
The goal, or not the goal, the insight of the day that I thought was fruitful.

501
00:43:09,560 --> 00:43:15,960
So last week's insight, I think, was helpful, but maybe not the most motivational ever.

502
00:43:15,960 --> 00:43:20,840
But as far as statistics goes, it was important.

503
00:43:20,840 --> 00:43:28,320
Well, this insight of the day is super motivational, and that is persist.

504
00:43:28,320 --> 00:43:29,320
Just keep trying.

505
00:43:29,320 --> 00:43:36,080
And it's going to be surprising how long some things take before they pay off.

506
00:43:36,080 --> 00:43:37,800
And what am I talking about?

507
00:43:37,800 --> 00:43:41,360
Well, I want to share with you some big news.

508
00:43:41,360 --> 00:43:45,000
So as you know, Cannlytics writes open source software.

509
00:43:45,000 --> 00:43:53,840
One of the tools that we've built was an open source software development kit for integrating

510
00:43:53,840 --> 00:43:57,360
with the Metrc API.

511
00:43:57,360 --> 00:44:04,840
And Metrc is the cannabis traceability software that is used in the vast majority of states

512
00:44:04,840 --> 00:44:07,520
that have legalized cannabis.

513
00:44:07,520 --> 00:44:18,280
And a long time ago, many moons ago, in 2021, so this was in March of 2021, I wrote this

514
00:44:18,280 --> 00:44:27,280
Metrc software development kit and got Cannlytics verified in a handful of states to interface

515
00:44:27,280 --> 00:44:28,840
with Metrc.

516
00:44:28,840 --> 00:44:35,280
Well, it takes a while for things to proliferate on the internet.

517
00:44:35,280 --> 00:44:40,760
And lately, the Metrc software development kit has been getting a lot of interest.

518
00:44:40,760 --> 00:44:48,640
And finally, Cannlytics has contracted to integrate a company with the Metrc API.

519
00:44:48,640 --> 00:44:53,280
And this is really exciting, big news.

520
00:44:53,280 --> 00:44:57,240
One, just personally, it's a new contract.

521
00:44:57,240 --> 00:45:05,200
But two, this is a demonstration of this open source software being used in the wild.

522
00:45:05,200 --> 00:45:13,800
And I've teased a lot that, yes, I think this is software that you can use to make money.

523
00:45:13,800 --> 00:45:16,600
And I think I'm finally demonstrating that.

524
00:45:16,600 --> 00:45:23,800
So the contract is specifically just to help the person use this software.

525
00:45:23,800 --> 00:45:31,320
And this is one thing I learned from someone that if you show someone how to do something

526
00:45:31,320 --> 00:45:36,160
difficult, then often they'll pay you to do it.

527
00:45:36,160 --> 00:45:43,240
And so that's one of the reasons that we put out this software is we think this is a useful

528
00:45:43,240 --> 00:45:44,440
tool.

529
00:45:44,440 --> 00:45:46,720
We're happy to show people how to do it.

530
00:45:46,720 --> 00:45:51,600
And then if they need help using it, we're more than happy to help.

531
00:45:51,600 --> 00:45:57,960
So this was sort of big news I wanted to share with you and just let you know that, hey,

532
00:45:57,960 --> 00:46:06,000
this Metrc software development kit is actually quite useful for making money.

533
00:46:06,000 --> 00:46:10,040
Hopefully this company gets value out of it.

534
00:46:10,040 --> 00:46:17,200
And as I was talking about with the Probit model earlier, the more and more people that

535
00:46:17,200 --> 00:46:20,400
use it, the more robust it becomes.

536
00:46:20,400 --> 00:46:27,440
So I would actually benefit if you use the open source software to make money, because

537
00:46:27,440 --> 00:46:33,480
hopefully if you use it and other companies use it, hopefully the software becomes more

538
00:46:33,480 --> 00:46:35,120
and more robust.

539
00:46:35,120 --> 00:46:43,080
Maybe you find something that needs to be ironed out and you can or cannot share if

540
00:46:43,080 --> 00:46:44,080
you wish.

541
00:46:44,080 --> 00:46:49,020
But if you decide to share, then hopefully I can improve the system.

542
00:46:49,020 --> 00:46:50,800
Hopefully that'll make your life better.

543
00:46:50,800 --> 00:46:52,520
Hopefully it'll make my life better.

544
00:46:52,520 --> 00:46:55,200
And we can just keep going back and forth.

545
00:46:55,200 --> 00:47:02,640
Just incremental progress, win-win, and value for value.

546
00:47:02,640 --> 00:47:06,280
It's taken a long time to get to this stage.

547
00:47:06,280 --> 00:47:10,600
And I want to thank you all for your support to get here.

548
00:47:10,600 --> 00:47:11,960
Every dollar mattered.

549
00:47:11,960 --> 00:47:17,920
I couldn't be here without literally each and every penny.

550
00:47:17,920 --> 00:47:22,560
Each and every penny got put to unbelievably good use.

551
00:47:22,560 --> 00:47:24,600
I want to thank you.

552
00:47:24,600 --> 00:47:31,120
Thank you for your eyes, your ears, your eyeballs contributing to Cannlytics and the Cannabis Data

553
00:47:31,120 --> 00:47:33,000
Science Meetup Group.

554
00:47:33,000 --> 00:47:39,560
Because of you, we've covered an enormous amount of ground and I think are really starting

555
00:47:39,560 --> 00:47:46,600
to reap the benefits of the open source software and really start to provide people with value.

556
00:47:46,600 --> 00:47:53,760
So just want to give you that little bit of motivation that if you don't at first succeed,

557
00:47:53,760 --> 00:47:55,720
try, try, and try again.

558
00:47:55,720 --> 00:48:00,880
Also, I know it was me standing on the pedestal for most of the time today.

559
00:48:00,880 --> 00:48:04,000
So hopefully you at least found it interesting.

560
00:48:04,000 --> 00:48:10,480
Just in the last couple of minutes, do you have any thoughts, comments, questions, Candice,

561
00:48:10,480 --> 00:48:16,960
before we call it a start of the day?

562
00:48:16,960 --> 00:48:18,680
That is just fabulous.

563
00:48:18,680 --> 00:48:21,120
Fabulous presentation and coding.

564
00:48:21,120 --> 00:48:28,440
And that's interesting too, Cannlytics is interfacing with the Metrc API.

565
00:48:28,440 --> 00:48:36,000
And Massachusetts uses Metrc, and also, I do believe that MCR Labs, Yasha and

566
00:48:36,000 --> 00:48:45,880
Isaac will be very interested in really how you nail down that concentrates can also have

567
00:48:45,880 --> 00:48:47,920
concentrates of pesticides.

568
00:48:47,920 --> 00:48:48,920
That's concerning.

569
00:48:48,920 --> 00:48:56,320
As someone that did a lot of RSO in the beginning, I think that might fall under that product

570
00:48:56,320 --> 00:48:58,280
type category.

571
00:48:58,280 --> 00:49:05,640
And those are patients that are doing that RSO and we don't want high pesticides.

572
00:49:05,640 --> 00:49:12,380
So amazing Keegan, you're always uncovering just good stuff.

573
00:49:12,380 --> 00:49:14,200
So thank you for everything.

574
00:49:14,200 --> 00:49:20,560
I love your enthusiasm, Candice, and just to continue piggybacking on that, yes, Cannlytics

575
00:49:20,560 --> 00:49:24,600
actually is verified with Metrc in Massachusetts.

576
00:49:24,600 --> 00:49:29,760
So that demonstrates that the software can be used in Massachusetts.

577
00:49:29,760 --> 00:49:37,640
And as you pointed out, this analysis hopefully will be quite valuable to people because personally,

578
00:49:37,640 --> 00:49:45,960
I just try to limit my consumption of foreign chemicals that I don't understand.

579
00:49:45,960 --> 00:49:54,040
So for me, everything else held constant, I would actually prefer the flower with no

580
00:49:54,040 --> 00:50:02,720
pesticides detected versus one with piperonyl butoxide detected, even if it is below the

581
00:50:02,720 --> 00:50:03,960
limit for the state.

582
00:50:03,960 --> 00:50:09,320
And so this is why I argue that if people work to make their lab results accessible

583
00:50:09,320 --> 00:50:14,080
to people, then this could actually be a selling point for you.

584
00:50:14,080 --> 00:50:20,800
People are always saying, what can we compete on besides THC concentration?

585
00:50:20,800 --> 00:50:23,560
Well, guess what?

586
00:50:23,560 --> 00:50:31,320
If there were two products, one was 18% THC, the other was 20% THC, and one clearly had

587
00:50:31,320 --> 00:50:39,040
on the label no pesticides detected, and the other one, piperonyl butoxide, was detected,

588
00:50:39,040 --> 00:50:43,360
it would be a no-brainer for me which one I would choose.

589
00:50:43,360 --> 00:50:45,560
This is a personal preference.

590
00:50:45,560 --> 00:50:48,240
Everybody has different preferences.

591
00:50:48,240 --> 00:50:54,800
If you're looking for a factor to compete on, then as a cultivator, you may want to really

592
00:50:54,800 --> 00:51:01,160
stress the fact if you don't have any pesticides detected in the flower that, yes, there's

593
00:51:01,160 --> 00:51:08,840
no pesticides detected in this sample or in this flower, and other cultivators may not

594
00:51:08,840 --> 00:51:15,720
necessarily be able to make that claim because as we were showing today, there's a non-negligible,

595
00:51:15,720 --> 00:51:17,560
and that was just one pesticide.

596
00:51:17,560 --> 00:51:19,240
There's other pesticides too.

597
00:51:19,240 --> 00:51:26,600
So a non-negligible amount are getting detections even if they may be below the limit.

598
00:51:26,600 --> 00:51:32,880
So say you're a concentrate provider, this could be a really big selling point for you.

599
00:51:32,880 --> 00:51:38,400
Well, also too, maybe you want to look for pesticides with a lower boiling point than

600
00:51:38,400 --> 00:51:40,000
THC, right?

601
00:51:40,000 --> 00:51:44,760
And just to get it chemically out of the product, right?

602
00:51:44,760 --> 00:51:51,440
Even though people might still want full disclosure that maybe no pesticides are used, but never

603
00:51:51,440 --> 00:51:52,440
mind.

604
00:51:52,440 --> 00:51:57,920
Well, you're thinking like a chemist, and one would hope that the processors, and I've

605
00:51:57,920 --> 00:52:01,760
met some really, really, really smart processors.

606
00:52:01,760 --> 00:52:05,440
I imagine a lot of them are taking this into consideration.

607
00:52:05,440 --> 00:52:13,280
However, we've talked in the past about asymmetric information exists, and we can't pretend it

608
00:52:13,280 --> 00:52:14,280
doesn't.

609
00:52:14,280 --> 00:52:21,240
So there may just be processors who aren't aware that boiling points of different pesticides

610
00:52:21,240 --> 00:52:22,560
matter.

611
00:52:22,560 --> 00:52:26,640
So one thing is just helping people access the information.

612
00:52:26,640 --> 00:52:31,000
So just do they even have the information to start with?

613
00:52:31,000 --> 00:52:34,040
That's a good piece.

614
00:52:34,040 --> 00:52:36,200
And then actionable insights.

615
00:52:36,200 --> 00:52:39,100
So we're working on those.

616
00:52:39,100 --> 00:52:44,120
We don't have too much that you can act on yet except taking into consideration the risk

617
00:52:44,120 --> 00:52:46,160
of flower versus concentrate.

618
00:52:46,160 --> 00:52:52,200
But I think there's a lot of doors that have been opened simply by creating this data.

619
00:52:52,200 --> 00:52:58,240
So now you can take this data, put it in a database, and do a quick query.
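A minimal sketch of what such a query might look like, using Python's built-in `sqlite3` module with an in-memory database. The table name, column names, and rows are all hypothetical; the curated lab results would have their own schema.

```python
import sqlite3

# In-memory database with a hypothetical lab results table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lab_results ("
    "product TEXT, analyte TEXT, analyte_type TEXT, detected INTEGER)"
)
rows = [
    ("Concentrate A", "piperonyl butoxide", "pesticide", 1),
    ("Concentrate A", "butane", "residual_solvent", 1),
    ("Concentrate A", "myclobutanil", "pesticide", 0),
    ("Flower B", "piperonyl butoxide", "pesticide", 0),
]
conn.executemany("INSERT INTO lab_results VALUES (?, ?, ?, ?)", rows)


def detections_for(connection: sqlite3.Connection, product: str) -> list:
    """Return (analyte, analyte_type) pairs that were detected in a product."""
    cursor = connection.execute(
        "SELECT analyte, analyte_type FROM lab_results "
        "WHERE product = ? AND detected = 1 ORDER BY analyte",
        (product,),
    )
    return cursor.fetchall()


print(detections_for(conn, "Concentrate A"))
print(detections_for(conn, "Flower B"))
```

An empty result for a product is what would back up a "no pesticides detected" claim on a label.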

620
00:52:58,240 --> 00:53:05,120
So now all of a sudden, if you see a product, you can now know if there were any pesticides

621
00:53:05,120 --> 00:53:07,780
or say residual solvents detected.

622
00:53:07,780 --> 00:53:13,440
So I think this could be something that could help out consumers, could help out regulators,

623
00:53:13,440 --> 00:53:19,520
help out the labs, help out the cultivators and processors themselves, could even potentially

624
00:53:19,520 --> 00:53:21,120
help out retailers.

625
00:53:21,120 --> 00:53:24,160
Maybe retailers don't even know to look for this.

626
00:53:24,160 --> 00:53:27,180
So that's what we're all about.

627
00:53:27,180 --> 00:53:34,240
And that's what we love to do is find useful cannabis analytics that can help as many people

628
00:53:34,240 --> 00:53:39,040
as possible in the cannabis industry or the cannabis space.

629
00:53:39,040 --> 00:53:45,680
And as I said, this was a small start, but we're all about one molecule at a time.

630
00:53:45,680 --> 00:53:52,120
So literally today, we looked at yet another molecule, piperonyl butoxide.

631
00:53:52,120 --> 00:53:57,280
And I think we moved the ball forward, even if it was just one molecule.

632
00:53:57,280 --> 00:53:59,120
So I'm happy with today.

633
00:53:59,120 --> 00:54:00,120
So thank you.

634
00:54:00,120 --> 00:54:01,120
Thank you.

635
00:54:01,120 --> 00:54:03,440
Thank you for coming.

636
00:54:03,440 --> 00:54:10,480
Keep advancing cannabis data science.

