1
00:00:00,000 --> 00:00:07,840
Welcome to the cannabis data science meetup group.

2
00:00:07,840 --> 00:00:09,920
Couldn't be happier to have you here.

3
00:00:09,920 --> 00:00:18,480
It's your ideas, your attention, you know, your thought, your ears, everything that you

4
00:00:18,480 --> 00:00:22,520
bring to the table is what's moving the ball forward.

5
00:00:22,520 --> 00:00:27,440
And we've covered a lot of ground and there's still more ground to cover.

6
00:00:27,440 --> 00:00:31,840
So I pointed you in the direction of a few lab result data sets.

7
00:00:31,840 --> 00:00:36,340
They need to be standardized a bit more before we can use them.

8
00:00:36,340 --> 00:00:41,240
As we were mentioning last week, we're super interested in some of the terpene data, still

9
00:00:41,240 --> 00:00:44,880
a bit of work ahead of us standardizing that.

10
00:00:44,880 --> 00:00:51,200
So today, coincidentally, a different data set landed in my lap.

11
00:00:51,200 --> 00:00:59,040
This giant data set from Massachusetts of historical THC and microbe tests.

12
00:00:59,040 --> 00:01:05,880
So I just wanted to do our due diligence and give it a look.

13
00:01:05,880 --> 00:01:12,000
And then we can resume with this hunt for terpenes and land races and all of that fun

14
00:01:12,000 --> 00:01:14,000
stuff next week.

15
00:01:14,000 --> 00:01:23,240
So we'd love to hear about any of your adventures during the week before I start going off about

16
00:01:23,240 --> 00:01:25,720
some of this cool new data.

17
00:01:25,720 --> 00:01:27,520
It's a meetup after all.

18
00:01:27,520 --> 00:01:31,560
So I'll start with Candice.

19
00:01:31,560 --> 00:01:38,320
Would love to hear about, you know, what are you interested in or out of the Massachusetts

20
00:01:38,320 --> 00:01:45,200
data, you know, what are your big interests, what questions would you like to see addressed?

21
00:01:45,200 --> 00:01:47,760
Any thoughts that come to your mind?

22
00:01:47,760 --> 00:01:54,080
Well, I'm excited that you have all this new data from Massachusetts.

23
00:01:54,080 --> 00:02:01,040
And you know, I still, you know, search for the pesticide information, but this is great.

24
00:02:01,040 --> 00:02:04,960
I'm just so excited that you were able to get this data Keegan.

25
00:02:04,960 --> 00:02:05,960
Thanks.

26
00:02:05,960 --> 00:02:11,320
As we always say, some data is better than no data.

27
00:02:11,320 --> 00:02:17,680
So this is our first really big glance at the Massachusetts data.

28
00:02:17,680 --> 00:02:23,120
It puts some of our data in perspective because as we've mentioned in the past, we have a

29
00:02:23,120 --> 00:02:30,640
non-random sample of Massachusetts lab results, the public lab results we've collected from

30
00:02:30,640 --> 00:02:31,640
MCR labs.

31
00:02:31,640 --> 00:02:37,960
There's some rich terpene data there, but we didn't know, you know, what proportion

32
00:02:37,960 --> 00:02:41,200
of lab tests did that comprise?

33
00:02:41,200 --> 00:02:47,400
And it actually turns out not maybe the largest proportion.

34
00:02:47,400 --> 00:02:49,560
There's tons and tons of lab tests.

35
00:02:49,560 --> 00:02:53,000
So we'll actually get a metric of that today.

36
00:02:53,000 --> 00:03:00,600
So love your interest and there will be more lab results in the future, hopefully to suit

37
00:03:00,600 --> 00:03:06,560
your appetite, but today we'll get a taste.

38
00:03:06,560 --> 00:03:15,680
And please remind me your name in case it's not coming to mind, but Copyleft Cultivars.

39
00:03:15,680 --> 00:03:21,800
We'd love to hear about what you would like to see put on the table and you know, what

40
00:03:21,800 --> 00:03:23,520
are you interested in?

41
00:03:23,520 --> 00:03:29,080
What do you think the Cannabis Data Science Meetup Group could move forward on?

42
00:03:29,080 --> 00:03:30,080
Great.

43
00:03:30,080 --> 00:03:37,040
Well, my name is Caleb, Caleb DeLue, which I'm happy to remind as many times names are

44
00:03:37,040 --> 00:03:38,240
silly like that.

45
00:03:38,240 --> 00:03:45,680
This is my first time here at this meetup and again, thanks Keegan for inviting me.

46
00:03:45,680 --> 00:03:53,400
I founded, co-founded, and I'm a data scientist at Copyleft Cultivars, a nonprofit where we're

47
00:03:53,400 --> 00:03:57,840
working to protect and preserve vulnerable plants using copyleft, and cannabis is kind

48
00:03:57,840 --> 00:04:02,680
of our flagship one given the emerging market.

49
00:04:02,680 --> 00:04:10,360
And so yeah, I just am also personally really passionate about cannabis and medicinal plants

50
00:04:10,360 --> 00:04:17,160
in general, really agriculture in general, special focus towards cannabis and genetics

51
00:04:17,160 --> 00:04:19,440
as well, super interested in that.

52
00:04:19,440 --> 00:04:23,160
And so it's really great to be here and join in.

53
00:04:23,160 --> 00:04:31,920
As to the data set, I'm fascinated by seeing the recent movement of uncovering biases and

54
00:04:31,920 --> 00:04:38,920
this shifting benchmarking that's been happening with labs resulting in inflated numbers or

55
00:04:38,920 --> 00:04:46,040
false negatives for biologics or, you know, I could list various other things that the

56
00:04:46,040 --> 00:04:48,820
industry is honing in on right now.

57
00:04:48,820 --> 00:04:51,720
But I think it'll be really interesting to view it through that lens.

58
00:04:51,720 --> 00:04:59,040
And then also, I'm always thinking about how I can connect insights that I see in the cannabis

59
00:04:59,040 --> 00:05:06,980
data to our work at Copyleft Cultivars where we're making galaxies that are copyleft protected

60
00:05:06,980 --> 00:05:14,060
so can't be privatized or abused and which will hopefully allow people to compile trait

61
00:05:14,060 --> 00:05:22,560
and gene data and other data growing data as well in this large public repository where

62
00:05:22,560 --> 00:05:30,240
we can do like GWAS and other types of analysis and share that throughout the whole community

63
00:05:30,240 --> 00:05:31,400
in an open source way.

64
00:05:31,400 --> 00:05:37,080
So really keeping my mind open to the insights in that way as well, but really just excited

65
00:05:37,080 --> 00:05:38,080
to get into the data.

66
00:05:38,080 --> 00:05:42,080
And I love rolling through this kind of data and pulling out interesting insights.

67
00:05:42,080 --> 00:05:45,360
Thanks for being here.

68
00:05:45,360 --> 00:05:46,760
I absolutely love it.

69
00:05:46,760 --> 00:05:52,480
You'll have to, if you want, share some more of your insights and knowledge with us, especially

70
00:05:52,480 --> 00:06:00,220
as we start moving back into the realm of terpenes and working around with, you know,

71
00:06:00,220 --> 00:06:06,720
how people are classifying various varieties with strain names or perhaps chemotypes.

72
00:06:06,720 --> 00:06:10,880
That's something we've worked with in the past and definitely something that has a ton

73
00:06:10,880 --> 00:06:12,560
more work to do.

74
00:06:12,560 --> 00:06:16,240
So we'd love to have your input there.

75
00:06:16,240 --> 00:06:23,880
And then as far as the talk on the town, the way we see it is people go back and forth,

76
00:06:23,880 --> 00:06:26,160
but what can we bring to the table?

77
00:06:26,160 --> 00:06:28,920
We can just bring data and statistics.

78
00:06:28,920 --> 00:06:33,240
So that's really what we're going to do today is we're just going to supply you with just

79
00:06:33,240 --> 00:06:36,120
a ton of data and statistics.

80
00:06:36,120 --> 00:06:38,920
As always, make of it what you will.

81
00:06:38,920 --> 00:06:40,760
Take the data at face value.

82
00:06:40,760 --> 00:06:43,040
Take the statistics at face value.

83
00:06:43,040 --> 00:06:47,320
But at least you'll have some data points in the conversation.

84
00:06:47,320 --> 00:06:50,520
But it's relevant, right?

85
00:06:50,520 --> 00:06:52,400
This is what people are talking about.

86
00:06:52,400 --> 00:06:57,380
So now you'll actually have some statistics in your pocket and I'll even point out some

87
00:06:57,380 --> 00:07:05,960
places where those come up in conversation and now you'll have a nice science-backed answer

88
00:07:05,960 --> 00:07:08,320
or comment.

89
00:07:08,320 --> 00:07:11,480
But anywho, cool things coming.

90
00:07:11,480 --> 00:07:20,400
But Marianne, welcome back to the group.

91
00:07:20,400 --> 00:07:21,560
Always love having you here.

92
00:07:21,560 --> 00:07:35,600
You shed so much light on the actual ongoings at the, well, you were at the CCC and this

93
00:07:35,600 --> 00:07:37,160
is really big.

94
00:07:37,160 --> 00:07:41,680
This is the framework where all the data is being generated from.

95
00:07:41,680 --> 00:07:44,720
So couldn't be happier to have you here.

96
00:07:44,720 --> 00:07:46,600
Would love to hear your input.

97
00:07:46,600 --> 00:07:50,520
Any thoughts, ideas, comments you would like to put on the table?

98
00:07:50,520 --> 00:07:51,520
Sure.

99
00:07:51,520 --> 00:07:56,320
For those who don't know me, this is my first, not my first time here, but it's been a long

100
00:07:56,320 --> 00:07:59,040
time since I've attended one of these meetups.

101
00:07:59,040 --> 00:08:00,600
I'm Marianne Sarkis.

102
00:08:00,600 --> 00:08:05,120
I'm the director of data and analytics at the Cannabis Control Commission in the state

103
00:08:05,120 --> 00:08:06,640
of Massachusetts.

104
00:08:06,640 --> 00:08:10,320
So you are using one of our data sets.

105
00:08:10,320 --> 00:08:17,680
This is kind of interesting to be in this position and just have everybody participating

106
00:08:17,680 --> 00:08:20,120
in scrutinizing this data set.

107
00:08:20,120 --> 00:08:22,560
And I'm actually really excited.

108
00:08:22,560 --> 00:08:27,080
It's rare to have that kind of a situation and experience.

109
00:08:27,080 --> 00:08:31,440
So I'm just looking forward to the meeting.

110
00:08:31,440 --> 00:08:32,440
I love it.

111
00:08:32,440 --> 00:08:41,040
You all have to grit your teeth at some parts and we're not going to be too critical, but

112
00:08:41,040 --> 00:08:42,040
it's awesome, right?

113
00:08:42,040 --> 00:08:51,420
This is kind of like seeing your child start walking on its own.

114
00:08:51,420 --> 00:08:59,400
So it's really cool that the CCC released this data.

115
00:08:59,400 --> 00:09:06,480
And as always, like you said, just the more eyes on it, the better because there's so

116
00:09:06,480 --> 00:09:10,040
many ways to slice and dice the data.

117
00:09:10,040 --> 00:09:13,920
And there's only so much time in the day.

118
00:09:13,920 --> 00:09:20,480
So you know me, I just start with the very, very basic summary statistics and then just

119
00:09:20,480 --> 00:09:23,200
start adding condition upon condition.

120
00:09:23,200 --> 00:09:25,520
Just see how far we can take it.

121
00:09:25,520 --> 00:09:31,200
It's just a tiny bit of data today, but we can go pretty far with that.

122
00:09:31,200 --> 00:09:41,320
So I just wanted to share that with you today because it is real Metrc data and Metrc

123
00:09:41,320 --> 00:09:44,600
likes to store their data in certain formats.

124
00:09:44,600 --> 00:09:49,280
And I like to rearrange the data for analysis and I'll just kind of share with you some

125
00:09:49,280 --> 00:09:52,320
of my techniques and some of the figures you can create.

126
00:09:52,320 --> 00:09:59,640
So this is a starting point and then as always, I encourage you all to first, if you're interested,

127
00:09:59,640 --> 00:10:06,240
repeat my statistics, double check my work, and then to take it further, go above and

128
00:10:06,240 --> 00:10:11,960
beyond, add some conditions of your own, see where you can take it.

129
00:10:11,960 --> 00:10:12,960
Excellent.

130
00:10:12,960 --> 00:10:13,960
Excellent.

131
00:10:13,960 --> 00:10:14,960
Excellent.

132
00:10:14,960 --> 00:10:18,160
And then Stephanie, welcome to the group.

133
00:10:18,160 --> 00:10:19,420
Happy to have you here.

134
00:10:19,420 --> 00:10:24,080
If you have any thoughts, ideas, comments, questions, or anything at all that you'd like

135
00:10:24,080 --> 00:10:29,920
to put on the table to make sure that we're addressing what you think is important here

136
00:10:29,920 --> 00:10:35,320
in the cannabis space, we'd love to hear your input.

137
00:10:35,320 --> 00:10:38,280
Hi, I'm Stephanie Thomas.

138
00:10:38,280 --> 00:10:46,800
I am the data manager with the Cannabis Control Commission in the state of Massachusetts.

139
00:10:46,800 --> 00:10:51,760
This was not intentional, just so you know, this was not intentional that we would have

140
00:10:51,760 --> 00:10:55,600
both of us here just for this dataset.

141
00:10:55,600 --> 00:10:57,520
This is complete synchronicity.

142
00:10:57,520 --> 00:11:03,960
And fortunately, my schedule has cleared up because when I originally signed up for

143
00:11:03,960 --> 00:11:06,600
this group, I had a conflict.

144
00:11:06,600 --> 00:11:08,160
And so the other meeting has ended.

145
00:11:08,160 --> 00:11:12,800
So I'm very excited to be here and see what happens.

146
00:11:12,800 --> 00:11:18,280
And it's awesome because you're working with our data.

147
00:11:18,280 --> 00:11:20,280
It's such a coincidence.

148
00:11:20,280 --> 00:11:24,920
Well, we may as well start sooner rather than later.

149
00:11:24,920 --> 00:11:29,720
And basically, I'll just show you how we're going to look at the data.

150
00:11:29,720 --> 00:11:37,520
So this is basically, okay, what are the data points that we think are pertinent?

151
00:11:37,520 --> 00:11:41,160
How do we look at the data?

152
00:11:41,160 --> 00:11:44,280
So this couldn't be a better group.

153
00:11:44,280 --> 00:11:48,160
So let's go ahead and commence.

154
00:11:48,160 --> 00:11:54,920
So basically, there's not a lot of rambling today, just all data.

155
00:11:54,920 --> 00:12:05,960
I try to just focus more on the visualizations because I think that's what everybody can

156
00:12:05,960 --> 00:12:08,520
relate with and draw insights with.

157
00:12:08,520 --> 00:12:13,600
And I'll get the code posted to GitHub afterwards if you're interested.

158
00:12:13,600 --> 00:12:22,840
But this was data from a public records request that was shared with me.

159
00:12:22,840 --> 00:12:35,120
And if you just want to see the raw data here, it's just a big CSV.

160
00:12:35,120 --> 00:12:38,080
And we don't have many data points.

161
00:12:38,080 --> 00:12:45,360
So I'll have to refer you to the Metrc documentation.

162
00:12:45,360 --> 00:12:55,520
But in Metrc, they basically split the lab test up into the lab result, which is what

163
00:12:55,520 --> 00:13:03,040
you would think of as, like, and once again, I could be butchering this, but from my understanding,

164
00:13:03,040 --> 00:13:08,120
it's like the lab result is what you think of as the certificate of analysis that

165
00:13:08,120 --> 00:13:14,880
has the date tested and some of the meta details.

166
00:13:14,880 --> 00:13:21,760
And then these are then the lab tests.

167
00:13:21,760 --> 00:13:28,160
And you can think of this as every analyte that's being tested for.

168
00:13:28,160 --> 00:13:38,160
So this is delta 9 THC, and you'll see down here there's THCA.

169
00:13:38,160 --> 00:13:43,120
And just the way Metrc classifies this, this is just for raw plant material.

170
00:13:43,120 --> 00:13:46,760
So this is just flower.

171
00:13:46,760 --> 00:13:50,940
And then there's also microbes down here.

172
00:13:50,940 --> 00:13:53,880
So total yeast and mold.

173
00:13:53,880 --> 00:13:56,440
I just call them microbes.

174
00:13:56,440 --> 00:14:01,520
There are other microbes, but it's a short word.

175
00:14:01,520 --> 00:14:10,440
Okay, so we've got three different analytes that are being tested for.

176
00:14:10,440 --> 00:14:13,760
We've got the test result.

177
00:14:13,760 --> 00:14:17,440
There's an ID for the lab that tested it.

178
00:14:17,440 --> 00:14:25,620
One of the most important data points as we've pointed out is just when it was tested.

179
00:14:25,620 --> 00:14:34,160
So we've got, we have the time range starting at April of 2021.

180
00:14:34,160 --> 00:14:40,340
And we just have data going through the end of 2021.

181
00:14:40,340 --> 00:14:48,720
So as always, would be awesome to get a wider range of data.

182
00:14:48,720 --> 00:14:57,400
And as I kind of stressed in the past, any data is better than no data.

183
00:14:57,400 --> 00:15:01,240
You can't just be hypercritical.

184
00:15:01,240 --> 00:15:09,920
You can't just say, oh, I just want data to 2023 or bust.

185
00:15:09,920 --> 00:15:12,000
It's like, okay, we'll take it.

186
00:15:12,000 --> 00:15:15,400
It's got data till the end of 21.

187
00:15:15,400 --> 00:15:18,840
So that's better than nothing.

188
00:15:18,840 --> 00:15:21,720
And once again, we'd love more and more analytes.

189
00:15:21,720 --> 00:15:30,560
But of course, as we've noted time and time again, consumers are rather fixated on THC

190
00:15:30,560 --> 00:15:35,360
numbers and THC is what's regulated by the federal government.

191
00:15:35,360 --> 00:15:41,520
So out of all the data points, if we were going to only have one, and it's a pretty

192
00:15:41,520 --> 00:15:50,720
good one to have also, welcome to the group, Dave, just looking at Massachusetts cannabis

193
00:15:50,720 --> 00:15:51,720
test results.

194
00:15:51,720 --> 00:15:55,280
I'm just going to start calculating a bunch of statistics here.

195
00:15:55,280 --> 00:16:00,480
So just feel free to chime in at any time if you have any thoughts.

196
00:16:00,480 --> 00:16:08,280
But okay, so the first things first, right, we noticed, okay, just have to standardize

197
00:16:08,280 --> 00:16:12,280
the date, nothing fancy there.
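
A minimal sketch of that date cleanup in plain Python. The raw formats listed here are assumptions for illustration; the actual export may use a different one, and the code shown in the meetup may have done this another way.

```python
from datetime import datetime

# Hypothetical raw formats; adjust to whatever the CSV really contains.
KNOWN_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y-%m-%dT%H:%M:%S")


def standardize_date(raw: str) -> str:
    """Return the date as an ISO string (YYYY-MM-DD) or raise ValueError."""
    text = raw.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # try the next candidate format
    raise ValueError(f"Unrecognized date format: {raw!r}")


print(standardize_date("4/1/2021"))
print(standardize_date("2021-12-31T08:30:00"))
```

ISO strings are handy later on because they sort chronologically as plain text.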

198
00:16:12,280 --> 00:16:15,280
Okay.

199
00:16:15,280 --> 00:16:19,160
Okay.

200
00:16:19,160 --> 00:16:23,520
So first things first, and I don't know why I went in this order.

201
00:16:23,520 --> 00:16:28,800
So I may actually go in a different order than the way I coded it.

202
00:16:28,800 --> 00:16:35,760
But basically, first things first, the way I like to conceptualize this is more in like

203
00:16:35,760 --> 00:16:43,280
the COA format, right, where you just have a package label.

204
00:16:43,280 --> 00:16:50,000
One second, we may have another person joining us.

205
00:16:50,000 --> 00:16:59,200
So the way I like to think about this is a package gets sent in for testing, and then

206
00:16:59,200 --> 00:17:05,680
it gets tested for THC, THCA, microbes, so on and so forth.

207
00:17:05,680 --> 00:17:08,600
And that happens on a specific date.

208
00:17:08,600 --> 00:17:18,000
So basically, I wanted to just group everything by package.
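
One way to picture that regrouping, as a sketch only: the rows and column names below are made up, not the real Metrc field names. Each input row is one analyte; the output is one record per package and test date, which is closer to how a COA reads.

```python
from collections import defaultdict

# Invented rows in the long layout: one row per analyte tested.
rows = [
    {"package": "PKG-1", "date": "2021-04-05", "lab": "Lab P",
     "analyte": "delta_9_thc", "value": 0.6},
    {"package": "PKG-1", "date": "2021-04-05", "lab": "Lab P",
     "analyte": "thca", "value": 20.0},
    {"package": "PKG-1", "date": "2021-04-05", "lab": "Lab P",
     "analyte": "yeast_and_mold", "value": 500.0},
    {"package": "PKG-2", "date": "2021-04-06", "lab": "Lab Q",
     "analyte": "thca", "value": 18.0},
]


def to_wide(long_rows):
    """Collect analyte rows into one record per (package, date tested)."""
    wide = defaultdict(dict)
    for row in long_rows:
        key = (row["package"], row["date"])
        wide[key]["lab"] = row["lab"]
        wide[key][row["analyte"]] = row["value"]
    return dict(wide)


samples = to_wide(rows)
for key, record in samples.items():
    print(key, record)
```

Keying on the date as well as the package keeps a retest of the same package as a separate record instead of overwriting the first result.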

209
00:17:18,000 --> 00:17:21,540
And then basically ran into this one issue here.

210
00:17:21,540 --> 00:17:35,520
So I think actually, actually, let's go asynchronously here.

211
00:17:35,520 --> 00:17:41,800
Let's actually just start by looking at THC since that's what we've been looking at.

212
00:17:41,800 --> 00:17:46,320
And then I'll kind of circle back for this one little note here.

213
00:17:46,320 --> 00:17:56,480
So long story short, I'm just getting the very first result here.

214
00:17:56,480 --> 00:17:59,440
So this is going to come up later.

215
00:17:59,440 --> 00:18:08,040
So basically, there is the consideration that a package may get tested multiple times.

216
00:18:08,040 --> 00:18:12,880
And in fact, I was...

217
00:18:12,880 --> 00:18:20,720
And it's awesome that we have some people at the CCC here because I was digging through

218
00:18:20,720 --> 00:18:23,040
the sampling protocol.

219
00:18:23,040 --> 00:18:28,560
And there were some mentions in there about retesting.

220
00:18:28,560 --> 00:18:34,040
So I'll kind of get to that here in a bit.

221
00:18:34,040 --> 00:18:39,880
But just wanted to start off by saying that we'll have to take into consideration duplicates,

222
00:18:39,880 --> 00:18:42,080
so to speak.

223
00:18:42,080 --> 00:18:45,040
But I'm getting too bogged down with that.

224
00:18:45,040 --> 00:18:49,040
Let's look at a graph here.

225
00:18:49,040 --> 00:18:54,160
We need a histogram.

226
00:18:54,160 --> 00:18:59,040
Okay, so a question has come up.

227
00:18:59,040 --> 00:19:07,520
And for some reason, this isn't styled quite the way I was wanting it to be.

228
00:19:07,520 --> 00:19:10,960
Okay, I'll try to just power through this.

229
00:19:10,960 --> 00:19:13,360
Okay, there it is.

230
00:19:13,360 --> 00:19:15,480
There, it's a little bigger.

231
00:19:15,480 --> 00:19:21,000
So long story short, a question came up at a really cool conversation yesterday.

232
00:19:21,000 --> 00:19:30,440
So I think it's the Future Cannabis Project had a conversation the other day with Yasha

233
00:19:30,440 --> 00:19:37,480
Khan from MCR Labs and some of the other people who have attended the Cannabis Data Science

234
00:19:37,480 --> 00:19:42,760
Meetup Group in the past, Jamie Toth and Jeff Rawson.

235
00:19:42,760 --> 00:19:45,280
And so that was an awesome conversation.

236
00:19:45,280 --> 00:19:48,360
I'll have to point you in the direction of that.

237
00:19:48,360 --> 00:19:58,880
But a question came up that was basically, what would be an unreasonable amount of THC

238
00:19:58,880 --> 00:20:02,200
on a product label?

239
00:20:02,200 --> 00:20:10,040
It was kind of a fun question because it basically put everybody on the spot to basically give

240
00:20:10,040 --> 00:20:18,800
what we sometimes call the, if we're trying to sound sophisticated, the Bayesian prior,

241
00:20:18,800 --> 00:20:25,720
which is just what statistical nerds, that's a fancy way of just saying that's just what

242
00:20:25,720 --> 00:20:30,600
you personally believe a statistic would be.

243
00:20:30,600 --> 00:20:37,760
So they're basically just asking, what do you think would be the tippity top, like the

244
00:20:37,760 --> 00:20:43,680
99th percentile of THC?

245
00:20:43,680 --> 00:20:53,720
Well, now that we actually have that data from Massachusetts, we can actually calculate

246
00:20:53,720 --> 00:20:54,720
that.

247
00:20:54,720 --> 00:20:59,880
And once again, it matters what you're actually looking at here.

248
00:20:59,880 --> 00:21:10,160
And once again, consumers are often interested in just one number, so say total THC.

249
00:21:10,160 --> 00:21:17,280
And so remember, total THC is a function of THCA.

250
00:21:17,280 --> 00:21:26,120
It's delta 9 THC plus 0.877 times THCA.
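
That formula as a small function, for reference. The 0.877 factor is the one quoted above; it is roughly the ratio of the molar mass of THC to that of THCA, which accounts for the mass lost when THCA decarboxylates. The example values are invented.

```python
DECARB_FACTOR = 0.877  # conversion factor quoted in the talk


def total_thc(delta_9_thc: float, thca: float) -> float:
    """Total THC (%) = delta 9 THC (%) + 0.877 * THCA (%)."""
    return delta_9_thc + DECARB_FACTOR * thca


# Invented flower sample: very little delta 9, most of the THC is still THCA.
print(round(total_thc(0.5, 20.0), 2))
```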

251
00:21:26,120 --> 00:21:41,880
So long story short, then in 2021, the last three quarters of 2021, the

252
00:21:41,880 --> 00:21:49,160
average THC was around 17 and a half percent.

253
00:21:49,160 --> 00:22:01,160
Just exclude real quick anything that was zero, just in case the zeros are outliers.

254
00:22:01,160 --> 00:22:09,840
So if you exclude everything that was zero, then you're looking at about 18 and a half

255
00:22:09,840 --> 00:22:14,840
percent average total THC.

256
00:22:14,840 --> 00:22:21,960
And then if you were wanting to know what would be like sort of a reasonable upper bound

257
00:22:21,960 --> 00:22:23,880
to that?

258
00:22:23,880 --> 00:22:33,400
Well, from testing, you would see that, OK, the reasonable upper bound is around 28

259
00:22:33,400 --> 00:22:36,760
and a half percent.
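
A sketch of those three numbers (average, average without zeros, and an upper bound) using only the standard library. The values are invented, and treating the "reasonable upper bound" as the 99th percentile is an assumption based on how the question was framed earlier.

```python
from statistics import mean, quantiles

# Invented total THC percentages, including a zero and one very high value.
values = [0.0, 12.0, 15.5, 17.0, 18.5, 19.0, 21.0, 24.0, 28.5, 38.0]

# Drop zeros in case they are placeholders rather than real measurements.
nonzero = [v for v in values if v > 0]

avg_all = mean(values)
avg_nonzero = mean(nonzero)

# quantiles(n=100) returns 99 cut points; the last one is the 99th percentile.
p99 = quantiles(nonzero, n=100, method="inclusive")[-1]

print(avg_all, avg_nonzero, p99, max(values))
```

With real data the percentile is usually a more useful ceiling than the maximum, since a single miscoded row can set the max.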

260
00:22:36,760 --> 00:22:44,360
And then let's actually just real quick, let's just check out what the max is.

261
00:22:44,360 --> 00:22:50,280
So it looks like this must have just been some miscoded outlier.

262
00:22:50,280 --> 00:22:55,400
So I'm wondering what the max is, maybe less than 40.

263
00:22:55,400 --> 00:23:04,720
So you're seeing some that are really, really high, like this one is at 38%.

264
00:23:04,720 --> 00:23:11,440
But that's just quite an outlier.

265
00:23:11,440 --> 00:23:17,960
So long story short, Yasha and everybody had pretty good answers.

266
00:23:17,960 --> 00:23:26,600
I think Jeff mentioned that he had tested a legitimate sample, testing at around 31%.

267
00:23:26,600 --> 00:23:31,640
So you may see some outliers way up there.

268
00:23:31,640 --> 00:23:38,760
But Dr. Anna said that, oh, really anything above 25% or so is starting to get a little

269
00:23:38,760 --> 00:23:39,760
high.

270
00:23:39,760 --> 00:23:45,520
But there is maybe a thought comment question from somebody.

271
00:23:45,520 --> 00:23:48,680
I just hear a notification that somebody may have.

272
00:23:48,680 --> 00:23:51,000
Sorry, that was me again.

273
00:23:51,000 --> 00:23:56,800
I put in the link to the Future Cannabis Project, Not High Enough Reality versus Expectations

274
00:23:56,800 --> 00:24:00,200
YouTube link in the chat.

275
00:24:00,200 --> 00:24:01,200
Phenomenal.

276
00:24:01,200 --> 00:24:05,720
I meant to help you not interrupt you.

277
00:24:05,720 --> 00:24:06,720
It's super good.

278
00:24:06,720 --> 00:24:07,720
Super good.

279
00:24:07,720 --> 00:24:14,600
And once again, I'm not even trying to get too fixated on THC numbers.

280
00:24:14,600 --> 00:24:17,840
So I'm going to go ahead and start moving on here in a bit.

281
00:24:17,840 --> 00:24:27,520
But I just have my ear open for statistics that people are curious about.

282
00:24:27,520 --> 00:24:30,880
And that was a statistic.

283
00:24:30,880 --> 00:24:36,000
What percentage THC is unreasonable?

284
00:24:36,000 --> 00:24:46,800
And once again, if you're looking at THCA, you could measure 31% THCA.

285
00:24:46,800 --> 00:24:48,600
But once again, it depends.

286
00:24:48,600 --> 00:24:52,040
Are they talking about total THC?

287
00:24:52,040 --> 00:24:59,760
Are they talking about THCA, delta 9 THC, so on and so forth?

288
00:24:59,760 --> 00:25:05,960
Because as you notice, the delta 9 is actually really, really small in a flower.

289
00:25:05,960 --> 00:25:07,680
Small concentrations.

290
00:25:07,680 --> 00:25:11,960
OK, moving on.

291
00:25:11,960 --> 00:25:13,560
Just slightly though.

292
00:25:13,560 --> 00:25:25,680
So here's an interesting figure where once again, we wouldn't be doing our due diligence

293
00:25:25,680 --> 00:25:29,520
if we didn't calculate these statistics.

294
00:25:29,520 --> 00:25:32,000
Because this is just what everybody's talking about.

295
00:25:32,000 --> 00:25:37,240
They're saying, oh, there's been THC inflation.

296
00:25:37,240 --> 00:25:46,240
So the classic thing to look for there is just a trend in THC over time.

297
00:25:46,240 --> 00:25:50,840
And once again, we'd love to get this to the current day.

298
00:25:50,840 --> 00:26:05,800
And in 2021, we actually see the average, just on whole, drop from above 18% to around

299
00:26:05,800 --> 00:26:09,120
the 17, 17 and a half percent.
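
A rough sketch of how a monthly trend like that can be computed, assuming each sample has an ISO test date and a total THC value. The numbers are invented and only mimic the direction described.

```python
from collections import defaultdict
from statistics import mean

# (date tested, total THC %) pairs; invented values.
samples = [
    ("2021-04-12", 18.4),
    ("2021-04-20", 18.0),
    ("2021-12-03", 17.1),
    ("2021-12-15", 17.5),
]


def monthly_average(records):
    """Average total THC per calendar month, in chronological order."""
    by_month = defaultdict(list)
    for date_tested, total_thc in records:
        by_month[date_tested[:7]].append(total_thc)  # "YYYY-MM"
    return {month: mean(vals) for month, vals in sorted(by_month.items())}


trend = monthly_average(samples)
for month, avg in trend.items():
    print(month, round(avg, 2))
```

A drop between two monthly means says nothing by itself about significance; with nine months of data, seasonality and noise are both still on the table.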

300
00:26:09,120 --> 00:26:14,640
But once again, there could be no statistically significant change to this.

301
00:26:14,640 --> 00:26:21,640
And once again, with time series, you really, really want to see them play out a bit.

302
00:26:21,640 --> 00:26:25,160
Because this may not be a long enough time range.

303
00:26:25,160 --> 00:26:32,920
This could just all be, you know, there could be some sort of seasonality component in this.

304
00:26:32,920 --> 00:26:34,920
It's not impossible.

305
00:26:34,920 --> 00:26:39,320
But so there's that.

306
00:26:39,320 --> 00:26:54,920
And then once again, I'm almost getting almost bored at pointing out variances by lab.

307
00:26:54,920 --> 00:26:57,440
But it's something that comes up.

308
00:26:57,440 --> 00:27:00,600
And it is data that we have.

309
00:27:00,600 --> 00:27:12,160
But once again, it's pertinent just to see, OK, are there any variances in THC by lab?

310
00:27:12,160 --> 00:27:21,360
Once again, as we've noticed in the past, there is sort of a wide variance in how labs

311
00:27:21,360 --> 00:27:23,200
are measuring.
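
To make the lab comparison concrete, here is a sketch that averages total THC per lab per month and then measures how far apart the labs are in each month. Lab names and values are invented; the spread (highest lab mean minus lowest) is just one simple yardstick, not necessarily the one behind the figure.

```python
from collections import defaultdict
from statistics import mean

# (month, lab, total THC %) rows; invented values.
records = [
    ("2021-04", "Lab P", 20.5),
    ("2021-04", "Lab P", 21.5),
    ("2021-04", "Lab Q", 15.0),
    ("2021-12", "Lab P", 18.5),
    ("2021-12", "Lab Q", 17.5),
]


def lab_month_means(rows):
    """Mean total THC for each (month, lab) pair."""
    groups = defaultdict(list)
    for month, lab, total_thc in rows:
        groups[(month, lab)].append(total_thc)
    return {key: mean(vals) for key, vals in groups.items()}


def spread_by_month(means):
    """Gap between the highest and lowest lab mean in each month."""
    by_month = defaultdict(list)
    for (month, _lab), value in means.items():
        by_month[month].append(value)
    return {m: max(v) - min(v) for m, v in sorted(by_month.items())}


means = lab_month_means(records)
spread = spread_by_month(means)
print(spread)
```

If labs are converging on a common mean, the spread should shrink over time.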

312
00:27:23,200 --> 00:27:29,440
What you would like to see, and you kind of see it to a certain extent here, is you would

313
00:27:29,440 --> 00:27:35,960
like to see maybe a wide variance in how people are measuring.

314
00:27:35,960 --> 00:27:46,880
But over time, you would like to see them kind of hone in on what would be the true

315
00:27:46,880 --> 00:27:50,080
mean, so to speak.

316
00:27:50,080 --> 00:27:53,320
And so you kind of see that here to a certain degree.

317
00:27:53,320 --> 00:28:05,920
But you see some labs testing on average pretty high, north of 20% for these two labs.

318
00:28:05,920 --> 00:28:08,840
And then these ones are testing low.

319
00:28:08,840 --> 00:28:14,920
This one, I think that's Lab D. And see, Lab D may have exited.

320
00:28:14,920 --> 00:28:18,520
They may have been measuring too low.

321
00:28:18,520 --> 00:28:22,160
They're measuring below 16%.

322
00:28:22,160 --> 00:28:26,040
And once again, I'm having a little trouble following which line's which.

323
00:28:26,040 --> 00:28:29,880
So my apologies for that.

324
00:28:29,880 --> 00:28:38,960
So long story short, the labs may be kind of coming to a consensus around 18% or so.

325
00:28:38,960 --> 00:28:44,760
But I would just like to point out this kind of interesting dynamic, which is right here

326
00:28:44,760 --> 00:28:52,480
in October, you see Lab H enter the market.

327
00:28:52,480 --> 00:28:57,360
And they're testing at pretty high percentages.

328
00:28:57,360 --> 00:29:05,040
And curiously, Lab G looks like maybe their December number is not representative for

329
00:29:05,040 --> 00:29:06,460
some reason.

330
00:29:06,460 --> 00:29:13,600
So hard to say what happened with Lab G. And then I'm curious about sort of the future

331
00:29:13,600 --> 00:29:22,840
trajectory here, because for whatever reason, Lab H and Lab X are now testing a little higher

332
00:29:22,840 --> 00:29:25,640
than average.

333
00:29:25,640 --> 00:29:38,640
So not too much more to say on that, other than we'd like to see where that one goes

334
00:29:38,640 --> 00:29:44,160
over time, but I don't know.

335
00:29:44,160 --> 00:29:46,520
Do any of you have any thoughts?

336
00:29:46,520 --> 00:29:56,320
Basically my thought is there is basically repeated observations that there's just not

337
00:29:56,320 --> 00:30:01,840
like the greatest variation in the world, but there is variation between labs.

338
00:30:01,840 --> 00:30:11,400
So my comment is just if you're doing statistics as a cultivator or a retailer or something

339
00:30:11,400 --> 00:30:19,640
like that, I would just highly recommend to include the laboratory as a variable, almost

340
00:30:19,640 --> 00:30:24,960
like control for the lab.

341
00:30:24,960 --> 00:30:39,240
If you're a cultivator and you're doing scientific studies, you're doing R&D, just one of your

342
00:30:39,240 --> 00:30:48,240
variables should maybe just be lab, because there could be kind of something structurally

343
00:30:48,240 --> 00:30:54,560
different between how Lab X is testing and Lab D.
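
One simple way to "control for the lab", sketched with invented numbers: compare treatments within each lab first, then average those within-lab differences. This is an illustration of the idea, not the method used in the talk; a regression with a lab indicator variable would be the more general version. The treatment names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# (lab, treatment, total THC %) rows from a made-up cultivation trial.
results = [
    ("Lab P", "control", 20.0),
    ("Lab P", "new_light", 22.0),
    ("Lab Q", "control", 16.0),
    ("Lab Q", "new_light", 18.0),
    ("Lab Q", "new_light", 18.0),
]


def naive_effect(rows, treated, baseline):
    """Difference in means that ignores which lab ran the test."""
    t = [v for _lab, trt, v in rows if trt == treated]
    b = [v for _lab, trt, v in rows if trt == baseline]
    return mean(t) - mean(b)


def within_lab_effect(rows, treated, baseline):
    """Average of the per-lab differences, so lab offsets cancel out."""
    by_lab = defaultdict(lambda: defaultdict(list))
    for lab, trt, value in rows:
        by_lab[lab][trt].append(value)
    diffs = [
        mean(groups[treated]) - mean(groups[baseline])
        for groups in by_lab.values()
        if groups.get(treated) and groups.get(baseline)  # lab must have both
    ]
    return mean(diffs)


naive = naive_effect(results, "new_light", "control")
within = within_lab_effect(results, "new_light", "control")
print(round(naive, 2), round(within, 2))
```

In this toy data the lower-reading lab happened to run more of the treated samples, so the naive comparison understates the effect that shows up inside each lab.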

344
00:30:54,560 --> 00:31:04,520
So honestly, I should practice what I preach, because I just realized that I haven't been

345
00:31:04,520 --> 00:31:08,640
maybe controlling for lab in some of my studies.

346
00:31:08,640 --> 00:31:16,600
But anywho, I kind of want to move on to some other interesting aspects to the data, unless

347
00:31:16,600 --> 00:31:23,720
am I talking too much about THC or what are people's thoughts so far?

348
00:31:23,720 --> 00:31:34,400
Okay, on that note, I'm going to just do this real quick, because it was just I found this

349
00:31:34,400 --> 00:31:37,240
a little bit interesting.

350
00:31:37,240 --> 00:31:45,520
Basically I was just curious about like the retests by lab.

351
00:31:45,520 --> 00:31:53,360
And once again, just tell me if I should skip this if it's getting

352
00:31:53,360 --> 00:31:56,080
too tedious.

353
00:31:56,080 --> 00:32:05,360
But basically, as I was calculating all these statistics, I realized that there were essentially

354
00:32:05,360 --> 00:32:10,360
duplicates here in the data.

355
00:32:10,360 --> 00:32:13,800
So I wanted to kind of figure out how to handle those.

356
00:32:13,800 --> 00:32:22,520
So you see this sample, this package came in and was tested multiple times here.

357
00:32:22,520 --> 00:32:25,880
It looks like it was tested twice.

358
00:32:25,880 --> 00:32:35,720
So basically, I was just, you know, going to look into that and see what's happening there.

359
00:32:35,720 --> 00:32:46,880
But basically, what I've done is just... what did I do?

360
00:32:46,880 --> 00:32:49,900
I think this is yes.

361
00:32:49,900 --> 00:32:57,600
So for each package, so here's this package, I basically just said, Okay, when was the

362
00:32:57,600 --> 00:33:00,480
first time it was tested?

363
00:33:00,480 --> 00:33:04,080
When was the last time it was tested?

364
00:33:04,080 --> 00:33:07,880
What was the delta-9 when it was first tested?

365
00:33:07,880 --> 00:33:11,560
And then what was the delta-9 when it was last tested?

366
00:33:11,560 --> 00:33:19,880
And then same for the THCA. Then I went ahead and grabbed the lab, too.

367
00:33:19,880 --> 00:33:28,200
I tried to find out if the lab was changing between the tests, and it didn't appear to

368
00:33:28,200 --> 00:33:29,200
be.

369
00:33:29,200 --> 00:33:35,680
So it appears to be whichever lab tested it first is the same lab that tested it the second

370
00:33:35,680 --> 00:33:37,200
time.

371
00:33:37,200 --> 00:33:44,700
And just from pawing through the sampling protocol book, it seems to mention something

372
00:33:44,700 --> 00:33:50,440
about, oh, when you're doing a retest, I want to say that you're supposed to use the initial

373
00:33:50,440 --> 00:33:53,000
testing lab.

374
00:33:53,000 --> 00:34:01,320
So this is pointing me to the direction of these are retests for one reason or the other.

375
00:34:01,320 --> 00:34:09,360
And then basically, the natural thing to do is calculate the difference between these.

376
00:34:09,360 --> 00:34:18,840
So I just, you know, calculated the difference in THC and THCA between these two

377
00:34:18,840 --> 00:34:21,400
points in time.
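A minimal sketch of the first-versus-last comparison described here, assuming each test row carries a package ID, lab, test date, delta-9 THC, and THCA. The rows and the `summarize_retests` helper are hypothetical and are not the code used in the session.

```python
from datetime import date

# Hypothetical rows: (package_id, lab, date_tested, delta9_thc, thca).
tests = [
    ("PKG-1", "Lab B", date(2021, 3, 1), 0.8, 22.0),
    ("PKG-1", "Lab B", date(2021, 4, 15), 0.9, 21.4),
    ("PKG-2", "Lab E", date(2021, 5, 2), 0.5, 19.0),
]

def summarize_retests(rows):
    """For packages tested more than once, compare the first and last test."""
    by_package = {}
    for row in rows:
        by_package.setdefault(row[0], []).append(row)
    summary = {}
    for package_id, group in by_package.items():
        if len(group) < 2:
            continue  # tested only once, so not a retest
        group.sort(key=lambda r: r[2])  # order by test date
        first, last = group[0], group[-1]
        summary[package_id] = {
            "first_tested": first[2],
            "last_tested": last[2],
            "same_lab": first[1] == last[1],
            "delta9_diff": last[3] - first[3],
            "thca_diff": last[4] - first[4],
        }
    return summary

retests = summarize_retests(tests)
print(retests)
```

A negative `thca_diff` means the later test came back lower than the first one.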

378
00:34:21,400 --> 00:34:34,840
And then let's see if we can print some of these things out for you.

379
00:34:34,840 --> 00:34:42,120
So once again, I don't know if I should be fixated on this or not.

380
00:34:42,120 --> 00:34:49,480
So this, this is what one of the meetup members taught me that was just super valuable was

381
00:34:49,480 --> 00:34:56,040
you've got to actually look at the data and just calculate some of these summary statistics.

382
00:34:56,040 --> 00:34:59,880
There may be absolutely nothing to this.

383
00:34:59,880 --> 00:35:05,800
But we can't be like making mistakes as we're calculating statistics.

384
00:35:05,800 --> 00:35:13,280
So we have to at least acknowledge that there's some sort of duplicates in here, the retests

385
00:35:13,280 --> 00:35:15,880
and kind of try to get to the bottom of them.

386
00:35:15,880 --> 00:35:19,520
I'm starting to think there's not much to them.

387
00:35:19,520 --> 00:35:26,020
I think they're just sort of routine retests for various reasons.

388
00:35:26,020 --> 00:35:35,520
Maybe they failed microbes and they got retested for microbes, or maybe this is some sort of

389
00:35:35,520 --> 00:35:37,400
routine testing.

390
00:35:37,400 --> 00:35:43,240
So I'd kind of encourage you to investigate this further.

391
00:35:43,240 --> 00:35:51,200
But long story short, actually, before I do it, I was just curious, you know, what does

392
00:35:51,200 --> 00:35:56,120
the difference look like?

393
00:35:56,120 --> 00:35:57,360
Right.

394
00:35:57,360 --> 00:36:01,480
And it's actually mostly negative.

395
00:36:01,480 --> 00:36:04,240
So for it, and this is great.

396
00:36:04,240 --> 00:36:07,460
This is why you want to calculate statistics, right?

397
00:36:07,460 --> 00:36:11,940
Because as I'm calculating this, I'm thinking, oh, I'm on to something, right?

398
00:36:11,940 --> 00:36:15,080
Because I'm thinking that, right?

399
00:36:15,080 --> 00:36:21,180
Whether I'd like to admit it or not, my null hypothesis was actually that the difference

400
00:36:21,180 --> 00:36:29,920
was going to be positive that, oh, that these labs, they're retesting samples and they're

401
00:36:29,920 --> 00:36:33,960
testing them at higher THC values.

402
00:36:33,960 --> 00:36:38,640
But the data doesn't actually support that.

403
00:36:38,640 --> 00:36:49,740
You know, on average, the tests are about half a percentage point lower

404
00:36:49,740 --> 00:37:00,960
when they're retested. You see some that are testing at higher values for various reasons.

405
00:37:00,960 --> 00:37:11,600
And then I just wanted to specifically look at Lab B, because the other labs just

406
00:37:11,600 --> 00:37:17,520
don't really have enough retests to really draw my attention.

407
00:37:17,520 --> 00:37:23,560
You know, if a lab just has one retest, I don't know.

408
00:37:23,560 --> 00:37:25,680
But I should have looked at Lab E too.

409
00:37:25,680 --> 00:37:27,640
And in fact, we can do that too.

410
00:37:27,640 --> 00:37:34,240
But basically, you know, Lab B and Lab E are the only two that are really doing a lot

411
00:37:34,240 --> 00:37:38,160
of like retests or duplicates.

412
00:37:38,160 --> 00:37:43,240
So it would just be pertinent to just look at those two.
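A minimal sketch of averaging the retest difference by lab while skipping labs with too few retests to say much. The differences, the `min_retests` cutoff, and the `mean_diff_by_lab` helper are assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (lab, last_minus_first_thc) pairs, one per retested package.
diffs = [
    ("Lab B", -0.6), ("Lab B", -0.4), ("Lab B", -0.5),
    ("Lab E", 1.2), ("Lab E", -1.8), ("Lab E", 0.3),
    ("Lab H", 0.9),
]

def mean_diff_by_lab(pairs, min_retests=3):
    """Average retest difference per lab, skipping labs with few retests."""
    grouped = defaultdict(list)
    for lab, diff in pairs:
        grouped[lab].append(diff)
    return {
        lab: mean(values)
        for lab, values in grouped.items()
        if len(values) >= min_retests
    }

by_lab = mean_diff_by_lab(diffs)
for lab, avg in by_lab.items():
    print(f"{lab}: {avg:+.2f}")
```

In practice the cutoff would be much higher than three; it is kept small here only so the toy data passes.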

413
00:37:43,240 --> 00:37:50,400
So this is the first time I've looked at Lab E, but may as well plot that one real quick.

414
00:37:50,400 --> 00:37:52,080
And sorry that it's all in yellow.

415
00:37:52,080 --> 00:37:57,400
So Lab E is a bit more all over the board.

416
00:37:57,400 --> 00:38:03,800
And with these retests, that's actually kind of curious in and of

417
00:38:03,800 --> 00:38:08,480
itself: why are the retests different between these two?

418
00:38:08,480 --> 00:38:20,320
For whatever reason, Lab B is testing them at a slightly lower value.

419
00:38:20,320 --> 00:38:28,080
Wait, yes, they're testing it slightly.

420
00:38:28,080 --> 00:38:30,080
Hold on.

421
00:38:30,080 --> 00:38:32,640
Yeah, so yeah.

422
00:38:32,640 --> 00:38:42,880
So Lab B is testing the same packages at slightly lower THC percentages later in time.

423
00:38:42,880 --> 00:38:49,440
And remember, we talked in the past about stability.

424
00:38:49,440 --> 00:38:58,240
And so this once again could be a natural study in the stability of cannabis.

425
00:38:58,240 --> 00:39:05,400
This came up in the conversation the other day, too, is how do you distinguish between

426
00:39:05,400 --> 00:39:12,520
degradation and say mislabeled cannabis products?

427
00:39:12,520 --> 00:39:19,920
And it's basically it's difficult because you've got variance on variance.

428
00:39:19,920 --> 00:39:27,240
But if you were going to attempt it, you have to at least have some metric of natural degradation

429
00:39:27,240 --> 00:39:35,480
in cannabis products. It would be especially cool to measure it experimentally on the shelf,

430
00:39:35,480 --> 00:39:41,200
because it's one thing just to do a controlled experiment in your laboratory.

431
00:39:41,200 --> 00:39:51,560
It's a whole nother for these products to be jostled about and bandied about the state.

432
00:39:51,560 --> 00:39:54,480
They could experience all sorts of degradation.

433
00:39:54,480 --> 00:40:01,400
So I don't want to read too much into this until I read more into this, the sampling

434
00:40:01,400 --> 00:40:02,400
protocol.

435
00:40:02,400 --> 00:40:10,360
I'll just open that just to show you just what it looks like.

436
00:40:10,360 --> 00:40:11,360
So long.

437
00:40:11,360 --> 00:40:14,600
It's a hefty document here.

438
00:40:14,600 --> 00:40:25,200
But I want to say anyways, I won't look into it right now to bore you to death.

439
00:40:25,200 --> 00:40:35,720
But we're going to have to look into it further or maybe get ChatGPT to summarize it for

440
00:40:35,720 --> 00:40:39,400
us to figure out what's going on here with the retests.

441
00:40:39,400 --> 00:40:42,360
Is this just noise to the data?

442
00:40:42,360 --> 00:40:52,200
Is there anything meaningful here or at very, very best, it could be a natural study in

443
00:40:52,200 --> 00:40:54,000
degradation.

444
00:40:54,000 --> 00:40:56,160
That's what I'm personally hoping for.

445
00:40:56,160 --> 00:41:02,500
I would just hope, hope, hope that there is just some sort of protocol where they're doing

446
00:41:02,500 --> 00:41:09,920
stability testing. Or, once again, they may be retesting because of a microbe issue

447
00:41:09,920 --> 00:41:11,560
or this or that.

448
00:41:11,560 --> 00:41:19,880
But does anybody from the CCC want to chime in if I'm butchering this analysis too bad?

449
00:41:19,880 --> 00:41:24,320
And then I'll kind of change gears one final time.

450
00:41:24,320 --> 00:41:27,240
You're definitely not butchering things.

451
00:41:27,240 --> 00:41:32,640
This has been a fascinating presentation so far.

452
00:41:32,640 --> 00:41:39,960
I actually had a question earlier that might have some implications for this

453
00:41:39,960 --> 00:41:41,400
chart.

454
00:41:41,400 --> 00:41:48,480
So when you were looking at the retest, what was the population for that?

455
00:41:48,480 --> 00:41:51,520
It wasn't that high, right?

456
00:41:51,520 --> 00:41:56,640
Yes, basically.

457
00:41:56,640 --> 00:41:59,000
It was non-negligible.

458
00:41:59,000 --> 00:42:23,600
So there are 2,700 tests, but let's say that's just two tests per package.

459
00:42:23,600 --> 00:42:34,800
So there are about, just a quick back of the envelope estimate, there are about 1,300 packages

460
00:42:34,800 --> 00:42:38,680
that were retested.

461
00:42:38,680 --> 00:43:01,960
I want to say there were around 31,000 packages: 31,163.

462
00:43:01,960 --> 00:43:13,320
So it's maybe around 4%, maybe between 4% and 5%, of the packages that are

463
00:43:13,320 --> 00:43:19,520
getting tested multiple times.
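The back-of-the-envelope estimate above can be written out as a short calculation. The counts are the ones quoted in the session; the two-tests-per-package figure is the simplifying assumption stated there.

```python
retest_rows = 2_700        # tests belonging to retested packages (from the talk)
tests_per_package = 2      # assumption: each retested package appears twice
total_packages = 31_163    # package count quoted in the talk

retested_packages = retest_rows / tests_per_package
share = retested_packages / total_packages
print(f"{retested_packages:.0f} packages, {share:.1%} of all packages")
```

If some packages were tested three or more times, the true number of retested packages would be somewhat lower than this estimate.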

464
00:43:19,520 --> 00:43:25,520
So on a related note, how would you account for that for the earlier charts, right?

465
00:43:25,520 --> 00:43:34,400
So for looking at the THC changes over time, wouldn't we need to exclude some of those?

466
00:43:34,400 --> 00:43:37,680
That was actually a question that I put in the chat.

467
00:43:37,680 --> 00:43:39,280
How do you handle that?

468
00:43:39,280 --> 00:43:44,280
Good question.

469
00:43:44,280 --> 00:43:48,520
So there are 31,000 here, and we can double check this.

470
00:43:48,520 --> 00:43:54,800
So let's just double check that this label is unique.

471
00:43:54,800 --> 00:43:59,600
So actually, unfortunately, it looks like there are some duplicates in there.

472
00:43:59,600 --> 00:44:02,760
So I thought I was...

473
00:44:02,760 --> 00:44:08,920
So I wonder if...

474
00:44:08,920 --> 00:44:16,400
Okay, so this is why I need you guys to double check my work.

475
00:44:16,400 --> 00:44:24,720
Okay, so I may have made a mistake here because I was thinking that I was just getting the

476
00:44:24,720 --> 00:44:27,960
first observed test.

477
00:44:27,960 --> 00:44:38,760
So I was thinking that for each package here that I was just going to get the very first

478
00:44:38,760 --> 00:44:40,560
time it was tested.

479
00:44:40,560 --> 00:44:53,480
So let's see real quick if anything would change here if we get the last one.

480
00:44:53,480 --> 00:44:56,480
Okay.

481
00:44:56,480 --> 00:45:13,000
Okay, so this is good: your question has kind of shone a light on the imperfections of my

482
00:45:13,000 --> 00:45:14,440
analysis.

483
00:45:14,440 --> 00:45:24,080
So as always, it appears I've made some sort of mistake, so don't take my statistics at face

484
00:45:24,080 --> 00:45:25,080
value.

485
00:45:25,080 --> 00:45:37,320
But what I was trying to do was just take the first time a package was tested.

486
00:45:37,320 --> 00:45:45,680
And I guess what we could do here is just drop the duplicates just to exclude them.

487
00:45:45,680 --> 00:45:54,400
Okay, so let's just try that real quick.

488
00:45:54,400 --> 00:45:55,560
We're live.

489
00:45:55,560 --> 00:45:57,520
So I think we can just...

490
00:45:57,520 --> 00:46:01,160
I don't even have to do it.

491
00:46:01,160 --> 00:46:08,640
ChatGPT will more or less handle this for us.

492
00:46:08,640 --> 00:46:11,920
And so that's the interesting thing about coding these days.

493
00:46:11,920 --> 00:46:16,420
So let's see what we have now.

494
00:46:16,420 --> 00:46:25,040
So now we have removed the duplicates.
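A minimal sketch of dropping duplicates so that each package label appears once, keeping its earliest test. The rows and the `keep_first_test` helper are hypothetical. With a pandas DataFrame the same idea is usually written with `sort_values` followed by `drop_duplicates(subset=..., keep="first")`, using whatever the label and date columns are actually called.

```python
from datetime import date

# Hypothetical rows: (package_label, date_tested, total_thc).
rows = [
    ("PKG-1", date(2021, 4, 15), 21.4),
    ("PKG-1", date(2021, 3, 1), 22.0),
    ("PKG-2", date(2021, 5, 2), 19.0),
]

def keep_first_test(records):
    """Keep only the earliest test for each package label."""
    first = {}
    for label, tested, thc in sorted(records, key=lambda r: r[1]):
        first.setdefault(label, (label, tested, thc))
    return list(first.values())

deduped = keep_first_test(rows)
# Sanity check: every label should now be unique.
assert len({r[0] for r in deduped}) == len(deduped)
print(deduped)
```

Sorting before deduplicating matters: without it, "first" just means whichever row happened to come first in the file.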

495
00:46:25,040 --> 00:46:36,760
So why there are duplicates after this aggregation, I am not certain.

496
00:46:36,760 --> 00:46:40,520
I can probably shed some light on that.

497
00:46:40,520 --> 00:46:45,080
Some of the tests might have some subtests under them.

498
00:46:45,080 --> 00:46:50,160
And we're only seeing the top test rather than the subtest.

499
00:46:50,160 --> 00:46:53,160
That could be one of the reasons for the duplication.

500
00:46:53,160 --> 00:46:56,880
So the package number stayed.

501
00:46:56,880 --> 00:47:04,960
But the types of tests might have changed, if that makes sense.

502
00:47:04,960 --> 00:47:14,520
Yes, and this is where you're talking about what's the difference between a negligible

503
00:47:14,520 --> 00:47:15,520
and a non-negligible.

504
00:47:15,520 --> 00:47:16,520
Exactly.

505
00:47:16,520 --> 00:47:25,320
And so this is a borderline one, right, because the retests were about, like we said, around

506
00:47:25,320 --> 00:47:27,520
four or five percent.

507
00:47:27,520 --> 00:47:33,200
And now we've got these 500 or so oddballs.

508
00:47:33,200 --> 00:47:38,120
And for today, I'm going to drop them.

509
00:47:38,120 --> 00:47:44,680
But what I'd actually encourage you all to do is actually specifically look at these,

510
00:47:44,680 --> 00:47:51,000
try to find out why the oddballs are odd.

511
00:47:51,000 --> 00:47:53,480
Once again, there could be absolutely nothing to it.

512
00:47:53,480 --> 00:48:02,040
It could just be some sort of weird coding, not coding, but just sort of some weird entry

513
00:48:02,040 --> 00:48:03,360
detail.

514
00:48:03,360 --> 00:48:07,880
But leave no stone unturned.

515
00:48:07,880 --> 00:48:11,800
So today, unfortunately, I'm going to just toss this stone out.

516
00:48:11,800 --> 00:48:18,840
But just keep in mind that that does go against my philosophy of never throw away any data.

517
00:48:18,840 --> 00:48:24,200
So I'm kind of, I'm unfortunately going to throw that data away today.

518
00:48:24,200 --> 00:48:27,960
But let's just redo these statistics.

519
00:48:27,960 --> 00:48:36,400
So after we exclude those oddballs, I wonder if they're not the zeros.

520
00:48:36,400 --> 00:48:50,080
But luckily, this is the power of the law of large numbers: our sample doesn't necessarily

521
00:48:50,080 --> 00:48:54,320
have to be perfect.

522
00:48:54,320 --> 00:48:56,120
There could be some flaws in there.

523
00:48:56,120 --> 00:49:08,080
But as long as the number of observations we get goes up, we're going to start approximating

525
00:49:08,080 --> 00:49:11,680
the mean sooner rather than later.
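A minimal simulation of the law of large numbers, assuming results scatter randomly around a true mean. The true mean of 18 and the standard deviation of 4 are invented for illustration. Note that this only covers random noise: a systematic flaw, such as duplicates that all lean one way, does not average out with more observations.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

TRUE_MEAN = 18.0  # hypothetical "true" average THC percent

def sample_mean(n):
    """Mean of n noisy draws around TRUE_MEAN (standard deviation 4)."""
    return sum(random.gauss(TRUE_MEAN, 4.0) for _ in range(n)) / n

for n in (10, 1_000, 100_000):
    print(n, round(sample_mean(n), 3))
```

The printed means should drift closer to 18 as `n` grows.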

527
00:49:11,680 --> 00:49:18,520
So long story short is we're still looking at around 28 and 1 1

529
00:49:41,920 --> 00:49:57,840
And I'm not infallible, as has just been pointed out, and that's why one of the most important pieces of the scientific process is the repeat, right?

530
00:49:57,840 --> 00:50:08,120
It's repeat, repeat, repeat, reproduce, reproduce, reproduce. Basically, it's not really enough for just one person to do the analysis.

531
00:50:08,120 --> 00:50:14,880
Really, just the more and more people that do it, the law of large numbers kicks in.

532
00:50:14,880 --> 00:50:30,120
So basically, that's what I like to think about is as we're doing the scientific process, just like there's just a distribution of THC, there's sort of going to be a distribution of our own statistics.

533
00:50:30,120 --> 00:50:39,040
And, you know, the average of everybody's studies may approximate the true value.

534
00:50:39,040 --> 00:50:52,560
But just for fun here, once again, I'm not trying to go too far down the yeast and mold rabbit hole, but it's the talk of the town.

535
00:50:52,560 --> 00:50:57,640
So I just thought it would be pertinent to look at, right?

536
00:50:57,640 --> 00:51:07,160
We've got the data.

537
00:51:07,160 --> 00:51:11,320
It would not be wise to not even look at it.

538
00:51:11,320 --> 00:51:15,160
So it costs barely anything just for us to look at it.

539
00:51:15,160 --> 00:51:24,760
So why not? So long story short, this wasn't supposed to be, but this is in scientific notation here.

540
00:51:24,760 --> 00:51:36,600
Basically, the failure limit for yeast and mold is 10,000 colony-forming units per gram.

541
00:51:36,600 --> 00:51:39,800
That's the same as in Washington state.
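A minimal sketch of turning yeast and mold counts into pass/fail flags against the 10,000 CFU/g limit mentioned here. The sample values are made up, and whether a result exactly at the limit fails is an assumption to check against the actual protocol.

```python
LIMIT_CFU_PER_G = 10_000  # yeast and mold limit quoted in the talk

# Hypothetical results, stored as scientific-notation strings.
raw = ["1.2e+02", "9.9e+03", "1.0e+04", "3.5e+05"]

# float() parses scientific notation directly.
counts = [float(value) for value in raw]
# Assumption: a result at or above the limit counts as a failure.
failed = [cfu >= LIMIT_CFU_PER_G for cfu in counts]
failure_rate = sum(failed) / len(failed)
print(f"Failure rate: {failure_rate:.0%}")
```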

542
00:51:39,800 --> 00:51:53,040
And so in Massachusetts, basically, once again, looking at the sampling protocol, they basically say that if you fail for microbes once,

543
00:51:53,040 --> 00:51:56,680
you can get it remediated.

544
00:51:56,680 --> 00:52:03,280
And then I think if you fail a second time, then I think it has to be destroyed.

545
00:52:03,280 --> 00:52:09,080
And you don't have to remediate it. You can destroy it after the first time.

546
00:52:09,080 --> 00:52:16,280
And there may have been a third option, but don't quote me on that.

547
00:52:16,280 --> 00:52:19,800
In fact, you know, check out the resources above.

548
00:52:19,800 --> 00:52:26,800
But here's actually a better visualization of that.

549
00:52:26,800 --> 00:52:38,280
Because we've mentioned this in the past: of course, you want to maintain a clean environment.

550
00:52:38,280 --> 00:52:45,360
But as a cultivator, you have to measure your costs.

551
00:52:45,360 --> 00:52:50,920
And if you don't, you know, you could run into the red.

552
00:52:50,920 --> 00:53:01,840
And just the way I like to do this is if you're a cultivator, just kind of budget in the fact that you may fail a test now and again.

553
00:53:01,840 --> 00:53:06,640
Right. You may be running a clean ship.

554
00:53:06,640 --> 00:53:14,480
But if you just look at the market on average, people are failing around six percent of the time.

555
00:53:14,480 --> 00:53:28,120
So you just budget that in. Just say, OK, what would be our loss if we did have to destroy a whole batch?

556
00:53:28,120 --> 00:53:32,880
Then multiply that by six percent.

557
00:53:32,880 --> 00:53:38,640
And that's basically your cost of microbe risk.
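The budgeting rule above is an expected-loss calculation: the loss if a batch is destroyed times the chance of failing. A minimal sketch, where the batch value is hypothetical and the 6% is the market-wide failure rate quoted in the session:

```python
batch_value = 50_000.00   # hypothetical loss if a whole batch is destroyed
failure_rate = 0.06       # market-wide yeast and mold failure rate from the talk

expected_loss = batch_value * failure_rate
print(f"Expected microbe-risk cost per batch: ${expected_loss:,.2f}")
```

A cultivator with their own testing history could swap in their own failure rate instead of the market average.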

558
00:53:38,640 --> 00:53:45,680
And, once again, you want to minimize your risk, keep a clean environment and all that jazz.

559
00:53:45,680 --> 00:53:49,120
So ideally, we mentioned this in the past.

560
00:53:49,120 --> 00:53:54,720
You would like your failure rate to be lower than average.

561
00:53:54,720 --> 00:54:02,480
And if you notice that you're failing above average, then

562
00:54:02,480 --> 00:54:12,920
it would be worthwhile looking into that because, you know, you could have higher costs than your neighbor.

563
00:54:12,920 --> 00:54:17,120
So I don't want to drum on that too, too much.

564
00:54:17,120 --> 00:54:25,200
But once again, there's a good statistic to have in your back pocket.

565
00:54:25,200 --> 00:54:31,640
And we'll have to actually double check what we calculated the average was in Washington.

566
00:54:31,640 --> 00:54:40,640
But I want to say that the average was lower in Washington.

567
00:54:40,640 --> 00:54:46,320
But it may not have been. And if it was lower, was it statistically different?

568
00:54:46,320 --> 00:54:50,000
So that's actually a cool study that you could all do.

569
00:54:50,000 --> 00:54:52,000
And we actually have this data now.

570
00:54:52,000 --> 00:55:00,040
You could actually compare yeast and mold detections in Massachusetts to Washington.
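One common way to ask whether two failure rates are statistically different is a two-proportion z-test. This is a minimal sketch using only the standard library; the counts are hypothetical placeholders, not the actual Massachusetts or Washington figures, and the `two_proportion_z` helper is named for illustration.

```python
from math import erf, sqrt

def two_proportion_z(fail_a, n_a, fail_b, n_b):
    """Two-sided z-test for a difference between two failure rates."""
    p_a, p_b = fail_a / n_a, fail_b / n_b
    pooled = (fail_a + fail_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Normal CDF via the error function, then a two-sided p-value.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Hypothetical counts: (failures, tests) for each state.
z, p = two_proportion_z(600, 10_000, 500, 10_000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value would suggest the gap is unlikely to be chance alone, though it would not explain whether the cause is the product, the labs, or the rules.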

571
00:55:00,040 --> 00:55:07,480
Who knows? And then the final thing here, which has

572
00:55:07,480 --> 00:55:13,240
been noted before, and once again, I don't want to harsh too much on the labs,

573
00:55:13,240 --> 00:55:18,760
but it would just be pertinent for us to show this one, because

574
00:55:18,760 --> 00:55:23,600
this is a statistic that does kind of jump out at you.

575
00:55:23,600 --> 00:55:32,520
We've mentioned in the past that you don't want to draw too, too much from small differences in things.

576
00:55:32,520 --> 00:55:40,200
So, for example, when we looked at the differences in lab testing of THC,

577
00:55:40,200 --> 00:55:44,400
yes, there was some variance, but,

578
00:55:44,400 --> 00:55:50,560
you know, like we said, a lot of the lines were around 18 percent.

579
00:55:50,560 --> 00:55:54,520
You know, if there's one that's way different, then take note.

580
00:55:54,520 --> 00:55:58,400
Right. This is a difference that should jump out at you, right?

581
00:55:58,400 --> 00:56:01,760
Obviously, there's some difference going on there.

582
00:56:01,760 --> 00:56:07,360
And then this is another one that kind of should jump out at you.

583
00:56:07,360 --> 00:56:11,040
There's just some,

584
00:56:11,040 --> 00:56:16,000
what appears to me like, structural difference between,

585
00:56:16,000 --> 00:56:22,600
you know, how the labs are measuring microbes.

586
00:56:22,600 --> 00:56:31,080
And I put the total number of tests that each lab is doing above each bar.

587
00:56:31,080 --> 00:56:39,560
I think that's important because, for example, Lab H, they are an outlier.

588
00:56:39,560 --> 00:56:49,520
But they've only tested, you know, well, not only, I mean, they have tested 200 samples, 197.

589
00:56:49,520 --> 00:56:58,040
But, you know, Lab B's tested 14,000 samples in this same time period.

590
00:56:58,040 --> 00:57:01,960
Lab G looks comparable, but they only tested 70.

591
00:57:01,960 --> 00:57:07,920
And then lab A and E each tested a big bulk here themselves.

592
00:57:07,920 --> 00:57:18,960
So once again, I actually have to point this out, because I wouldn't be responsible if I didn't.

593
00:57:18,960 --> 00:57:22,960
This is kind of how you can,

594
00:57:22,960 --> 00:57:27,520
I don't want to say lie, but mislead with visualizations,

595
00:57:27,520 --> 00:57:32,680
because this is technically kind of a misleading visualization,

596
00:57:32,680 --> 00:57:42,720
because it does make you think like, oh, wow, there's this like sixfold difference in failure rate by lab.

597
00:57:42,720 --> 00:57:50,560
But technically, Lab H and Lab X and even Lab Z,

598
00:57:50,560 --> 00:57:57,280
they're all kind of outliers in the sense that they're not testing very many samples.
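One way to make a bar chart of failure rates less misleading is to show how uncertain each rate is given the number of tests behind it. This is a minimal sketch using the Wilson score interval, which is a swapped-in technique and not something shown in the session. The test counts echo the ones read off the chart (about 70, 197, and 14,000); the failure counts are invented for illustration.

```python
from math import sqrt

def wilson_interval(failures, n, z=1.96):
    """Approximate 95% Wilson score interval for a failure rate."""
    p = failures / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# (failures, tests) per lab; failure counts are hypothetical.
labs = {"Lab G": (10, 70), "Lab H": (30, 197), "Lab B": (1_190, 14_000)}
for lab, (failures, n) in labs.items():
    low, high = wilson_interval(failures, n)
    print(f"{lab}: {failures / n:.1%} ({low:.1%} to {high:.1%}, n={n})")
```

The small labs get wide intervals, so a tall bar from 70 tests carries far less weight than a shorter bar from 14,000.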

599
00:57:57,280 --> 00:58:03,160
Like, there really actually could be something structurally different about those labs.

600
00:58:03,160 --> 00:58:12,360
They could be startup labs and they may have a specific clientele,

601
00:58:12,360 --> 00:58:17,240
maybe for whatever reason, lab H is the remediation lab, right?

602
00:58:17,240 --> 00:58:23,640
Maybe, for whatever reason, they just get tons of failures.

603
00:58:23,640 --> 00:58:30,000
So long story short is, if you excluded the outliers

604
00:58:30,000 --> 00:58:40,480
and you're just looking at Lab A to Lab B, then, you know, it may not be the greatest variance in the world.

605
00:58:40,480 --> 00:58:47,400
Right. You've got Lab A at like two or three percent and then Lab B,

606
00:58:47,400 --> 00:58:51,920
maybe around eight or nine percent failure rate.

607
00:58:51,920 --> 00:59:01,800
So it may not be as drastic as this chart kind of makes it seem.

608
00:59:01,800 --> 00:59:10,120
But I think we mentioned this in the past, where, for whatever reason,

609
00:59:10,120 --> 00:59:18,680
the microbe method just doesn't appear to be standardized. Just quick anecdotal evidence,

610
00:59:18,680 --> 00:59:25,600
you know, from working in a lab: it wasn't a super set-in-stone method,

611
00:59:25,600 --> 00:59:34,400
because basically the standard protocol, and once again, the cost is different,

612
00:59:34,400 --> 00:59:41,320
the standard, cheapest one is to test these microbes by plating,

613
00:59:41,320 --> 00:59:44,320
which is sort of like the old school method.

614
00:59:44,320 --> 00:59:50,120
And it's maybe not the most accurate, but they're microbes, right?

615
00:59:50,120 --> 00:59:53,000
So it's literally dirty. So it's

616
00:59:53,000 --> 00:59:57,160
just a quick and dirty way of testing.

617
00:59:57,160 --> 01:00:00,600
And then you can do other ways.

618
01:00:00,600 --> 01:00:12,920
There's the PCR, which is a scientific instrument, and it has its own protocol, its own sample preparation.

619
01:00:12,920 --> 01:00:17,640
And then you can maybe think of that as maybe the medium degree accuracy.

620
01:00:17,640 --> 01:00:23,960
And then I think if you wanted like the highest degree of accuracy,

621
01:00:23,960 --> 01:00:25,960
don't quote me on this, because I'm not a chemist.

622
01:00:25,960 --> 01:00:31,800
You may or may not be able to test for these on a mass spec.

623
01:00:31,800 --> 01:00:36,000
But I think that's like way overkill.

624
01:00:36,000 --> 01:00:41,040
And you may not be able to.

625
01:00:41,040 --> 01:00:44,040
I think you can test for mycotoxins on the mass spec.

626
01:00:44,040 --> 01:00:48,560
I do not know for certain if you can test for yeast and mold on the mass spec.

627
01:00:48,560 --> 01:00:59,200
But if you could, it would be a really, really expensive test to do something that you can normally test for cheaply with plating.

628
01:00:59,200 --> 01:01:10,160
So long story short is it could just be just something as simple as lab B is using like a plating method,

629
01:01:10,160 --> 01:01:16,160
which is maybe not the most accurate in the world.

630
01:01:16,160 --> 01:01:18,560
And that's the rate they're getting.

631
01:01:18,560 --> 01:01:26,440
And then it could be that maybe lab A and lab E, maybe they're using PCR testing.

632
01:01:26,440 --> 01:01:34,640
And maybe it's more expensive, but they're able to exclude false positives.

633
01:01:34,640 --> 01:01:42,800
Right, if they're excluding false positives, they'll have a lower failure rate.

634
01:01:42,800 --> 01:01:49,320
And then they could potentially have happier clients.

635
01:01:49,320 --> 01:01:53,240
So that's kind of where I want to leave this.

636
01:01:53,240 --> 01:02:01,600
I think, well, hopefully between the regulators and policy plus economic incentives,

637
01:02:01,600 --> 01:02:08,040
I would hope to kind of see this over time, as we mentioned with the THC.

638
01:02:08,040 --> 01:02:11,800
I would like to see them kind of come to a consensus.

639
01:02:11,800 --> 01:02:15,000
So, you know, what's it going to be?

640
01:02:15,000 --> 01:02:22,120
You know, are we going to test by plating or are we going to test by PCR?

641
01:02:22,120 --> 01:02:27,480
Just kind of come to a consensus of what instrument?

642
01:02:27,480 --> 01:02:37,240
And they don't have to talk to each other because that could be seen as like collusion or

643
01:02:37,240 --> 01:02:41,720
anti-competitive. Really, you don't necessarily want them talking to each other.

644
01:02:41,720 --> 01:02:48,800
But what I'm just saying is market forces will kind of push them in that direction, where

645
01:02:48,800 --> 01:02:55,360
maybe it is cost effective to invest in a PCR machine,

646
01:02:55,360 --> 01:03:00,560
or maybe it is cost effective to use plating or who knows,

647
01:03:00,560 --> 01:03:09,080
maybe somebody will discover a new instrument to test yeast and mold.

648
01:03:09,080 --> 01:03:14,160
So those are my main thoughts here.

649
01:03:14,160 --> 01:03:20,080
I wouldn't be too, too alarmed about the data unless you're working at the CCC and then,

650
01:03:20,080 --> 01:03:25,240
you know, you're the regulator here, but just from a consumer

651
01:03:25,240 --> 01:03:36,040
point of view, it's like, yes, you know, we've highlighted some variances in THC and microbes.

652
01:03:36,040 --> 01:03:45,960
But I would just like to point out that this was in

653
01:03:45,960 --> 01:03:52,120
2021, and I would love to see this played through to 2023,

654
01:03:52,120 --> 01:04:02,280
because my hypothesis is that slowly but surely, I think the industry is,

655
01:04:02,280 --> 01:04:08,240
slowly, but as I said, surely, coming to a consensus on what they think

656
01:04:08,240 --> 01:04:11,840
standard lab testing should look like.

657
01:04:11,840 --> 01:04:16,240
So we may not be in the long term yet, but, you know,

658
01:04:16,240 --> 01:04:21,480
hopefully this rocky journey will start settling down.

659
01:04:21,480 --> 01:04:24,200
But any thoughts, comments, questions?

660
01:04:24,200 --> 01:04:31,240
I know this was sort of a different session than normal.

661
01:04:31,240 --> 01:04:36,160
Not as much back and forth today, mostly just me droning on about these charts.

662
01:04:36,160 --> 01:04:41,080
So we'll be back, hopefully do a nice back and forth discussion next week and back

663
01:04:41,080 --> 01:04:47,720
to the terpenes and strains and all of that fun consumer-related stuff.

664
01:04:47,720 --> 01:04:53,760
But just had to spend a little time and hammer out some of these dry statistics

665
01:04:53,760 --> 01:05:01,400
about lab tests just to set the stage, just so we have some statistics in our tool belt.

666
01:05:01,400 --> 01:05:10,720
But any thoughts, comments, questions after being bombarded by all those charts?

667
01:05:10,720 --> 01:05:16,120
One of the things to keep in mind is that some of the labs did not come online until later.

668
01:05:16,120 --> 01:05:25,520
So you're seeing kind of a snapshot of some of the older labs and some of the more recent labs.

669
01:05:25,520 --> 01:05:32,400
So I think you might need to get the dates for when the labs opened and maybe group them

670
01:05:32,400 --> 01:05:38,840
in particular ways to make the comparison a little bit more even.

671
01:05:38,840 --> 01:05:44,080
Because right now, as you showed in the last slides,

672
01:05:44,080 --> 01:05:49,200
the difference among the labs is pretty stark visually, right?

673
01:05:49,200 --> 01:05:56,280
But if you group them into two groups and then, you know, the ones who came online later

674
01:05:56,280 --> 01:06:03,880
versus the earlier ones, there might be some other differences to observe as well.
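A minimal sketch of the grouping suggested here, assuming the date a lab first appears in the test data is a workable stand-in for when it came online. The rows, the cutoff date, and the `cohort_by_first_test` helper are hypothetical; real opening dates were not given.

```python
from datetime import date

# Hypothetical (lab, date_tested) pairs.
tests = [
    ("Lab A", date(2020, 1, 10)), ("Lab A", date(2021, 6, 1)),
    ("Lab B", date(2020, 2, 5)),
    ("Lab H", date(2021, 9, 20)), ("Lab X", date(2021, 10, 3)),
]

def cohort_by_first_test(rows, cutoff):
    """Label each lab 'early' or 'late' by its first observed test date."""
    first_seen = {}
    for lab, tested in rows:
        if lab not in first_seen or tested < first_seen[lab]:
            first_seen[lab] = tested
    return {
        lab: "early" if first < cutoff else "late"
        for lab, first in first_seen.items()
    }

cohorts = cohort_by_first_test(tests, cutoff=date(2021, 1, 1))
print(cohorts)
```

Failure rates or THC averages could then be compared within each cohort rather than across all labs at once.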

675
01:06:03,880 --> 01:06:10,560
You definitely gave a good bit of foreshadowing there because that's kind

676
01:06:10,560 --> 01:06:19,360
of what I think we'll see when we hopefully eventually look at the data is I've got a feeling 2022

677
01:06:19,360 --> 01:06:26,040
was a pretty turbulent year, especially since you mentioned that there could be some new labs coming on board.

678
01:06:26,040 --> 01:06:34,320
And I mean, just anecdotally, that's what we've heard from a lot of people in the cannabis space.

679
01:06:34,320 --> 01:06:37,600
You know, it's been tough times and there's been a lot of change.

680
01:06:37,600 --> 01:06:40,960
So I've got a feeling there were

681
01:06:40,960 --> 01:06:50,440
tons and tons of variation and dynamics in the market in 2022.

682
01:06:50,440 --> 01:06:56,120
Once again, are things coming to a consensus yet in 2023?

683
01:06:56,120 --> 01:07:00,280
I don't know. I've got a feeling they're heading in that direction.

684
01:07:00,280 --> 01:07:03,640
But as I mentioned, I don't think they're there yet.

685
01:07:03,640 --> 01:07:13,320
So that's just my hypothesis: I bet there's going to be tons of variability and change in 2022.

686
01:07:13,320 --> 01:07:21,320
And maybe the variation slows down this year and into next.

687
01:07:21,320 --> 01:07:26,200
Only time will tell and only the data will tell.

688
01:07:26,200 --> 01:07:28,640
But that's my best guess.

689
01:07:28,640 --> 01:07:38,160
Best hypothesis.

690
01:07:38,160 --> 01:07:42,040
We will be releasing more data in the next few months.

691
01:07:42,040 --> 01:07:48,880
I know I said that last year, but we're actually getting very close to reviewing more and more of the data

692
01:07:48,880 --> 01:07:56,240
and we'll be able to release it and it will be available to everybody on our data catalog.

693
01:07:56,240 --> 01:08:04,080
Yes, and as pointed out, this is super valuable because, as I said, we went all the way down this rabbit hole

694
01:08:04,080 --> 01:08:08,640
looking at all this lab-by-lab variance. But that wasn't even the important thing today.

695
01:08:08,640 --> 01:08:18,800
The important thing today is that we were able to actually answer a real question that a lot of cannabis consumers have.

696
01:08:18,800 --> 01:08:22,320
That was the top question that came up in yesterday's

697
01:08:22,320 --> 01:08:28,880
conversation: at what point do your spidey senses start tingling?

698
01:08:28,880 --> 01:08:31,000
Like at what point do you get a red flag?

699
01:08:31,000 --> 01:08:34,560
Like when you see a THC number on a label.

700
01:08:34,560 --> 01:08:38,400
And as we mentioned, it varies state by state.

701
01:08:38,400 --> 01:08:42,200
But we've at least answered that question for Massachusetts.

702
01:08:42,200 --> 01:08:48,360
And it's basically like, yes, like the very, very tippity top percentage, you know,

703
01:08:48,360 --> 01:08:56,880
you may see for the cannabis flower is, you know, like 31.8% or so THCA.

704
01:08:56,880 --> 01:09:06,400
So it is possible for like the tippity top shelf products to have north of 30% THCA.

705
01:09:06,400 --> 01:09:10,960
But just keep in mind that not every product will, right?

706
01:09:10,960 --> 01:09:14,520
That's only the 99th percentile.

707
01:09:14,520 --> 01:09:26,680
So it's like, you know, if you get something that's north of 25 percent, as far as total THC goes, that's well above average.

708
01:09:26,680 --> 01:09:40,200
So, you know, I think as far as consumers go, I mean, surely it can only help to just have a better understanding of what the distribution is here.

709
01:09:40,200 --> 01:09:48,640
You know, what is average, what's above average, what's the maximum that I should expect?
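
The three numbers asked about here (average, 99th percentile, maximum) can be computed with Python's standard library. This is a rough sketch, not the analysis shown in the session: the sample values are hypothetical, `thc_summary` and `is_red_flag` are illustrative names, and treating anything above the 99th percentile as a red flag is an assumption.

```python
from statistics import mean, quantiles

def thc_summary(values):
    """Return the mean, 99th percentile, and maximum of a list of THC percentages."""
    # quantiles(..., n=100) returns 99 cut points; the last one is the 99th percentile.
    p99 = quantiles(values, n=100, method="inclusive")[-1]
    return {"mean": mean(values), "p99": p99, "max": max(values)}

def is_red_flag(label_value, summary):
    """Flag a label value that sits above the observed 99th percentile."""
    return label_value > summary["p99"]

# Hypothetical THCA percentages for flower; not the Massachusetts data.
sample = [14.0, 17.5, 19.0, 20.5, 21.0, 22.5, 23.0, 24.5, 26.0, 31.0]
summary = thc_summary(sample)
```

A label value above `summary["p99"]` is not proof of a problem. It only means the number is rarer than 99 percent of the results in whatever data set the summary was built from, which, as noted above, varies state by state.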

710
01:09:48,640 --> 01:09:56,360
So these are simple summary statistics, but a lot of people out there have no idea, right?

711
01:09:56,360 --> 01:10:02,960
The conversation the other day was with the best and most brilliant minds, and they actually had super close answers.

712
01:10:02,960 --> 01:10:21,000
As I mentioned, Jeff hit the 31 percent nail on the head, and Dr. Anna, too, was spot on in her answer: she said that once it gets above 25 percent total THC, she gets a little skeptical.

713
01:10:21,000 --> 01:10:31,400
So simple, simple statistics, but they go a long way.

714
01:10:31,400 --> 01:10:40,600
On that note, I didn't have too much time to review your comments, but I'll review them for next week.

715
01:10:40,600 --> 01:10:47,000
And feel free to share any ideas that you have during the week and I'll make sure to touch on them next week.

716
01:10:47,000 --> 01:10:55,600
Just want to give you all a huge thank you, a huge, huge thank you, especially Maryanne and Stephanie and also Quinton, too.

717
01:10:55,600 --> 01:10:58,800
I didn't get a chance to thank you for joining the Vita.

718
01:10:58,800 --> 01:11:08,040
And of course, Caleb and Candice, couldn't do this without you, your eyes, your ears, your attention, your brilliant ideas.

719
01:11:08,040 --> 01:11:23,920
And today the cherry on top was actually having insights from the Massachusetts Cannabis Control Commission, because as we've said, you're actually in the trenches making sure the data infrastructure is up to snuff.

720
01:11:23,920 --> 01:11:27,720
We really, really, really, really appreciate your effort.

721
01:11:27,720 --> 01:11:30,400
We really appreciate any glance at the data.

722
01:11:30,400 --> 01:11:36,000
As always, you know, we can kind of be a little tough and it's just tough love, right?

723
01:11:36,000 --> 01:11:43,200
It's just that any good writer needs an editor who's willing to give them a ton of red ink.

724
01:11:43,200 --> 01:11:47,000
The red ink doesn't mean we don't like you.

725
01:11:47,000 --> 01:11:48,640
No, we love what you're doing.

726
01:11:48,640 --> 01:11:49,960
So keep at it.

727
01:11:49,960 --> 01:12:06,480
And, you know, we're fallible too. Our charts and statistics may not be perfect, but we're just doing our job of being another point in the scientific process.

728
01:12:06,480 --> 01:12:08,880
Very cool. Thank you for saying that.

729
01:12:08,880 --> 01:12:19,640
Appreciate it. And this is a great, great forum. As I've said before, I'm going to try and come more often if I can.

730
01:12:19,640 --> 01:12:22,920
Absolutely, love it. Anytime, anytime, you're welcome.

731
01:12:22,920 --> 01:12:24,360
All right, cool. Thank you.

732
01:12:24,360 --> 01:12:26,320
Everybody. Thank you all for coming.

733
01:12:26,320 --> 01:12:53,920
Now, go on, get out of here, be productive and keep advancing cannabis science.

