1
00:00:00,000 --> 00:00:08,040
Welcome to the Cannabis Data Science Meetup Group.

2
00:00:08,040 --> 00:00:09,480
Thrilled to have you all here.

3
00:00:09,480 --> 00:00:11,040
Thank you all for coming.

4
00:00:11,040 --> 00:00:16,320
It's your eyes, your attention, it's your interest that's

5
00:00:16,320 --> 00:00:21,600
really moving the ball forward, helping us move the cannabis

6
00:00:21,600 --> 00:00:25,400
space hopefully forward, at least a molecule at a time.

7
00:00:25,400 --> 00:00:27,360
So that's the goal.

8
00:00:27,360 --> 00:00:31,200
And the idea today is to get our paws

9
00:00:31,200 --> 00:00:34,680
on a big set of cannabis data.

10
00:00:34,680 --> 00:00:37,480
Today, we happen to be working with lab results.

11
00:00:37,480 --> 00:00:42,920
We've worked with all sorts of cannabis data, sales, products,

12
00:00:42,920 --> 00:00:47,280
strains, licensees.

13
00:00:47,280 --> 00:00:51,160
And there's just been a big demand for lab results.

14
00:00:51,160 --> 00:00:54,080
So we've gone through some of the tedious stuff

15
00:00:54,080 --> 00:00:55,240
in the past weeks.

16
00:00:55,240 --> 00:00:59,280
We just wanted to find out who are all the players in the game.

17
00:00:59,280 --> 00:01:01,520
So we found all the licensees.

18
00:01:01,520 --> 00:01:04,800
As I'll show you, that's pretty important information

19
00:01:04,800 --> 00:01:10,440
because all of these, where do these lab results originate

20
00:01:10,440 --> 00:01:10,920
from?

21
00:01:10,920 --> 00:01:13,240
Well, they originate from licensees.

22
00:01:13,240 --> 00:01:15,400
So we can start augmenting data.

23
00:01:15,400 --> 00:01:20,000
So then we have the lab result. Then we have licensee.

24
00:01:20,000 --> 00:01:22,680
Then we may know where in the world they grew it.

25
00:01:22,680 --> 00:01:25,160
And then all of a sudden, once you know where in the world

26
00:01:25,160 --> 00:01:27,480
they grew it, then you can start doing

27
00:01:27,480 --> 00:01:28,760
all sorts of augmentation.

28
00:01:28,760 --> 00:01:30,920
You can augment climate data.

29
00:01:30,920 --> 00:01:33,360
Maybe you want to augment regulation data.

30
00:01:33,360 --> 00:01:36,200
Maybe regulations are having a play.

31
00:01:36,200 --> 00:01:39,120
The sky's the limit, in some cases,

32
00:01:39,120 --> 00:01:43,600
literally, if you're doing outdoor.
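
A minimal sketch of that augmentation step in Python, assuming lab results and licensees share a license number. The field names (`license_number`, `county`, `total_thc`) and the sample records are hypothetical and only illustrate the join; once a location is attached, climate or regulation data could be joined on it the same way.

```python
# Hypothetical records; the field names are illustrative, not an official schema.
lab_results = [
    {"sample_id": "S-001", "license_number": "LIC-100", "total_thc": 21.4},
    {"sample_id": "S-002", "license_number": "LIC-200", "total_thc": 18.9},
    {"sample_id": "S-003", "license_number": "LIC-999", "total_thc": 25.0},
]
licensees = [
    {"license_number": "LIC-100", "name": "Example Farm A", "county": "King"},
    {"license_number": "LIC-200", "name": "Example Farm B", "county": "Spokane"},
]


def augment_results(results, licensees):
    """Attach licensee name and location to each lab result."""
    by_license = {row["license_number"]: row for row in licensees}
    augmented = []
    for result in results:
        # An unknown licensee yields None fields instead of dropping the result.
        licensee = by_license.get(result["license_number"], {})
        augmented.append({
            **result,
            "licensee_name": licensee.get("name"),
            "county": licensee.get("county"),
        })
    return augmented


augmented = augment_results(lab_results, licensees)
for row in augmented:
    print(row["sample_id"], row["licensee_name"], row["county"])
```

Results with no matching licensee are kept, so the sample size does not shrink while the licensee list is still incomplete.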

33
00:01:43,600 --> 00:01:45,960
So anywho, it's going to be good fun.

34
00:01:45,960 --> 00:01:47,840
But I've droned on long enough.

35
00:01:47,840 --> 00:01:49,720
So it's a meetup after all.

36
00:01:49,720 --> 00:01:54,680
So you're all welcome to bring any ideas, questions,

37
00:01:54,680 --> 00:01:59,360
research, your own projects that you want to share.

38
00:01:59,360 --> 00:02:01,000
This is a platform for everybody.

39
00:02:01,000 --> 00:02:03,640
And in the past, we've had people come and present

40
00:02:03,640 --> 00:02:06,240
their own research and talk about some

41
00:02:06,240 --> 00:02:07,320
of their own projects.

42
00:02:07,320 --> 00:02:11,400
So feel free to promote anything that you're working on.

43
00:02:11,400 --> 00:02:13,880
Give some big ups to yourselves.

44
00:02:13,880 --> 00:02:15,880
And then, yeah, feel free to share anything

45
00:02:15,880 --> 00:02:18,400
that you're interested in.

46
00:02:18,400 --> 00:02:25,680
So my intrepid co-host, Candice, who's been a cannabis data

47
00:02:25,680 --> 00:02:30,200
science team member for years now, it's lovely.

48
00:02:30,200 --> 00:02:32,560
We've really been moving the ball forward.

49
00:02:32,560 --> 00:02:35,440
So Candice, how have you been?

50
00:02:35,440 --> 00:02:40,880
And how are your data collection efforts going, say,

51
00:02:40,880 --> 00:02:41,800
in Florida?

52
00:02:41,800 --> 00:02:44,400
And anything on your mind?

53
00:02:44,400 --> 00:02:45,760
Let's see.

54
00:02:45,760 --> 00:02:51,880
I'm getting closer to getting an API, a Metrc sandbox API,

55
00:02:51,880 --> 00:02:53,320
which is key.

56
00:02:53,320 --> 00:02:56,480
And I'm going through the agreements.

57
00:02:56,480 --> 00:02:58,880
And I'm pretty excited about it.

58
00:02:58,880 --> 00:03:03,800
And that way, too, I'll be able to help cannabis businesses

59
00:03:03,800 --> 00:03:08,080
easily integrate with Metrc with the Cannlytics software

60
00:03:08,080 --> 00:03:10,800
developer kit, the Metrc software developer kit.

61
00:03:10,800 --> 00:03:12,600
So I'm pretty excited about that.

62
00:03:12,600 --> 00:03:16,680
And then, actually, too, instead of just gathering a lot

63
00:03:16,680 --> 00:03:19,160
of COAs in Florida, Keegan, I'm just

64
00:03:19,160 --> 00:03:20,280
looking for different labs.

65
00:03:20,280 --> 00:03:22,560
But I did come across a lab called

66
00:03:22,560 --> 00:03:25,280
Canlytics, which was interesting.

67
00:03:25,280 --> 00:03:29,960
And that way, too, when we start parsing them.

68
00:03:29,960 --> 00:03:34,000
Because I really would like to see CoADoc supporting

69
00:03:34,000 --> 00:03:36,880
all of the labs in Florida.

70
00:03:36,880 --> 00:03:40,360
Because the patients right now are using brute force

71
00:03:40,360 --> 00:03:44,000
spreadsheets and typing in their COA numbers.

72
00:03:44,000 --> 00:03:47,000
And I really think they would get excited.

73
00:03:47,000 --> 00:03:48,920
But I haven't had a lot of time.

74
00:03:48,920 --> 00:03:52,320
I've been working with some other stuff outside

75
00:03:52,320 --> 00:03:55,520
of cannabis data science.

76
00:03:55,520 --> 00:03:58,120
But I think that could be pretty exciting.

77
00:03:58,120 --> 00:04:01,880
So until we come up with a universal GPT script that

78
00:04:01,880 --> 00:04:04,600
will just scrape everything, I'm going

79
00:04:04,600 --> 00:04:07,960
to need to get my hands dirty and start

80
00:04:07,960 --> 00:04:13,000
PDF plumbing some COAs and try to update and add labs

81
00:04:13,000 --> 00:04:14,960
to CoADoc.

82
00:04:14,960 --> 00:04:16,200
That's it.

83
00:04:16,200 --> 00:04:18,680
I absolutely love it, Candice.

84
00:04:18,680 --> 00:04:23,200
And maybe I'm getting on in my years,

85
00:04:23,200 --> 00:04:26,920
because a younger me would have already accomplished this

86
00:04:26,920 --> 00:04:27,800
for you.

87
00:04:27,800 --> 00:04:30,440
And this is why it's an all hands on deck moment.

88
00:04:30,440 --> 00:04:31,960
There's just so much to do.

89
00:04:31,960 --> 00:04:34,280
But that's the thing we're sprinting on,

90
00:04:34,280 --> 00:04:37,240
is to start unlocking some of these data

91
00:04:37,240 --> 00:04:39,520
points for consumers.

92
00:04:39,520 --> 00:04:41,960
Lab results, they're right there.

93
00:04:41,960 --> 00:04:45,440
And as we said, as soon as the consumers get them,

94
00:04:45,440 --> 00:04:47,880
then they can augment all sorts of data,

95
00:04:47,880 --> 00:04:50,760
find out information about their products,

96
00:04:50,760 --> 00:04:54,320
and do all sorts of interesting statistics.

97
00:04:54,320 --> 00:04:57,880
And what's cool is Candice says she's learning the Metrc

98
00:04:57,880 --> 00:04:59,640
traceability system.

99
00:04:59,640 --> 00:05:03,840
We can stand on the shoulders of giants.

100
00:05:03,840 --> 00:05:06,120
I mean, I think I was seeing somewhere

101
00:05:06,120 --> 00:05:10,400
that Metrc is operating in 24 states or so.

102
00:05:10,400 --> 00:05:12,200
So they're a giant.

103
00:05:12,200 --> 00:05:17,840
And they've done the work. They have,

104
00:05:17,840 --> 00:05:26,160
what do you call them, data models for various entities

105
00:05:26,160 --> 00:05:27,680
in the cannabis space, right?

106
00:05:27,680 --> 00:05:34,200
Strain, plant, package, sale, receipt, things like that.

107
00:05:34,200 --> 00:05:40,000
Well, there's the transaction, and then there's sales items.

108
00:05:40,000 --> 00:05:43,800
And so what's cool is we're also working

109
00:05:43,800 --> 00:05:45,120
to standardize results.

110
00:05:45,120 --> 00:05:49,320
And so why not just incorporate a lot of the Metrc data

111
00:05:49,320 --> 00:05:50,400
models?

112
00:05:50,400 --> 00:05:53,920
And so basically, the idea is, oh, well,

113
00:05:53,920 --> 00:06:00,660
if somebody has, say, a cannabis receipt in California,

114
00:06:00,660 --> 00:06:05,120
one would expect that it'll have most of the standard Metrc

115
00:06:05,120 --> 00:06:10,640
receipt fields on there, or just the various fields on there,

116
00:06:10,640 --> 00:06:14,440
which, you know, like a price, item name,

117
00:06:14,440 --> 00:06:19,400
it may have a Metrc ID on the label.

118
00:06:19,400 --> 00:06:20,040
Awesome.

119
00:06:20,040 --> 00:06:25,760
And so the idea is, why not get the standard Metrc data points,

120
00:06:25,760 --> 00:06:29,080
say, from a consumer's receipt

121
00:06:29,080 --> 00:06:31,560
or the product label, or maybe they even

122
00:06:31,560 --> 00:06:33,400
have a certificate?

123
00:06:33,400 --> 00:06:34,480
Phenomenal.

124
00:06:34,480 --> 00:06:39,320
And then what we can do is do the old analytics philosophy

125
00:06:39,320 --> 00:06:42,040
is work with whatever we're given.

126
00:06:42,040 --> 00:06:44,820
So maybe they only have a couple data points.

127
00:06:44,820 --> 00:06:49,240
Maybe they just have a label that just had THC, CBD,

128
00:06:49,240 --> 00:06:52,200
but maybe it has the Metrc ID on it.

129
00:06:52,200 --> 00:06:53,040
Awesome.

130
00:06:53,040 --> 00:06:58,040
Maybe somebody else has a COA with that same Metrc ID.

131
00:06:58,040 --> 00:07:00,680
And so then we can almost connect and augment

132
00:07:00,680 --> 00:07:01,840
the data points.

133
00:07:01,840 --> 00:07:06,960
So that's the idea is just try to just rapidly collect data.

134
00:07:06,960 --> 00:07:11,040
So maybe that Metrc ID matches one of the IDs

135
00:07:11,040 --> 00:07:13,440
in some of these lab results we've already collected.

136
00:07:13,440 --> 00:07:16,800
So then all of a sudden, we can let the consumer know, oh,

137
00:07:16,800 --> 00:07:18,480
these are your THC.

138
00:07:18,480 --> 00:07:20,160
These are your terpenes.

139
00:07:20,160 --> 00:07:26,240
Maybe there's a pesticide screening, so on and so forth.
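
The linking idea can be sketched in Python roughly as follows, assuming a label and a COA both carry the same ID. The `metrc_id` key, the sample ID string, and the analyte names are hypothetical placeholders, not official Metrc field names.

```python
def merge_by_id(*sources, key="metrc_id"):
    """Combine records that share an ID; earlier sources win on conflicts."""
    merged = {}
    for records in sources:
        for record in records:
            record_id = record.get(key)
            if not record_id:
                continue  # Nothing to link on, so skip this record.
            combined = merged.setdefault(record_id, {})
            for field, value in record.items():
                # Only fill fields that are still missing.
                if value is not None and field not in combined:
                    combined[field] = value
    return merged


# Hypothetical data points: a COA from one person, a product label from another.
coas = [{
    "metrc_id": "1A4-EXAMPLE-0001",
    "total_thc": 21.8,
    "beta_myrcene": 0.45,
    "pesticides_status": "pass",
    "producer": "Example Farm A",
}]
labels = [{"metrc_id": "1A4-EXAMPLE-0001", "total_thc": 22.0, "total_cbd": 0.1}]

# COAs are passed first, so their values take precedence over label values.
products = merge_by_id(coas, labels)
print(products["1A4-EXAMPLE-0001"])
```

Passing the COAs first is a design choice in this sketch: the certificate is treated as the more detailed source, and the label only fills in what the certificate lacks.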

140
00:07:26,240 --> 00:07:30,040
You can find out the licensee, who grew it.

141
00:07:30,040 --> 00:07:32,480
Maybe you can go view their website

142
00:07:32,480 --> 00:07:36,360
and look at some of the other strains they grow.

143
00:07:36,360 --> 00:07:39,320
Once again, the sky's the limit.

144
00:07:39,320 --> 00:07:47,360
And so that's the big work underway is we're just

145
00:07:47,360 --> 00:07:51,800
coming at this from the opposite end that some other people do.

146
00:07:51,800 --> 00:07:58,920
So my philosophy is a lot of people really rely on Metrc.

147
00:07:58,920 --> 00:08:05,760
But unfortunately, sometimes the traceability system is

148
00:08:05,760 --> 00:08:09,120
almost like a game of telephone.

149
00:08:09,120 --> 00:08:13,480
So it's like we'd love to call up the lab,

150
00:08:13,480 --> 00:08:16,240
but instead, we have to go through Metrc

151
00:08:16,240 --> 00:08:18,240
and go get the lab results.

152
00:08:18,240 --> 00:08:20,960
But then the idea is why not go directly?

153
00:08:20,960 --> 00:08:24,320
So if for whatever reason, maybe the consumer

154
00:08:24,320 --> 00:08:28,280
has access to a certificate or a producer has access

155
00:08:28,280 --> 00:08:31,920
to a certificate, just go straight to the source

156
00:08:31,920 --> 00:08:34,160
and get the lab results.

157
00:08:34,160 --> 00:08:38,240
So it's far from perfect.

158
00:08:38,240 --> 00:08:40,960
But as I'll show you, as we just try

159
00:08:40,960 --> 00:08:46,520
to get our sample size large, then life will get better.

160
00:08:46,520 --> 00:08:48,080
So that's what we're working on.

161
00:08:48,080 --> 00:08:50,040
It's really cool.

162
00:08:50,040 --> 00:08:53,120
But I'm talking too long.

163
00:08:53,120 --> 00:08:55,280
Have you had a thought on that question, please?

164
00:08:55,280 --> 00:09:00,480
No, I was just going to say I need to be careful, too,

165
00:09:00,480 --> 00:09:04,280
though, because just because I have a Metrc API,

166
00:09:04,280 --> 00:09:07,600
I can't be scraping that data and publishing that publicly.

167
00:09:07,600 --> 00:09:12,120
I need to still go for public data sets,

168
00:09:12,120 --> 00:09:15,960
asking state governments to please give me

169
00:09:15,960 --> 00:09:17,920
this public data set.

170
00:09:17,920 --> 00:09:20,200
But it would be kind of cool if patients,

171
00:09:20,200 --> 00:09:22,320
because I'm telling you, they're all doing it.

172
00:09:22,320 --> 00:09:24,000
They're going in spreadsheets.

173
00:09:24,000 --> 00:09:25,320
They're putting them in the graphs.

174
00:09:25,320 --> 00:09:26,400
They're plotting them.

175
00:09:26,400 --> 00:09:30,440
And I don't know if anybody's thinking about it.

176
00:09:30,440 --> 00:09:33,000
But I think it's so cool what you're doing, Keegan.

177
00:09:33,000 --> 00:09:34,160
And it's my fault, too.

178
00:09:34,160 --> 00:09:38,160
I didn't do any parsing coding at all this week.

179
00:09:38,160 --> 00:09:40,120
So bad.

180
00:09:40,120 --> 00:09:42,240
One, don't blame yourself.

181
00:09:42,240 --> 00:09:43,880
We're the turtle after all.

182
00:09:43,880 --> 00:09:46,200
And I'll let you speak in one second, Rick.

183
00:09:46,200 --> 00:09:49,240
And then two, you approach things ethically,

184
00:09:49,240 --> 00:09:53,480
slow and steady, following the rules, wins the race.

185
00:09:53,480 --> 00:09:59,880
And then three, we'll just get the results

186
00:09:59,880 --> 00:10:01,240
wherever they may be.

187
00:10:01,240 --> 00:10:06,720
And so say we can't get them from the producer or the lab.

188
00:10:06,720 --> 00:10:09,680
If the consumer has a right to have them

189
00:10:09,680 --> 00:10:14,080
and they want to share, then that works for us.

190
00:10:14,080 --> 00:10:18,080
So Rick, what are your thoughts?

191
00:10:18,080 --> 00:10:22,640
So when Candice mentioned scraping the data from Metrc

192
00:10:22,640 --> 00:10:25,800
and being careful, I just thought,

193
00:10:25,800 --> 00:10:27,560
I don't know too much about that.

194
00:10:27,560 --> 00:10:30,040
Is any of that considered PHI?

195
00:10:30,040 --> 00:10:31,400
Or is that state to state?

196
00:10:31,400 --> 00:10:33,120
Is that federal?

197
00:10:33,120 --> 00:10:35,920
I'm not sure what type of info, personal info,

198
00:10:35,920 --> 00:10:39,440
is included if you were to scrape that.

199
00:10:39,440 --> 00:10:42,280
Well, I'm not scraping the Metrc API.

200
00:10:42,280 --> 00:10:44,800
What I'm doing with that key is I'm

201
00:10:44,800 --> 00:10:47,760
helping cannabis businesses that want to integrate

202
00:10:47,760 --> 00:10:50,360
with the Metrc system.

203
00:10:50,360 --> 00:10:54,120
And I'm only getting authorized right now in Massachusetts.

204
00:10:54,120 --> 00:10:56,720
But then myself, as a software consultant,

205
00:10:56,720 --> 00:10:58,840
I can help people integrate.

206
00:10:58,840 --> 00:11:02,680
And I'll be using the Cannlytics Metrc software developer

207
00:11:02,680 --> 00:11:06,040
kit to help me do that.

208
00:11:06,040 --> 00:11:10,360
But I still don't have a key, so I haven't learned Metrc yet.

209
00:11:10,360 --> 00:11:14,480
But I won't be scraping any data from Metrc, ever,

210
00:11:14,480 --> 00:11:19,760
because I have signed agreements too that that's private data.

211
00:11:19,760 --> 00:11:23,840
But there are public data sets available

212
00:11:23,840 --> 00:11:25,320
being published by states.

213
00:11:25,320 --> 00:11:27,360
That's data I'm scraping.

214
00:11:27,360 --> 00:11:30,200
And then also too, I'm in a Facebook.

215
00:11:30,200 --> 00:11:32,320
You see, I'm a medical marijuana patient.

216
00:11:32,320 --> 00:11:37,360
I started late in my age, and I do it medically.

217
00:11:37,360 --> 00:11:40,920
And I don't take any pharmaceuticals.

218
00:11:40,920 --> 00:11:44,680
And I used to be prescribed like benzos, Percocets,

219
00:11:44,680 --> 00:11:48,440
and Artane, a Parkinson's pill, daily for a long time.

220
00:11:48,440 --> 00:11:50,120
And it was affecting my health.

221
00:11:50,120 --> 00:11:51,600
But anyway, I digress.

222
00:11:51,600 --> 00:11:53,600
But the thing is that so anyway, I'm

223
00:11:53,600 --> 00:11:56,560
on these medical marijuana Facebook groups.

224
00:11:56,560 --> 00:12:01,080
And we kind of talk because none of my friends smoke pot still

225
00:12:01,080 --> 00:12:03,000
because I didn't do it till late in life.

226
00:12:03,000 --> 00:12:06,840
My friends just don't smoke it or do it.

227
00:12:06,840 --> 00:12:11,280
And but they're publishing COAs and batch numbers.

228
00:12:11,280 --> 00:12:13,080
So we have like Jungle Boys.

229
00:12:13,080 --> 00:12:14,680
And we have like Trulieve.

230
00:12:14,680 --> 00:12:18,080
In fact, Trulieve now, we can actually turn around.

231
00:12:18,080 --> 00:12:21,200
And I can just download COAs as a patient.

232
00:12:21,200 --> 00:12:24,840
Now see, as far as just going onto a website

233
00:12:24,840 --> 00:12:26,920
and scraping it, you have to look at agreements.

234
00:12:26,920 --> 00:12:30,280
You can't just be willy-nilly pulling data

235
00:12:30,280 --> 00:12:31,640
off the internet.

236
00:12:31,640 --> 00:12:34,160
And but we have, well, not.

237
00:12:34,160 --> 00:12:38,680
Well, Keegan, Cannlytics has had some labs

238
00:12:38,680 --> 00:12:41,640
that want transparency.

239
00:12:41,640 --> 00:12:47,200
And they've given him data or have allowed us to scrape data.

240
00:12:47,200 --> 00:12:50,440
And then also too, sometimes labs

241
00:12:50,440 --> 00:12:53,720
might want you as a software developer

242
00:12:53,720 --> 00:12:59,040
to run code to help them quantify, verify

243
00:12:59,040 --> 00:13:01,080
that their data is good.

244
00:13:01,080 --> 00:13:04,880
Whereas the Cannlytics code could pick some outliers

245
00:13:04,880 --> 00:13:06,160
and anomalies.

246
00:13:06,160 --> 00:13:07,600
But anyway, sorry, Keegan.

247
00:13:07,600 --> 00:13:09,120
I'm talking to you.

248
00:13:09,120 --> 00:13:11,920
You made a lot of good points there, Candice.

249
00:13:11,920 --> 00:13:15,640
And the way I would say it is, yes, of course, definitely

250
00:13:15,640 --> 00:13:16,520
read the terms.

251
00:13:16,520 --> 00:13:21,680
And especially for Metrc, they're very explicit.

252
00:13:21,680 --> 00:13:23,840
Data in Metrc is Metrc's data.

253
00:13:23,840 --> 00:13:35,560
And in fact, yeah.

254
00:13:35,560 --> 00:13:37,920
Yeah, I'm just wondering how much value you can extract out

255
00:13:37,920 --> 00:13:39,160
of it if you can't use it.

256
00:13:39,160 --> 00:13:42,560
So when you say it's Metrc's data,

257
00:13:42,560 --> 00:13:49,880
if you have a license to use, can you house on-site data

258
00:13:49,880 --> 00:13:53,320
and use it to run your own analytics?

259
00:13:53,320 --> 00:13:54,120
Is that a?

260
00:13:54,120 --> 00:13:54,840
No, no, no.

261
00:13:54,840 --> 00:13:58,600
See, Rick, it's like an apple and an orange.

262
00:13:58,600 --> 00:14:03,040
So Metrc, why I'm pursuing the Metrc API,

263
00:14:03,040 --> 00:14:08,120
is so that I can help cannabis resellers, dispensaries,

264
00:14:08,120 --> 00:14:10,240
connect into the Metrc system.

265
00:14:10,240 --> 00:14:12,360
That has nothing to do, that apple

266
00:14:12,360 --> 00:14:16,840
has nothing to do with the collection of data.

267
00:14:16,840 --> 00:14:20,400
Because I won't be collecting any data for myself

268
00:14:20,400 --> 00:14:22,680
from the Metrc API.

269
00:14:22,680 --> 00:14:24,560
But go ahead, Keegan.

270
00:14:24,560 --> 00:14:27,000
OK, so I'm just going to share my thoughts.

271
00:14:27,000 --> 00:14:28,640
And once again, not a lawyer.

272
00:14:28,640 --> 00:14:32,160
So as I said, this is really dicey territory.

273
00:14:32,160 --> 00:14:34,760
So definitely check with your lawyer.

274
00:14:34,760 --> 00:14:37,120
And in fact, I was reading somewhere, or I

275
00:14:37,120 --> 00:14:40,120
was hearing somewhere, that data is now

276
00:14:40,120 --> 00:14:43,040
like the digital oil or digital gold.

277
00:14:43,040 --> 00:14:45,200
So it's incredibly valuable.

278
00:14:45,200 --> 00:14:46,920
We've known that for a while.

279
00:14:46,920 --> 00:14:48,960
But now, especially with the rise

280
00:14:48,960 --> 00:14:52,040
of the AI statistical models, yeah,

281
00:14:52,040 --> 00:14:54,360
it's really driven the point home.

282
00:14:54,360 --> 00:14:59,520
So I think a lot of this is still kind of getting hashed out,

283
00:14:59,520 --> 00:15:01,760
sometimes literally hashed out.

284
00:15:01,760 --> 00:15:07,880
But this is my understanding: anything in Metrc's database

285
00:15:07,880 --> 00:15:09,480
is their data.

286
00:15:09,480 --> 00:15:13,360
If you sign a terms of service to use their API,

287
00:15:13,360 --> 00:15:15,760
from my understanding, you're strictly

288
00:15:15,760 --> 00:15:19,640
only supposed to do that for traceability purposes.

289
00:15:19,640 --> 00:15:21,600
So it's a traceability system.

290
00:15:21,600 --> 00:15:24,960
So they want to know how many plants you have in the ground.

291
00:15:24,960 --> 00:15:26,720
They want to know how much you harvested.

292
00:15:26,720 --> 00:15:29,560
They want to know how much you sold.

293
00:15:29,560 --> 00:15:30,720
That's it.

294
00:15:30,720 --> 00:15:33,600
So from my understanding, that's all you're really

295
00:15:33,600 --> 00:15:37,120
allowed to do with the API.

296
00:15:37,120 --> 00:15:41,120
Things get weird because it's like, well, companies

297
00:15:41,120 --> 00:15:47,000
have their own inventory systems or sales systems.

298
00:15:47,000 --> 00:15:52,320
So retailers keeping track of what products they sold,

299
00:15:52,320 --> 00:15:55,080
what time they sold it at, how much they sold it at.

300
00:15:55,080 --> 00:16:01,640
So to a certain extent, the company has their own data.

301
00:16:01,640 --> 00:16:04,120
And so it's weird because it's like, well,

302
00:16:04,120 --> 00:16:07,560
if they then put that data into Metrc,

303
00:16:07,560 --> 00:16:10,440
then maybe that's the sole place they're putting it.

304
00:16:10,440 --> 00:16:12,080
Is that still their data?

305
00:16:12,080 --> 00:16:17,200
And my point of view is, well, that's Metrc's data.

306
00:16:17,200 --> 00:16:23,240
So maybe the company could maintain their own database,

307
00:16:23,240 --> 00:16:28,200
their own data, and do analytics on that.

308
00:16:28,200 --> 00:16:30,080
Not really clear.

309
00:16:30,080 --> 00:16:34,760
From my understanding, there are analytics companies out there

310
00:16:34,760 --> 00:16:37,800
that say, hey, we'll integrate with Metrc.

311
00:16:37,800 --> 00:16:41,640
Put in our API key, and we'll give you analytics.

312
00:16:41,640 --> 00:16:46,480
I don't know about the legality or if that follows Metrc's

313
00:16:46,480 --> 00:16:52,880
terms of service because that analytics company may

314
00:16:52,880 --> 00:16:57,480
be reading in data from the Metrc API,

315
00:16:57,480 --> 00:17:02,640
calculating statistics, and then distributing those statistics

316
00:17:02,640 --> 00:17:04,600
likely for profit.

317
00:17:04,600 --> 00:17:10,120
And that's not really a needed step in traceability.

318
00:17:10,120 --> 00:17:14,280
And so to me, it's not real clear

319
00:17:14,280 --> 00:17:16,920
if that's a valid use of the data.

320
00:17:16,920 --> 00:17:19,440
So I've just stayed away from that.

321
00:17:19,440 --> 00:17:22,720
Cannlytics has pursued getting integrated with Metrc

322
00:17:22,720 --> 00:17:25,360
because I want to understand the system

323
00:17:25,360 --> 00:17:28,760
and potentially help people out if they're, say,

324
00:17:28,760 --> 00:17:33,160
looking to automate some of their workflow.

325
00:17:33,160 --> 00:17:37,280
So I think it's a useful tool to have in your tool belt

326
00:17:37,280 --> 00:17:39,280
to know the Metrc API.

327
00:17:39,280 --> 00:17:42,640
But personally, I don't know.

328
00:17:42,640 --> 00:17:46,080
I'm leaving that to some of the other analytics companies

329
00:17:46,080 --> 00:17:51,480
that want to do statistics on Metrc data

330
00:17:51,480 --> 00:17:54,080
for private companies.

331
00:17:54,080 --> 00:17:57,800
Once again, it may all be above ground.

332
00:17:57,800 --> 00:18:00,800
It may all be permitted because, as I said,

333
00:18:00,800 --> 00:18:03,640
to a certain extent, the companies

334
00:18:03,640 --> 00:18:06,800
have their own data, right?

335
00:18:06,800 --> 00:18:11,720
They may have their own database of products and lab results

336
00:18:11,720 --> 00:18:14,600
and sales.

337
00:18:14,600 --> 00:18:17,960
But I think if I was running a company,

338
00:18:17,960 --> 00:18:24,000
I would make real certain to have my own private database

339
00:18:24,000 --> 00:18:26,400
with my own private data that's almost

340
00:18:26,400 --> 00:18:28,360
separate from traceability.

341
00:18:28,360 --> 00:18:31,840
And then you just go through traceability

342
00:18:31,840 --> 00:18:33,680
for compliance purposes.

343
00:18:33,680 --> 00:18:35,440
So it's like, OK, I made a sale.

344
00:18:35,440 --> 00:18:38,040
I need to record that in the compliance system.

345
00:18:38,040 --> 00:18:41,560
I'll put some of my sales data into Metrc.

346
00:18:41,560 --> 00:18:43,280
But this is still my sales data.

347
00:18:43,280 --> 00:18:46,320
And then, yes, if I wanted to share my personal data

348
00:18:46,320 --> 00:18:48,680
from my database with an analytics company

349
00:18:48,680 --> 00:18:53,240
to do statistics, then I think that would be reasonable.

350
00:18:53,240 --> 00:18:55,240
But once again, not a lawyer.

351
00:18:55,240 --> 00:19:01,240
Don't press me because, as I said at the beginning,

352
00:19:01,240 --> 00:19:07,440
the data landscape, to me, is not really well-defined.

353
00:19:07,440 --> 00:19:11,240
And as I said, there's tons of analytics companies out there.

354
00:19:11,240 --> 00:19:14,640
I don't really know how they operate.

355
00:19:14,640 --> 00:19:20,040
But the way I'm approaching it is just saying, OK,

356
00:19:20,040 --> 00:19:21,520
that's a little murky.

357
00:19:21,520 --> 00:19:27,800
And as Candice says, let's just go for open data sets

358
00:19:27,800 --> 00:19:32,600
that we know are public, we know we're allowed to work with.

359
00:19:32,600 --> 00:19:34,960
And sometimes we have to limit ourselves.

360
00:19:34,960 --> 00:19:39,480
We can't just put our paws all over everybody's stuff.

361
00:19:39,480 --> 00:19:41,040
But if it's in the public domain,

362
00:19:41,040 --> 00:19:43,520
so for example, the Washington results,

363
00:19:43,520 --> 00:19:48,040
I think MCR Labs is publishing results on the web.

364
00:19:48,040 --> 00:19:50,040
In the past, we've collected results.

365
00:19:50,040 --> 00:19:52,600
We've looked at PSI Labs.

366
00:19:52,600 --> 00:19:57,000
And then in the past, SC Labs has published lab results.

367
00:19:57,000 --> 00:19:59,080
And then what was our final source?

368
00:19:59,080 --> 00:20:00,720
Oh, yeah, Connecticut.

369
00:20:00,720 --> 00:20:05,760
And then in Connecticut, they have a public registry

370
00:20:05,760 --> 00:20:08,480
of products and their results.

371
00:20:08,480 --> 00:20:11,280
So those are public avenues.

372
00:20:11,280 --> 00:20:14,200
And then as Candice was saying, in some states,

373
00:20:14,200 --> 00:20:17,560
people are allowed to get certificates.

374
00:20:17,560 --> 00:20:24,640
And of course, I realized if you make a purchase,

375
00:20:24,640 --> 00:20:28,560
your sales receipt is technically in Metrc.

376
00:20:28,560 --> 00:20:33,160
Well, that's going to be added up into the sales total.

377
00:20:33,160 --> 00:20:36,520
And so that's why I was saying, if we're creative,

378
00:20:36,520 --> 00:20:41,520
we can still get to the data in a legitimate manner.

379
00:20:41,520 --> 00:20:45,560
So it's like, OK, you can't get sales data directly

380
00:20:45,560 --> 00:20:50,120
out of the Metrc API just willy-nilly.

381
00:20:50,120 --> 00:20:55,560
So well, if a consumer purchased a product in Florida,

382
00:20:55,560 --> 00:20:57,160
they'll have a receipt.

383
00:20:57,160 --> 00:21:00,120
You can take a picture of their receipt.

384
00:21:00,120 --> 00:21:02,800
And I've been looking at these receipts.

385
00:21:02,800 --> 00:21:05,480
And they have a surprisingly large number

386
00:21:05,480 --> 00:21:09,800
of the exact same data points that are in the Metrc API.

387
00:21:09,800 --> 00:21:14,440
And that's why I said it's so useful to know the data models.

388
00:21:14,440 --> 00:21:17,080
Because on the receipt, you may have things

389
00:21:17,080 --> 00:21:22,840
like date, time sold, or I forget the official key,

390
00:21:22,840 --> 00:21:28,360
but time sold, price, sales tax.

391
00:21:28,360 --> 00:21:30,640
You may even have the ID.

392
00:21:30,640 --> 00:21:34,520
And so the idea is just to get as much data

393
00:21:34,520 --> 00:21:38,560
as you can through that avenue.
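
A minimal sketch of a receipt data model in Python, assuming the fields have already been read off a receipt photo into a dictionary. The class, the input keys (`item`, `date`, `price`, `tax`, `id`), and the output field names are hypothetical; they are not the official Metrc keys, which are not given here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReceiptItem:
    """Fields a receipt might yield; names are illustrative only."""
    product_name: Optional[str] = None
    date_sold: Optional[str] = None
    price: Optional[float] = None
    sales_tax: Optional[float] = None
    package_id: Optional[str] = None


def to_float(value) -> Optional[float]:
    """Turn text such as "$35.00" into 35.0; return None if unreadable."""
    if value is None:
        return None
    try:
        return float(str(value).replace("$", "").replace(",", "").strip())
    except ValueError:
        return None


def parse_receipt_item(raw: dict) -> ReceiptItem:
    """Keep whatever fields are present and leave the rest as None."""
    return ReceiptItem(
        product_name=raw.get("item"),
        date_sold=raw.get("date"),
        price=to_float(raw.get("price")),
        sales_tax=to_float(raw.get("tax")),
        package_id=raw.get("id"),
    )


# A receipt with only a few readable fields still produces a usable record.
item = parse_receipt_item(
    {"item": "Example Flower 3.5g", "price": "$35.00", "tax": "2.10"}
)
print(item)
```

Every field is optional on purpose, which matches the idea of working with whatever data points a given receipt happens to have.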

394
00:21:38,560 --> 00:21:41,520
So I don't know.

395
00:21:41,520 --> 00:21:43,880
I'm maybe talking too long.

396
00:21:43,880 --> 00:21:45,920
But there's more to come in that regard.

397
00:21:45,920 --> 00:21:47,040
But Rick?

398
00:21:47,040 --> 00:21:48,040
No, that was great.

399
00:21:48,040 --> 00:21:49,680
That was a great explanation.

400
00:21:49,680 --> 00:21:51,680
So you gave me a lot of info.

401
00:21:51,680 --> 00:21:55,880
I just didn't know a lot about the Metrc system.

402
00:21:55,880 --> 00:21:57,320
I knew about the public data sets,

403
00:21:57,320 --> 00:21:59,840
which I found very, very valuable.

404
00:21:59,840 --> 00:22:03,200
Seems like there's just a problem unifying the data.

405
00:22:03,200 --> 00:22:08,400
And it takes a lot of groundwork to go and tie it all together,

406
00:22:08,400 --> 00:22:10,520
which is understandable given where we're at.

407
00:22:10,520 --> 00:22:12,440
And hopefully, that changes.

408
00:22:12,440 --> 00:22:15,840
So thank you.

409
00:22:15,840 --> 00:22:16,600
Phenomenal, Rick.

410
00:22:16,600 --> 00:22:20,840
And just wait till you see what's in store today.

411
00:22:20,840 --> 00:22:23,080
But some of that.

412
00:22:23,080 --> 00:22:24,400
So too cool.

413
00:22:24,400 --> 00:22:28,040
And once again, that's my perspective.

414
00:22:28,040 --> 00:22:31,280
As I said, there are other companies

415
00:22:31,280 --> 00:22:33,760
that do data analytics.

416
00:22:33,760 --> 00:22:36,640
So some of them are quite nice.

417
00:22:36,640 --> 00:22:38,320
And they've got some good representatives.

418
00:22:38,320 --> 00:22:39,760
So feel free to reach out.

419
00:22:39,760 --> 00:22:42,560
I can even get you a contact.

420
00:22:42,560 --> 00:22:45,760
And you can pick their brains.

421
00:22:45,760 --> 00:22:49,200
But this is the avenue we're pursuing.

422
00:22:49,200 --> 00:22:52,240
But as I'll show you, it can actually be quite fruitful.

423
00:22:52,240 --> 00:22:58,040
Because our other philosophy is just go ahead and don't wait.

424
00:22:58,040 --> 00:22:59,440
Just go get it.

425
00:22:59,440 --> 00:23:03,200
So it's like we can't just wait around for lab results

426
00:23:03,200 --> 00:23:04,840
to fall into our lap.

427
00:23:04,840 --> 00:23:07,120
So we'll just go get what we can.

428
00:23:07,120 --> 00:23:09,280
And also, I'll show you today.

429
00:23:09,280 --> 00:23:10,320
We got a fair amount.

430
00:23:13,200 --> 00:23:14,560
But I've been talking a lot.

431
00:23:14,560 --> 00:23:16,680
Edwin, any thoughts, comments, questions?

432
00:23:19,720 --> 00:23:21,480
I'm not sure if I've seen you before.

433
00:23:21,480 --> 00:23:23,600
But you're also welcome to introduce yourself.

434
00:23:23,600 --> 00:23:25,640
So sorry if I don't remember you.

435
00:23:25,640 --> 00:23:28,520
You seem a little familiar.

436
00:23:28,520 --> 00:23:32,280
No, this is actually my first time in the group.

437
00:23:32,280 --> 00:23:34,160
So hi, everybody.

438
00:23:34,160 --> 00:23:39,920
I'm a, I guess, budding data scientist for a little, little

439
00:23:39,920 --> 00:23:40,520
pun there.

440
00:23:40,520 --> 00:23:44,440
But I actually have very little experience

441
00:23:44,440 --> 00:23:45,560
in the data science world.

442
00:23:45,560 --> 00:23:50,320
I'm currently a college grad with a degree in biology.

443
00:23:50,320 --> 00:23:55,120
I've just started a data science certification program,

444
00:23:55,120 --> 00:23:57,320
because that's where I want to head in life.

445
00:23:57,320 --> 00:24:02,920
And one of the things that they suggest that we do

446
00:24:02,920 --> 00:24:05,720
is look around for meetups to kind of see

447
00:24:05,720 --> 00:24:07,600
what it's like in spaces like these.

448
00:24:07,600 --> 00:24:12,400
And I am an occasional user of cannabis products.

449
00:24:12,400 --> 00:24:17,080
And I wanted to see where data science is being,

450
00:24:17,080 --> 00:24:19,000
how it's being used in this space.

451
00:24:19,000 --> 00:24:21,040
And it's honestly really cool.

452
00:24:21,040 --> 00:24:23,280
And it's really interesting to hear

453
00:24:23,280 --> 00:24:24,600
what you guys are talking about.

454
00:24:24,600 --> 00:24:27,960
And it seems like a lot of the things so far,

455
00:24:27,960 --> 00:24:30,600
a lot of our concerns have been focused

456
00:24:30,600 --> 00:24:33,400
around procuring data.

457
00:24:33,400 --> 00:24:36,440
Am I correct in that?

458
00:24:36,440 --> 00:24:38,720
Or is that, like, what are some of the challenges

459
00:24:38,720 --> 00:24:42,240
that you guys are facing in using data

460
00:24:42,240 --> 00:24:44,960
science in this industry?

461
00:24:44,960 --> 00:24:45,600
Just wait.

462
00:24:45,600 --> 00:24:49,080
You're going to be in for a super, super good treat,

463
00:24:49,080 --> 00:24:52,760
because I've got a full demonstration about what we're

464
00:24:52,760 --> 00:24:55,480
doing and how we go about it.

465
00:24:55,480 --> 00:24:57,000
Exactly.

466
00:24:57,000 --> 00:24:59,000
Almost the, right.

467
00:24:59,000 --> 00:25:02,720
So we've been at this group for a little over two years.

468
00:25:02,720 --> 00:25:06,520
And almost the solid first year was

469
00:25:06,520 --> 00:25:09,280
spent just collecting the data.

470
00:25:09,280 --> 00:25:11,960
And then the second year was spent just

471
00:25:11,960 --> 00:25:15,520
trying to standardize it so that way we could just calculate

472
00:25:15,520 --> 00:25:19,400
the most basic of statistics.

473
00:25:19,400 --> 00:25:25,200
You can't calculate the mean until you can add everything up

474
00:25:25,200 --> 00:25:26,240
correctly.

475
00:25:26,240 --> 00:25:28,920
And so it's like you're adding things up.

476
00:25:28,920 --> 00:25:32,960
And then all of a sudden, you get numbers, numbers, numbers.

477
00:25:32,960 --> 00:25:36,520
And then you've got less than 0.01.

478
00:25:36,520 --> 00:25:39,040
How do you handle that?

479
00:25:39,040 --> 00:25:42,240
So that's what we spent almost the whole second year doing,

480
00:25:42,240 --> 00:25:45,480
was just figuring out how we could actually

481
00:25:45,480 --> 00:25:49,680
run basic statistics with the data.
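
A minimal sketch of the "less than 0.01" problem described above. The function name `parse_result` and the three policies (treat as zero, half the limit, or missing) are assumptions for illustration, not the group's actual code:

```python
from typing import Optional


def parse_result(value: object, below_limit: str = "zero") -> Optional[float]:
    """Convert a raw lab value such as 12.3, "0.45", or "<0.01" to a float.

    below_limit is an assumed policy for values reported as "<x":
      "zero" -> 0.0, "half" -> x / 2, anything else -> None (missing).
    """
    if value is None:
        return None
    text = str(value).strip().rstrip("%").strip()
    if text.startswith("<"):
        try:
            limit = float(text[1:].strip())
        except ValueError:
            return None
        if below_limit == "zero":
            return 0.0
        if below_limit == "half":
            return limit / 2
        return None
    try:
        return float(text)
    except ValueError:
        return None  # e.g. "ND" or an empty string


# Illustrative raw values, not real lab results.
raw = ["21.4", "<0.01", "0.35", "ND"]
values = [parse_result(v) for v in raw]
observed = [v for v in values if v is not None]
mean_value = sum(observed) / len(observed)
```

Whichever policy is picked changes the mean, so it is worth recording the choice alongside the statistic.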

482
00:25:49,680 --> 00:25:53,400
And then we got a little beyond basic statistics

483
00:25:53,400 --> 00:25:57,480
and did some pretty cool modeling.

484
00:25:57,480 --> 00:25:59,440
And then I think, at least for me,

485
00:25:59,440 --> 00:26:02,840
I reached the limitation of my knowledge real quickly,

486
00:26:02,840 --> 00:26:05,760
especially in the recent months with just

487
00:26:05,760 --> 00:26:09,960
the rapid developments in AI and statistics.

488
00:26:09,960 --> 00:26:14,440
So I'm just sprinting to stay at the back of the pack.

489
00:26:14,440 --> 00:26:18,400
But I'm having fun.

490
00:26:18,400 --> 00:26:21,040
But anywho, let's go ahead and start pretty soon.

491
00:26:21,040 --> 00:26:23,960
I'm going to go ahead and share my screen.

492
00:26:23,960 --> 00:26:28,120
And you're going to be in for a treat today.

493
00:26:28,120 --> 00:26:32,600
So I'm just going to go ahead and get all this data loading

494
00:26:32,600 --> 00:26:33,840
while we talk about it.

495
00:26:36,720 --> 00:26:38,920
And so what do we have here?

496
00:26:38,920 --> 00:26:44,880
So we've collected data from five different states

497
00:26:44,880 --> 00:26:46,280
for lab results.

498
00:26:46,280 --> 00:26:49,120
As I said, unfortunately, I don't

499
00:26:49,120 --> 00:26:51,960
think they publish their lab results anymore.

500
00:26:51,960 --> 00:26:55,400
And then I missed about 1,000 or so of them.

501
00:26:55,400 --> 00:26:59,880
But we had collected around 6,000 or so lab results

502
00:26:59,880 --> 00:27:03,400
from SC Labs in California.

503
00:27:03,400 --> 00:27:07,360
And as I said, we've got lab results from Connecticut.

504
00:27:07,360 --> 00:27:11,760
And see here, I'm just doing a little last minute data

505
00:27:11,760 --> 00:27:12,240
cleaning.

506
00:27:15,120 --> 00:27:18,120
That's another thing is I try to say standardization

507
00:27:18,120 --> 00:27:20,080
over cleaning these days.

508
00:27:20,080 --> 00:27:26,360
Because data cleaning almost implies the data's dirty

509
00:27:26,360 --> 00:27:30,120
and you're manipulating it.

510
00:27:30,120 --> 00:27:31,800
And so I don't know.

511
00:27:31,800 --> 00:27:36,520
I don't feel like... I tend to think

512
00:27:36,520 --> 00:27:39,520
it may hurt your credibility to use words like data

513
00:27:39,520 --> 00:27:44,600
cleaning and data manipulation, even though that

514
00:27:44,600 --> 00:27:48,280
may be reasonable jargon.

515
00:27:48,280 --> 00:27:54,680
But anywho, we're standardizing some of these fields

516
00:27:54,680 --> 00:27:58,160
we have for the various states.

517
00:27:58,160 --> 00:28:03,600
Some states are already a bit more standardized than others.

518
00:28:03,600 --> 00:28:08,480
So I'll have to share with you some prior meetups where

519
00:28:08,480 --> 00:28:11,560
we went into depth collecting these.

520
00:28:11,560 --> 00:28:14,200
And so this is where I was saying all of these

521
00:28:14,200 --> 00:28:17,280
were journeys.

522
00:28:17,280 --> 00:28:21,720
It was an expedition to get all these lab results

523
00:28:21,720 --> 00:28:25,080
from California and Michigan.

524
00:28:25,080 --> 00:28:28,280
The Washington ones took a long time

525
00:28:28,280 --> 00:28:31,880
because there's a ton of data augmentation

526
00:28:31,880 --> 00:28:39,240
where we have to match lab results to inventory

527
00:28:39,240 --> 00:28:42,400
all the way to the plant strain.

528
00:28:42,400 --> 00:28:45,360
So that one's a little complicated.
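
The matching described above can be sketched as a chain of joins. The tables and column names here (`inventory_id`, `strain_id`, and so on) are hypothetical and only illustrate the shape of the augmentation:

```python
import pandas as pd

# Hypothetical tables and column names, for illustration only.
lab_results = pd.DataFrame({
    "lab_result_id": ["LR1", "LR2"],
    "inventory_id": ["INV1", "INV2"],
    "total_thc": [21.4, 18.9],
})
inventory = pd.DataFrame({
    "inventory_id": ["INV1", "INV2"],
    "strain_id": ["S1", "S2"],
})
strains = pd.DataFrame({
    "strain_id": ["S1", "S2"],
    "strain_name": ["Durban Poison", "Lemon Thai"],
})

# Left joins keep every lab result, even when a match is missing upstream.
augmented = (
    lab_results
    .merge(inventory, on="inventory_id", how="left")
    .merge(strains, on="strain_id", how="left")
)
```

Left joins are used so that an unmatched lab result shows up with a missing strain name rather than silently disappearing.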

529
00:28:45,360 --> 00:28:48,000
And then the Massachusetts one, that's

530
00:28:48,000 --> 00:28:50,280
where we're getting results from MCR Labs.

531
00:28:53,800 --> 00:28:59,520
And once again, a little more, actually more than a little.

532
00:28:59,520 --> 00:29:02,240
So I'm just going to go ahead and run all of this

533
00:29:02,240 --> 00:29:07,480
while I talk because my philosophy is the code is

534
00:29:07,480 --> 00:29:11,520
a little less interesting than the data,

535
00:29:11,520 --> 00:29:14,320
and in particular, the visualizations.

536
00:29:14,320 --> 00:29:16,160
So I'm just going to rush through this as quick

537
00:29:16,160 --> 00:29:19,520
as possible to get to the visualizations.

538
00:29:19,520 --> 00:29:21,520
But I'll at least talk about what I'm doing.

539
00:29:21,520 --> 00:29:25,120
We've got a sizable amount of lab results for each state.

540
00:29:25,120 --> 00:29:29,120
And I'm just making sure we have all of the same data points.

541
00:29:29,120 --> 00:29:34,200
So I'm just making sure for each observation,

542
00:29:34,200 --> 00:29:39,200
which is a cannabis product that was sampled and sent

543
00:29:39,200 --> 00:29:41,800
to a laboratory to get tested.

544
00:29:41,800 --> 00:29:46,800
So these are lab results for some cannabis product that

545
00:29:46,800 --> 00:29:50,480
was sold at some retail dispensary.

546
00:29:50,480 --> 00:29:55,920
And so we're keeping track of the THC content and the CBD

547
00:29:55,920 --> 00:29:58,040
content.

548
00:29:58,040 --> 00:30:00,920
And then just for simplicity, I'm

549
00:30:00,920 --> 00:30:08,880
just going to use total THC and total CBD for today.

550
00:30:08,880 --> 00:30:12,080
And then where possible, I'm going

551
00:30:12,080 --> 00:30:20,000
to be standardizing these top, I think, five or six terpenes

552
00:30:20,000 --> 00:30:22,720
that we've looked at in the past.
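
A minimal sketch of standardizing fields across states so every observation has the same data points. The raw column names and the `FIELD_MAPS` dictionary are assumptions; the real state files differ:

```python
import pandas as pd

# Hypothetical raw column names per state, mapped to shared names.
FIELD_MAPS = {
    "ca": {"product": "product_name", "thc_total": "total_thc",
           "cbd_total": "total_cbd"},
    "ma": {"sample_name": "product_name", "Total THC": "total_thc",
           "Total CBD": "total_cbd", "Beta-Myrcene": "beta_myrcene"},
}
STANDARD_FIELDS = ["state", "product_name", "total_thc", "total_cbd",
                   "beta_myrcene"]


def standardize(frame: pd.DataFrame, state: str) -> pd.DataFrame:
    """Rename a state's columns and keep only the shared fields."""
    out = frame.rename(columns=FIELD_MAPS[state]).copy()
    out["state"] = state
    # reindex drops extra columns and adds any missing ones as NaN.
    return out.reindex(columns=STANDARD_FIELDS)


# Illustrative rows, not real lab results.
ca = pd.DataFrame({"product": ["Lemon Thai"], "thc_total": [22.1],
                   "cbd_total": [0.1], "beta_myrcene": [0.4]})
ma = pd.DataFrame({"sample_name": ["Durban Poison"], "Total THC": [19.5],
                   "Total CBD": [0.05], "Beta-Myrcene": [0.3]})

results = pd.concat(
    [standardize(ca, "ca"), standardize(ma, "ma")],
    ignore_index=True,
)
```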

553
00:30:22,720 --> 00:30:28,880
And so there are maybe anywhere from a dozen to two dozen

554
00:30:28,880 --> 00:30:33,820
or more terpenes that are tested on a regular basis

555
00:30:33,820 --> 00:30:35,040
in the plants.

556
00:30:35,040 --> 00:30:37,640
And as we discovered in, remember

557
00:30:37,640 --> 00:30:41,040
when we talked about what makes a cheese a cheese,

558
00:30:41,040 --> 00:30:46,320
some of the minor terpenes may actually be quite important.

559
00:30:46,320 --> 00:30:49,880
But we found these major terpenes

560
00:30:49,880 --> 00:30:52,480
have some interesting relationships.

561
00:30:52,480 --> 00:30:56,760
So basically, just going to get to these are just compounds,

562
00:30:56,760 --> 00:31:03,960
beta-caryophyllene, alpha-humulene, beta-myrcene,

563
00:31:03,960 --> 00:31:08,120
caryophyllene oxide, beta-pinene, d-limonene.

564
00:31:08,120 --> 00:31:10,000
And we won't look at this one today,

565
00:31:10,000 --> 00:31:12,200
but this is a compound that I think

566
00:31:12,200 --> 00:31:14,280
is interesting, terpinolene.

567
00:31:17,440 --> 00:31:24,160
So once again, for any of you who want to do some research,

568
00:31:24,160 --> 00:31:28,840
we found that chemistry can be remarkably explanatory.

569
00:31:28,840 --> 00:31:35,000
So once you actually have this analyte name,

570
00:31:35,000 --> 00:31:38,840
there's a whole other suite of data augmentation

571
00:31:38,840 --> 00:31:40,800
that we haven't done yet.

572
00:31:40,800 --> 00:31:48,040
You could augment the compound boiling point,

573
00:31:48,040 --> 00:31:57,760
and other properties of these chemicals.

574
00:31:57,760 --> 00:32:03,480
So I'm not a chemist, but the boiling point

575
00:32:03,480 --> 00:32:08,000
is a low-hanging fruit data point.

576
00:32:08,000 --> 00:32:11,600
But that one's of critical importance.

577
00:32:11,600 --> 00:32:16,200
So for example, processors, people making cannabis oil

578
00:32:16,200 --> 00:32:20,520
from flower, the boiling point for each compound

579
00:32:20,520 --> 00:32:23,760
is of utmost importance.

580
00:32:23,760 --> 00:32:33,280
So this is an error that I think I can troubleshoot pretty quick.

581
00:32:39,120 --> 00:32:40,560
We may just have to do without.

582
00:32:44,520 --> 00:32:46,040
Hopefully, we can still aggregate these.

583
00:32:46,040 --> 00:32:47,400
OK, wonderful.

584
00:32:47,400 --> 00:32:54,400
So as promised, we've gone through all of this rigmarole.

585
00:32:54,400 --> 00:32:56,840
I'll just show you what a data point looks like.

586
00:32:56,840 --> 00:32:58,880
We'll just take a random sample of five.

587
00:33:01,800 --> 00:33:07,520
And then let's just look at some of these top fields.

588
00:33:07,520 --> 00:33:09,040
So we've got the product name.

589
00:33:09,040 --> 00:33:14,760
And then we've got the total THC and the total CBD.

590
00:33:16,480 --> 00:33:21,360
So we can just start taking some samples of these.

591
00:33:21,360 --> 00:33:25,560
And these are the types of products we're working with.

592
00:33:25,560 --> 00:33:30,480
So here's an infused joint, an indica joint.

593
00:33:34,160 --> 00:33:36,640
Here's ridge-rite.

594
00:33:36,640 --> 00:33:44,480
Here's Ridgeline Runtz, 30% THC, so on and so forth.

595
00:33:44,480 --> 00:33:48,520
And we've done all sorts of fun analysis in the past

596
00:33:48,520 --> 00:33:52,320
because this is natural language.

597
00:33:52,320 --> 00:33:57,040
And so we found the product names are a fun field

598
00:33:57,040 --> 00:34:00,520
to use natural language processing on.

599
00:34:00,520 --> 00:34:05,200
OK, that's all for today.

600
00:34:05,200 --> 00:34:08,320
That's all fine and dandy.

601
00:34:08,320 --> 00:34:13,040
So what's a good research question for today?

602
00:34:13,040 --> 00:34:23,520
Well, last week, we were talking about South Africa.

603
00:34:23,520 --> 00:34:30,200
We were talking about licenses in South Africa.

604
00:34:30,200 --> 00:34:32,800
Well, what I realized is, I was

605
00:34:32,800 --> 00:34:35,200
starting to look at the map to figure out all

606
00:34:35,200 --> 00:34:38,320
the different provinces.

607
00:34:38,320 --> 00:34:40,960
And I may forget the name of it.

608
00:34:40,960 --> 00:34:44,400
But correct me if I'm wrong, but I

609
00:34:44,400 --> 00:34:50,680
want to say there's a city in South Africa, Durban.

610
00:34:50,680 --> 00:34:57,280
And well, that jogged my memory that, oh, yes, there's

611
00:34:57,280 --> 00:35:04,040
a famous cannabis strain, Durban Poison.

612
00:35:09,120 --> 00:35:11,640
Let's just print all these out real quick.

613
00:35:17,120 --> 00:35:20,400
And also, let me actually check the map real quick just

614
00:35:20,400 --> 00:35:21,960
to make sure I'm not butchering this.

615
00:35:21,960 --> 00:35:27,320
Right?

616
00:35:27,320 --> 00:35:28,160
Yeah.

617
00:35:28,160 --> 00:35:38,440
So yeah, so here we were talking about South Africa.

618
00:35:38,440 --> 00:35:44,320
And there you see Durban is in KwaZulu-Natal,

619
00:35:44,320 --> 00:35:47,120
which is one of the provinces.

620
00:35:47,120 --> 00:35:51,840
So OK, so just wanted to do a quick sanity check.

621
00:35:51,840 --> 00:35:55,840
That's what I always recommend in data science.

622
00:35:55,840 --> 00:35:59,240
And that's a philosophy from Python, right,

623
00:35:59,240 --> 00:36:03,760
is don't guess, you know.

624
00:36:03,760 --> 00:36:06,840
So check.

625
00:36:06,840 --> 00:36:11,560
So anywho, so there's a city in South Africa, Durban.

626
00:36:11,560 --> 00:36:14,720
So we found all these Durban Poisons.

627
00:36:14,720 --> 00:36:18,600
And then I realized we were just talking about Thailand.

628
00:36:18,600 --> 00:36:24,320
So we could find Thai strains.

629
00:36:24,320 --> 00:36:30,640
So here are a bunch of Thai strains that we've observed.
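
Finding varieties by product name can be sketched as a case-insensitive keyword match. The `KEYWORDS` mapping and the sample rows are assumptions for illustration:

```python
import pandas as pd

# Illustrative product names, not real lab results.
results = pd.DataFrame({
    "product_name": ["Durban Poison 3.5g", "Lemon Thai Pre-Roll",
                     "Blue Dream", "Detroit Durban"],
    "state": ["ca", "ca", "ma", "mi"],
})

# Hypothetical search pattern -> variety label.
KEYWORDS = {"durban": "Durban", "thai": "Thai", "colombian": "Colombian"}

results["variety"] = None
for pattern, label in KEYWORDS.items():
    # case=False ignores capitalization; na=False treats missing names as no match.
    matches = results["product_name"].str.contains(pattern, case=False, na=False)
    results.loc[matches, "variety"] = label

subset = results[results["variety"].notna()]
```

A plain substring match can pick up unrelated names that happen to contain the keyword, so the matches still deserve a manual look.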

630
00:36:30,640 --> 00:36:38,680
So we've seen a lot in Michigan, some in Massachusetts.

631
00:36:38,680 --> 00:36:47,320
Not certain if this is a Thai, this Thai leva, but perhaps.

632
00:36:47,320 --> 00:36:52,720
And then just to go ahead and add a third,

633
00:36:52,720 --> 00:36:57,440
I was trying to find some Colombian strains.

634
00:36:57,440 --> 00:37:03,520
And so Colombian Gold is the famous one.

635
00:37:03,520 --> 00:37:07,400
But the only one observed was this one in Michigan.

636
00:37:07,400 --> 00:37:10,960
And so I'm not certain if this one is valid or not.

637
00:37:10,960 --> 00:37:15,800
But there's a Colombian profit and a newer strain

638
00:37:15,800 --> 00:37:21,640
that I realized may technically be Colombian is Medellin,

639
00:37:21,640 --> 00:37:27,640
which I think is, once again, let's double check.

640
00:37:27,640 --> 00:37:32,840
I want to say Medellin is a city in Colombia.

641
00:37:36,560 --> 00:37:39,320
Maybe or maybe not.

642
00:37:39,320 --> 00:37:41,000
So Medellin.

643
00:37:41,000 --> 00:37:44,120
So this is a newer strain.

644
00:37:44,120 --> 00:37:47,120
There's also a strain Santa Marta.

645
00:37:47,120 --> 00:37:52,880
But I didn't see any of those in the data set.

646
00:37:52,880 --> 00:37:58,600
And now we can start getting to some of the visualizations.

647
00:37:58,600 --> 00:38:04,720
So this is what's interesting in the importance of getting

648
00:38:04,720 --> 00:38:07,720
our sample size large.

649
00:38:07,720 --> 00:38:20,360
So out of the 125,000 plus lab results,

650
00:38:20,360 --> 00:38:28,480
there are only 77 Durban-type strains or flower-type strains

651
00:38:28,480 --> 00:38:33,360
that were tested, which is actually we'll take it.

652
00:38:33,360 --> 00:38:36,200
So remember, I've said in previous meetups

653
00:38:36,200 --> 00:38:40,440
that I'm a small sample guy.

654
00:38:40,440 --> 00:38:42,560
So I'm OK with that.

655
00:38:42,560 --> 00:38:45,560
And what I like to say is this is

656
00:38:45,560 --> 00:38:51,160
called doing conditional upon conditional means.

657
00:38:51,160 --> 00:38:56,840
So I may or may not be able to do this because we were talking

658
00:38:56,840 --> 00:39:00,880
about standardization here.

659
00:39:00,880 --> 00:39:04,040
Yeah, so this is what I was saying.

660
00:39:04,040 --> 00:39:11,400
It's sometimes non-trivial just to do a mean.

661
00:39:11,400 --> 00:39:13,680
But I'll try here.

662
00:39:13,680 --> 00:39:17,480
So we can just say, oh, let's make this numeric.

663
00:39:17,480 --> 00:39:19,960
And let's coerce our errors.

664
00:39:23,160 --> 00:39:24,480
And let's see.

665
00:39:24,480 --> 00:39:29,640
OK, so that's not even logical, right?

666
00:39:29,640 --> 00:39:32,480
Because it's like, oh, you know, there

667
00:39:32,480 --> 00:39:37,560
are some THC values above 100.

668
00:39:37,560 --> 00:39:42,200
So you could get fancy and say, oh, let's just

669
00:39:42,200 --> 00:39:50,040
get everything where THC is less than 100.
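
A minimal sketch of these steps with pandas: coerce the column to numeric, drop impossible percentages, then take the mean by variety. The column names and values are made up for illustration:

```python
import pandas as pd

# Illustrative values, including a string, a "<0.01", and an impossible 250.
subset = pd.DataFrame({
    "variety": ["Durban", "Durban", "Thai", "Thai", "Thai"],
    "total_thc": ["21.4", "<0.01", "18.2", "250", 19.8],
})

# errors="coerce" turns anything unparseable (such as "<0.01") into NaN.
subset["total_thc"] = pd.to_numeric(subset["total_thc"], errors="coerce")

# A percentage above 100 is not physically possible; NaN rows also drop out here.
valid = subset[subset["total_thc"] < 100]

# Conditional means: the average total THC within each variety.
means = valid.groupby("variety")["total_thc"].mean()
```

Note that coercion silently turns below-limit values into missing values, which is one reason a simple mean can be non-trivial.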

670
00:39:50,040 --> 00:40:07,520
So once again, anywho. Any comments, questions?

671
00:40:07,520 --> 00:40:09,000
Yeah, a quick question.

672
00:40:09,000 --> 00:40:10,720
We don't have to go through this exercise.

673
00:40:10,720 --> 00:40:12,120
But I was just curious.

674
00:40:12,120 --> 00:40:15,600
You had mentioned augmenting the data back there.

675
00:40:15,600 --> 00:40:21,160
So could we, and kind of getting crazy with it,

676
00:40:21,160 --> 00:40:22,720
but if we went down a rabbit hole,

677
00:40:22,720 --> 00:40:24,680
if we were that interested, could we

678
00:40:24,680 --> 00:40:30,400
take the results for all of the raw flower material

679
00:40:30,400 --> 00:40:35,680
with the Durban Poison name and then track,

680
00:40:35,680 --> 00:40:40,520
are there growing conditions of the facility,

681
00:40:40,520 --> 00:40:42,000
how they grew it?

682
00:40:42,000 --> 00:40:44,640
Because I think there's expressions for terpenes,

683
00:40:44,640 --> 00:40:47,720
especially, that come out under certain environments.

684
00:40:47,720 --> 00:40:50,520
So if we were to look at that city in South Africa

685
00:40:50,520 --> 00:40:55,640
or wherever that landrace that still is attached to that strain

686
00:40:55,640 --> 00:41:00,200
name originated, as a grower or breeder,

687
00:41:00,200 --> 00:41:04,240
I would be interested in trying to make that expression stronger

688
00:41:04,240 --> 00:41:09,000
either through environmental variables or some other way.

689
00:41:09,000 --> 00:41:13,720
So can we go that deep down the rabbit hole with this?

690
00:41:13,720 --> 00:41:15,200
You can go pretty deep.

691
00:41:15,200 --> 00:41:24,480
So for example, we have the 77 Thai results, or 13.

692
00:41:24,480 --> 00:41:26,920
We've got 13 Thai results.

693
00:41:26,920 --> 00:41:30,280
So for example, you can look at this one.

694
00:41:30,280 --> 00:41:34,720
This first one was a lemon Thai.

695
00:41:34,720 --> 00:41:37,800
And you can see that, OK, this was actually

696
00:41:37,800 --> 00:41:41,200
grown in Palm Springs, California.

697
00:41:41,200 --> 00:41:46,960
So my guess is it was probably grown inside,

698
00:41:46,960 --> 00:41:49,280
because this is Southern California, I think.

699
00:41:49,280 --> 00:41:51,480
So this was probably grown inside.

700
00:41:55,640 --> 00:42:03,800
And then let's see if we can't find one of the Durbans.

701
00:42:03,800 --> 00:42:07,120
And so where was this Durban grown?

702
00:42:07,120 --> 00:42:10,040
OK, actually, this is interesting here.

703
00:42:10,040 --> 00:42:12,640
And so for example, this strain was

704
00:42:12,640 --> 00:42:16,800
grown in Oakland, California.

705
00:42:16,800 --> 00:42:20,800
Once again, maybe inside.

706
00:42:20,800 --> 00:42:23,880
So this is something that's come up before

707
00:42:23,880 --> 00:42:26,240
and something that's super tricky

708
00:42:26,240 --> 00:42:29,360
is that, like you said, the plants

709
00:42:29,360 --> 00:42:33,720
may have different terpene expressions

710
00:42:33,720 --> 00:42:35,520
based on environmental factors.

711
00:42:35,520 --> 00:42:40,200
And this is something that we've racked our brain around.

712
00:42:40,200 --> 00:42:49,160
But we haven't had any insights to add here.

713
00:42:49,160 --> 00:42:50,320
Then let us know.

714
00:42:50,320 --> 00:42:54,720
But basically, we've run into this conundrum,

715
00:42:54,720 --> 00:43:03,320
because one would expect outdoor strains would be different.

716
00:43:03,320 --> 00:43:06,400
One would expect people in, say, California

717
00:43:06,400 --> 00:43:15,040
would be more likely to grow outdoor than, say, somebody

718
00:43:15,040 --> 00:43:19,680
in Massachusetts or Michigan.

719
00:43:19,680 --> 00:43:22,560
So where is this person located?

720
00:43:22,560 --> 00:43:25,960
And so look, this is just an anonymous producer,

721
00:43:25,960 --> 00:43:28,160
so we don't know where this person is.

722
00:43:28,160 --> 00:43:30,600
This person's in Santa Cruz.

723
00:43:34,280 --> 00:43:38,200
This one's anonymous, but it's a Detroit Durban.

724
00:43:38,200 --> 00:43:41,840
So they must be up in Michigan.

725
00:43:41,840 --> 00:43:45,080
So I don't know.

726
00:43:45,080 --> 00:43:51,640
I think there may be something there, like state effects.

727
00:43:51,640 --> 00:43:58,640
Are people better at growing in Massachusetts or Michigan?

728
00:43:58,640 --> 00:44:05,560
Is there an effect of being in California on your terpenes?

729
00:44:05,560 --> 00:44:10,080
So maybe people are growing in greenhouses in California,

730
00:44:10,080 --> 00:44:12,920
and they're getting that bright California sun,

731
00:44:12,920 --> 00:44:16,320
and it's doing wonders for their plants.

732
00:44:16,320 --> 00:44:18,800
That may be a problem.

733
00:44:18,800 --> 00:44:25,480
But it's going to take an ambitious data scientist

734
00:44:25,480 --> 00:44:32,080
like yourself to figure out a way to have an orchestrator that

735
00:44:32,080 --> 00:44:37,600
has access to all these random different data sets

736
00:44:37,600 --> 00:44:40,960
and tries to intelligently string them together.

737
00:44:40,960 --> 00:44:44,080
So I think that's where AI steps in.

743
00:45:03,120 --> 00:45:07,160
because it gets overwhelming quickly.

744
00:45:07,160 --> 00:45:08,840
You're quite right.

745
00:45:08,840 --> 00:45:13,200
Because as I was saying, in a lot of the meetups,

746
00:45:13,200 --> 00:45:17,880
I'm just using the data point, and then some of the compounds.

747
00:45:17,880 --> 00:45:23,480
So I'm just using product name, and total THC, and total CBD,

748
00:45:23,480 --> 00:45:26,120
and some of the terpenes.

749
00:45:26,120 --> 00:45:31,240
But as you were saying, if you built a more complex statistical

750
00:45:31,240 --> 00:45:36,320
model or somehow used some of the AI models,

751
00:45:36,320 --> 00:45:41,360
you may be able to incorporate every data point.

752
00:45:41,360 --> 00:45:44,800
I'm missing.

753
00:45:44,800 --> 00:45:51,920
But the location and the state may, well, not may.

754
00:45:51,920 --> 00:45:55,400
I think they're probably significant predictors

755
00:45:55,400 --> 00:45:58,080
of the results.

756
00:45:58,080 --> 00:46:01,320
But there may be other interesting things

757
00:46:01,320 --> 00:46:02,440
you could uncover.

758
00:46:02,440 --> 00:46:06,360
So it ended up being controversial.

759
00:46:06,360 --> 00:46:09,360
But a long time ago, I had suggested that, oh, you

760
00:46:09,360 --> 00:46:13,800
could look at pesticide failures by location

761
00:46:13,800 --> 00:46:15,920
and see if, oh, if you're near anything,

762
00:46:15,920 --> 00:46:18,320
if you may be failing.

763
00:46:18,320 --> 00:46:20,800
And then all of a sudden in Washington,

764
00:46:20,800 --> 00:46:25,800
they did this sampling of samples.

765
00:46:25,800 --> 00:46:31,200
And they were finding the derivative of DDT

766
00:46:31,200 --> 00:46:33,560
from old apple orchards.

767
00:46:33,560 --> 00:46:38,360
But then I was actually hearing a scientist

768
00:46:38,360 --> 00:46:40,520
from Health Canada speak.

769
00:46:40,520 --> 00:46:43,400
And what he was saying was, yes, they

770
00:46:43,400 --> 00:46:48,760
used to test for these DDT compounds in Canada.

771
00:46:48,760 --> 00:46:53,400
But unfortunately, they're so ubiquitous.

772
00:46:53,400 --> 00:46:56,760
Because I think they were used maybe heavily, maybe

773
00:46:56,760 --> 00:46:59,520
in the 70s or maybe earlier.

774
00:46:59,520 --> 00:47:02,040
They were just spraying them all over the place on orchards

775
00:47:02,040 --> 00:47:02,560
and stuff.

776
00:47:02,560 --> 00:47:05,320
And I don't think they degrade.

777
00:47:05,320 --> 00:47:06,980
So they were just spraying them all over.

778
00:47:06,980 --> 00:47:08,760
And they're ending up in the soil.

779
00:47:08,760 --> 00:47:12,840
So unfortunately, a lot of agricultural products

780
00:47:12,840 --> 00:47:14,920
just test high for DDT.

781
00:47:14,920 --> 00:47:18,920
And so they were saying they just stopped testing for it

782
00:47:18,920 --> 00:47:23,440
in Canada because people were getting detections.

783
00:47:23,440 --> 00:47:27,080
But it's not really their fault. I

784
00:47:27,080 --> 00:47:30,320
don't know if they thought they were high concentrations or not.

785
00:47:30,320 --> 00:47:33,520
So it's just sort of an unfortunate thing

786
00:47:33,520 --> 00:47:36,280
that we live in this dirty world.

787
00:47:36,280 --> 00:47:38,680
But anywho, that was something they

788
00:47:38,680 --> 00:47:43,720
were looking at in Washington was proximity to orchards

789
00:47:43,720 --> 00:47:46,240
and failure rates for pesticides.

790
00:47:49,160 --> 00:47:53,960
But actually, last week we talked about terroir.

791
00:47:53,960 --> 00:47:56,680
And I think that would actually be a more interesting, that's

792
00:47:56,680 --> 00:47:59,440
more fun, uplifting thing to look at is,

793
00:47:59,440 --> 00:48:01,240
instead of looking at pesticides,

794
00:48:01,240 --> 00:48:05,720
maybe look at geographic density of terpenes.

795
00:48:05,720 --> 00:48:10,720
And my hypothesis would be, I'd bet you anything,

796
00:48:10,720 --> 00:48:15,080
the Emerald Triangle and just in general, sunny California,

797
00:48:15,080 --> 00:48:21,040
I bet they have pretty high expressions of terpenes,

798
00:48:21,040 --> 00:48:29,840
maybe versus some of the cold indoor warehouse-grown cannabis

799
00:48:29,840 --> 00:48:33,920
maybe in Massachusetts or Michigan or even

800
00:48:33,920 --> 00:48:35,320
Washington state.

801
00:48:35,320 --> 00:48:40,720
But indoor growers can sometimes impress you, so who knows?

802
00:48:43,720 --> 00:48:47,560
But anywho, let me go ahead and run through some of this.

803
00:48:47,560 --> 00:48:50,440
That was an awesome comment there, Rick.

804
00:48:50,440 --> 00:48:53,360
So feel free to chime in any time.

805
00:48:53,360 --> 00:48:59,680
But basically, the question was posed in the prior weeks,

806
00:48:59,680 --> 00:49:03,080
like how could we even go about distinguishing

807
00:49:03,080 --> 00:49:07,760
between these various varieties?

808
00:49:07,760 --> 00:49:11,520
And so let's see if we can kind of do that.

809
00:49:11,520 --> 00:49:13,880
So we've got three varieties here,

810
00:49:13,880 --> 00:49:17,880
like Thai, Colombian, and Durban.

811
00:49:17,880 --> 00:49:24,640
And if we just look at their THC and CBD,

812
00:49:24,640 --> 00:49:28,840
they're kind of all clustered together.

813
00:49:28,840 --> 00:49:31,640
You may look at the average.

814
00:49:31,640 --> 00:49:38,320
So you may just say, oh, just looking at the average THC,

815
00:49:38,320 --> 00:49:42,640
maybe Durban Poison just has the highest on average.

816
00:49:42,640 --> 00:49:48,000
But once again, how much can you really read into this?

817
00:49:48,000 --> 00:49:52,960
Are these, first off, are they statistically different?

818
00:49:52,960 --> 00:49:58,560
So you can use, first off, you can use an ANOVA.

819
00:49:58,560 --> 00:50:01,160
So this is an analysis of variance.

820
00:50:01,160 --> 00:50:03,200
So that's the statistical test you

821
00:50:03,200 --> 00:50:06,720
would do to actually test if these are statistically

822
00:50:06,720 --> 00:50:07,720
different.

823
00:50:07,720 --> 00:50:14,920
And that basically depends on mostly the sample size

824
00:50:14,920 --> 00:50:19,320
you have and then the standard deviation.

825
00:50:19,320 --> 00:50:26,120
So if these have large standard deviations, which they do,

826
00:50:26,120 --> 00:50:32,680
and you have a small sample size, which we do,

827
00:50:32,680 --> 00:50:36,680
then you probably can't conclude these

828
00:50:36,680 --> 00:50:38,680
are statistically different.
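
To make the dependence on sample size and spread concrete, here is a minimal sketch of the one-way ANOVA F statistic in plain Python. The THC values are made up for illustration and the helper name `one_way_anova_f` is an assumption:

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of each observation around its own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up total THC percentages, not values from the data set.
durban = [22.0, 18.5, 25.1, 20.3]
thai = [19.8, 23.4, 17.9]
colombian = [21.0, 16.5]
f_stat, df1, df2 = one_way_anova_f([durban, thai, colombian])
```

Large within-group spread inflates the denominator and small samples shrink `df_within`, and both push F toward values that are not significant. For a p-value, `scipy.stats.f_oneway` performs the same test.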

829
00:50:38,680 --> 00:50:42,400
So the smallest standard deviation we have

830
00:50:42,400 --> 00:50:45,400
is around 4.5%.

831
00:50:45,400 --> 00:50:51,360
So all of these are well within two standard deviations

832
00:50:51,360 --> 00:50:52,560
of each other.

833
00:50:52,560 --> 00:50:55,520
So I almost don't even have to do an ANOVA test

834
00:50:55,520 --> 00:51:05,080
to just know with pretty high confidence

835
00:51:05,080 --> 00:51:12,080
that I don't think these are statistically different.

836
00:51:12,080 --> 00:51:17,080
OK, but once again, so this is left to do.

837
00:51:17,080 --> 00:51:19,680
So if you're interested, this is something

838
00:51:19,680 --> 00:51:20,920
we've done in prior weeks.

839
00:51:20,920 --> 00:51:25,000
You can probably just grab the code and reuse it.

840
00:51:25,000 --> 00:51:29,760
Also, this is code we've done in prior weeks, probit models.

841
00:51:29,760 --> 00:51:33,520
So you could actually see if you could predict

842
00:51:33,520 --> 00:51:42,520
Thai, Colombian, Durban with CBD using a probit model.

843
00:51:42,520 --> 00:51:46,360
Once again, you can do this.

844
00:51:46,360 --> 00:51:50,360
But how accurate is that model going to be?

845
00:51:50,360 --> 00:51:52,200
Probably not super accurate.

846
00:51:52,200 --> 00:51:55,200
But have no fear.

847
00:51:55,200 --> 00:51:57,000
So you may be getting a little worried.

848
00:51:57,000 --> 00:51:59,400
You're like, OK, we've got almost no data.

849
00:51:59,400 --> 00:52:01,480
And things aren't looking very good.

850
00:52:01,480 --> 00:52:03,720
But have no fear.

851
00:52:03,720 --> 00:52:08,520
We can still do some analysis here.

852
00:52:08,520 --> 00:52:13,280
So we can start, say, looking at the ratios.

853
00:52:13,280 --> 00:52:17,520
So this is the ratio of THC to CBD,

854
00:52:17,520 --> 00:52:21,240
see if anything jumps out at us.

855
00:52:21,240 --> 00:52:27,240
So basically, it looks like these are high THC strains.

856
00:52:27,240 --> 00:52:31,520
That's not super surprising.

857
00:52:31,520 --> 00:52:38,440
Once again, OK, so for me, nothing really jumped out

858
00:52:38,440 --> 00:52:44,440
at me here as just saying, OK, these three strains,

859
00:52:44,440 --> 00:52:46,200
there's something different about them.

860
00:52:46,200 --> 00:52:50,680
So right now, just from looking at THC and CBD,

861
00:52:50,680 --> 00:52:54,160
they all look relatively similar.

862
00:52:54,160 --> 00:52:57,920
But now we can augment, or not augment.

863
00:52:57,920 --> 00:52:59,360
We already have these points.

864
00:52:59,360 --> 00:53:03,720
But we can start looking at some of the terpenes.

865
00:53:03,720 --> 00:53:07,720
And so this is where it's going to get a little interesting.

866
00:53:07,720 --> 00:53:12,120
And these relationships, I want to go ahead and thank

867
00:53:12,120 --> 00:53:16,320
old John Abrams at the CESC.

868
00:53:16,320 --> 00:53:20,600
He was the one who first told me about the relationships

869
00:53:20,600 --> 00:53:22,840
between some of these terpenes.

870
00:53:22,840 --> 00:53:27,320
And for you ambitious chemists, this

871
00:53:27,320 --> 00:53:31,440
is where I think some of the frontier lies.

872
00:53:31,440 --> 00:53:36,280
So these chemicals are produced in the plant

873
00:53:36,280 --> 00:53:42,880
by various synthases, from my understanding.

874
00:53:42,880 --> 00:53:45,920
And so I think maybe understanding

875
00:53:45,920 --> 00:53:49,600
of the relationship between the synthases.

876
00:53:49,600 --> 00:53:50,920
And once again, I'm not a chemist,

877
00:53:50,920 --> 00:53:52,280
so I may be butchering this.

878
00:53:52,280 --> 00:53:54,760
But maybe there's a THC synthase.

879
00:53:54,760 --> 00:53:56,520
And maybe there's another synthase

880
00:53:56,520 --> 00:54:01,640
that produces some of the terpenes.

881
00:54:01,640 --> 00:54:07,680
Maybe there's some correlation between these.

882
00:54:07,680 --> 00:54:12,040
Maybe certain genetics express some synthases more

883
00:54:12,040 --> 00:54:12,760
and some less.

884
00:54:12,760 --> 00:54:17,520
So that's sort of, like I said, I barely know anything there.

885
00:54:17,520 --> 00:54:20,000
If you're particularly interested,

886
00:54:20,000 --> 00:54:22,840
that's what I think an avenue for new research

887
00:54:22,840 --> 00:54:28,160
is, actually studying how the plant produces

888
00:54:28,160 --> 00:54:36,120
these chemicals and the properties of these chemicals.

889
00:54:36,120 --> 00:54:40,040
I think that's going to be important.

890
00:54:40,040 --> 00:54:43,320
But for now, we'll just see if we can't start finding

891
00:54:43,320 --> 00:54:45,400
differences between these.

892
00:54:45,400 --> 00:54:49,480
So here, here's all the Durbans.

893
00:54:49,480 --> 00:54:56,440
You see beta-caryophyllene is generally less than 0.15.

894
00:54:56,440 --> 00:55:01,640
And alpha-humulene is generally less than 0.06.

895
00:55:01,640 --> 00:55:04,360
You see the Colombian profit.

896
00:55:04,360 --> 00:55:08,600
It's kind of starting to drag away a little bit.

897
00:55:08,600 --> 00:55:13,160
So this may just be maybe it's just a more potent strain.

898
00:55:13,160 --> 00:55:19,760
But you see it's got a bit higher, 0.25.

899
00:55:19,760 --> 00:55:22,680
And then I don't think I've done the Thai.

900
00:55:22,680 --> 00:55:27,840
But we can just try the Thai results real quick.

901
00:55:27,840 --> 00:55:29,320
Yeah, so for some reason, I don't think

902
00:55:29,320 --> 00:55:33,640
I have terpenes for the Thais.

903
00:55:33,640 --> 00:55:40,240
So unfortunately, the Thai has dropped out of the race.

904
00:55:40,240 --> 00:55:46,360
And so now it's just dropped out of the landrace.

905
00:55:46,360 --> 00:55:48,880
So right now, we've just got Durban and Colombian

906
00:55:48,880 --> 00:55:50,840
that we're looking at.

907
00:55:50,840 --> 00:55:53,440
And so this is the next relationship,

908
00:55:53,440 --> 00:55:57,960
is looking at beta-myrcene to d-limonene

909
00:55:57,960 --> 00:56:02,720
and seeing if we can't parse out anything interesting.

910
00:56:02,720 --> 00:56:07,960
And as I was saying, Thai fell out of the landrace.

911
00:56:07,960 --> 00:56:09,640
But check this out.

912
00:56:09,640 --> 00:56:14,600
You've got landrace Durban.

913
00:56:14,600 --> 00:56:18,440
That's kind of an outlier here.

914
00:56:18,440 --> 00:56:22,560
And so you see almost all the other Durban Poisons

915
00:56:22,560 --> 00:56:25,480
have less than 1% beta-myrcene.

916
00:56:25,480 --> 00:56:31,720
And for some odd reason, this landrace has 2%.

917
00:56:31,720 --> 00:56:39,480
So that's double the normal amount of beta-myrcene

918
00:56:39,480 --> 00:56:44,560
that you would see in a Durban.

919
00:56:44,560 --> 00:56:48,360
And it's labeled a landrace.

920
00:56:48,360 --> 00:56:50,840
And if you do a little research, you'll

921
00:56:50,840 --> 00:56:54,960
learn that landrace strains are basically,

922
00:56:54,960 --> 00:56:57,040
from my understanding, where people

923
00:56:57,040 --> 00:57:00,160
go to remote regions of the world.

924
00:57:00,160 --> 00:57:02,680
Maybe they go to South Africa.

925
00:57:02,680 --> 00:57:07,280
Maybe they go to Colombia or India or Thailand

926
00:57:07,280 --> 00:57:08,880
or where have you.

927
00:57:08,880 --> 00:57:15,840
And they try to go to regions where the cannabis plants

928
00:57:15,840 --> 00:57:21,120
haven't been interbred with a lot of the general plants

929
00:57:21,120 --> 00:57:22,080
that we see today.

930
00:57:22,080 --> 00:57:29,280
So as I was saying, a lot of the California OG and Kush

931
00:57:29,280 --> 00:57:32,760
has been interbred with so many strains

932
00:57:32,760 --> 00:57:36,720
because people are just trying to get high THC.

933
00:57:36,720 --> 00:57:43,320
And so I think people go and try to find these untapped genetics

934
00:57:43,320 --> 00:57:50,000
and bring those back to the US market

935
00:57:50,000 --> 00:57:53,560
to breed in with the Kushes and the OGs

936
00:57:53,560 --> 00:58:01,920
to get some new flavor profiles with that super high THC

937
00:58:01,920 --> 00:58:04,840
stuff that the market demands.

938
00:58:04,840 --> 00:58:09,960
So I'll zoom in in just one second on that landrace.

939
00:58:09,960 --> 00:58:14,400
Just to finish these off, check this out.

940
00:58:14,400 --> 00:58:20,400
Colombian profit has a significant amount

941
00:58:20,400 --> 00:58:23,080
of d-limonene, which you just don't

942
00:58:23,080 --> 00:58:26,640
observe in the Durban Poisons.

943
00:58:26,640 --> 00:58:34,960
And beta-myrcene is maybe less.

944
00:58:34,960 --> 00:58:38,880
See, it has only got 0.1% beta-myrcene, whereas over here

945
00:58:38,880 --> 00:58:42,840
you're seeing anywhere from 0.5% to 1%.

946
00:58:42,840 --> 00:58:45,320
So that's notable.

947
00:58:45,320 --> 00:58:52,800
But here's the goal.

948
00:58:52,800 --> 00:58:58,640
So this is my favorite relationship

949
00:58:58,640 --> 00:59:07,000
that I've been introduced to, in that it's not perfect,

950
00:59:07,000 --> 00:59:14,320
but a good rule of thumb to me is using the beta-pinene

951
00:59:14,320 --> 00:59:20,960
to d-limonene ratio as a proxy for indica versus sativa.

952
00:59:20,960 --> 00:59:26,760
So what I would find is anything with a greater than around 0.2,

953
00:59:26,760 --> 00:59:32,400
0.25 ratio of beta-pinene to d-limonene

954
00:59:32,400 --> 00:59:34,880
is what I'd call a sativa.

955
00:59:34,880 --> 00:59:37,680
And anything lower would be kind of what I would

956
00:59:37,680 --> 00:59:40,400
call more on the indica side.
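The rule of thumb above can be sketched as a small helper. This is an illustration only: the function names are hypothetical, the cutoff is the rough 0.2 to 0.25 figure mentioned in the talk rather than an established standard, and the example concentrations are invented.

```python
def pinene_limonene_ratio(beta_pinene, d_limonene):
    """Ratio of beta-pinene to d-limonene, or None if d-limonene is absent."""
    if d_limonene <= 0:
        return None
    return beta_pinene / d_limonene


def label_by_ratio(ratio, cutoff=0.25):
    """Apply the speaker's rule of thumb; the cutoff is approximate."""
    if ratio is None:
        return "unknown"
    return "sativa-leaning" if ratio > cutoff else "indica-leaning"


# Hypothetical concentrations (percent by weight), not measured values.
high_ratio = pinene_limonene_ratio(0.10, 0.10)  # roughly 1 to 1
low_ratio = pinene_limonene_ratio(0.12, 0.60)   # roughly 1 to 5

print(label_by_ratio(high_ratio))  # sativa-leaning
print(label_by_ratio(low_ratio))   # indica-leaning
```

A sample sitting right at the cutoff is better read as a hybrid than forced into either label.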

957
00:59:40,400 --> 00:59:45,280
And so if you look at this, see that most things here,

958
00:59:45,280 --> 00:59:50,840
they're less than 0.1 to 0.1.

959
00:59:50,840 --> 00:59:55,560
So this is 0.1 to 0.1, right?

960
00:59:55,560 --> 00:59:58,480
That's like a 1 to 1 ratio.

961
00:59:58,480 --> 01:00:02,320
So it looks at first glance that this

962
01:00:02,320 --> 01:00:07,480
has around a 1 to 1 ratio between beta-pinene

963
01:00:07,480 --> 01:00:11,960
and d-limonene, whereas see the Colombian profit,

964
01:00:11,960 --> 01:00:15,480
you've got 1 to 5 to 0.6.

965
01:00:15,480 --> 01:00:22,080
So the Colombian profit has around a 1 to 5 ratio

966
01:00:22,080 --> 01:00:24,800
between beta-pinene and d-limonene.

967
01:00:24,800 --> 01:00:28,000
And so they get lower than this.

968
01:00:28,000 --> 01:00:33,320
So I wouldn't call this a strong indica.

969
01:00:33,320 --> 01:00:37,320
What this is looking to me like is maybe the Colombian profit

970
01:00:37,320 --> 01:00:43,160
is kind of falling right at sort of the hybrid point,

971
01:00:43,160 --> 01:00:47,600
in my opinion, because I think these get really low.

972
01:00:47,600 --> 01:00:50,440
You can get some really strong indicas.

973
01:00:50,440 --> 01:00:54,040
So this is looking like right at sort of like the indica cut

974
01:00:54,040 --> 01:00:55,400
off line to me.

975
01:00:55,400 --> 01:00:59,760
But once again, we only have a sample size of 1.

976
01:00:59,760 --> 01:01:03,840
And you can't really do that much with a sample size of 1

977
01:01:03,840 --> 01:01:08,520
because it's like you may have just grabbed the outlier.

978
01:01:08,520 --> 01:01:16,840
So we'd really like to see more Colombian terpene profiles.

979
01:01:16,840 --> 01:01:21,520
So if any of you are growing Colombian Gold, Santa Marta,

980
01:01:21,520 --> 01:01:25,040
Medellin, get your terpenes tested

981
01:01:25,040 --> 01:01:28,320
because these are critical data points.

982
01:01:28,320 --> 01:01:36,400
And from the 125,000 lab results we have,

983
01:01:36,400 --> 01:01:40,960
from my understanding, we only have one Colombian

984
01:01:40,960 --> 01:01:44,400
that has lab results.

985
01:01:44,400 --> 01:01:45,280
We'll take it.

986
01:01:45,280 --> 01:01:46,800
It's better than nothing.

987
01:01:46,800 --> 01:01:53,320
So just given this one data point,

988
01:01:53,320 --> 01:01:58,680
before this, my prior was a null hypothesis.

989
01:01:58,680 --> 01:02:01,440
My prior was there's no difference

990
01:02:01,440 --> 01:02:05,000
between Colombian and Durban.

991
01:02:05,000 --> 01:02:09,400
However, after seeing just one observation,

992
01:02:09,400 --> 01:02:13,200
my prior is now updated.

993
01:02:13,200 --> 01:02:18,600
And I now think that Colombian is closer to a hybrid.

994
01:02:18,600 --> 01:02:27,440
And Durban, Durban poison, looks closer to a sativa to me.

995
01:02:27,440 --> 01:02:32,480
And once again, you can actually calculate the average Durban

996
01:02:32,480 --> 01:02:34,840
beta-pinene to d-limonene.

997
01:02:34,840 --> 01:02:37,280
That one is 0.75.

998
01:02:37,280 --> 01:02:39,720
And then this one is 0.23.

999
01:02:39,720 --> 01:02:43,200
And once again, let's look at just the min here.

1000
01:02:52,480 --> 01:02:55,400
Because there may be some, see look,

1001
01:02:55,400 --> 01:03:01,920
there are some Durbans that have low ratios.

1002
01:03:01,920 --> 01:03:06,440
So the question is, are these just not traditional Durbans?

1003
01:03:06,440 --> 01:03:10,440
Or is it possible that the distribution goes all the way

1004
01:03:10,440 --> 01:03:12,640
down to 0.2?

1005
01:03:12,640 --> 01:03:18,600
And then it's not even abnormal to have, say, a strain down

1006
01:03:18,600 --> 01:03:19,360
here.

1007
01:03:19,360 --> 01:03:23,560
So this is the problem about having these tiny sample sizes

1008
01:03:23,560 --> 01:03:28,360
is we don't know if this Colombian profit is kind

1009
01:03:28,360 --> 01:03:34,760
of an outlier or if this is around average.
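One way to frame the question above is to ask how far the single Colombian ratio sits from the Durban distribution. In this minimal sketch, the Durban mean of 0.75 and the Colombian ratio of 0.23 come from the talk, but the individual Durban values are invented so that they average 0.75 and include one low reading.

```python
from statistics import mean, stdev

# Hypothetical beta-pinene to d-limonene ratios for Durban samples,
# chosen to average 0.75 as quoted in the talk.
durban_ratios = [0.95, 0.80, 0.75, 0.90, 0.20, 0.90]
colombian_ratio = 0.23  # the single Colombian observation from the talk

durban_mean = mean(durban_ratios)
durban_sd = stdev(durban_ratios)  # sample standard deviation
z_score = (colombian_ratio - durban_mean) / durban_sd

print(f"Durban mean {durban_mean:.2f}, sd {durban_sd:.2f}, min {min(durban_ratios):.2f}")
print(f"Colombian z-score relative to Durban: {z_score:.2f}")
```

With these made-up numbers the Colombian value falls less than two standard deviations below the Durban mean, and one Durban sample is even lower, so a single observation cannot settle whether it is typical or an outlier.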

1010
01:03:34,760 --> 01:03:37,240
So don't know.

1011
01:03:37,240 --> 01:03:40,920
And then we've gone way, way over time.

1012
01:03:40,920 --> 01:03:45,440
But I just wanted to show you one last set of charts

1013
01:03:45,440 --> 01:03:48,440
just to really drive this point home.

1014
01:03:48,440 --> 01:03:51,880
I mean, this one's kind of fun.

1015
01:03:51,880 --> 01:03:57,120
Remember, we talked about the landrace Durban.

1016
01:03:57,120 --> 01:03:59,840
So I thought, well, why not actually just look

1017
01:03:59,840 --> 01:04:02,440
at its terpene profile?

1018
01:04:02,440 --> 01:04:06,000
So this may just be improperly named.

1019
01:04:06,000 --> 01:04:09,720
But if this is, in fact, the true landrace,

1020
01:04:09,720 --> 01:04:13,920
then this may be characteristic of some of the genetics

1021
01:04:13,920 --> 01:04:17,640
that you would see in South African cannabis.

1022
01:04:17,640 --> 01:04:20,560
So I kind of was super curious about what

1023
01:04:20,560 --> 01:04:25,560
the terpene profile of this particular strain would be.

1024
01:04:25,560 --> 01:04:29,680
And if you look at it, so these are the absolute concentrations

1025
01:04:29,680 --> 01:04:33,280
with over 2% beta-myrcene.

1026
01:04:33,280 --> 01:04:37,720
If you look at just the relative concentration,

1027
01:04:37,720 --> 01:04:45,240
you'll see almost 60% of the terpenes in this plant

1028
01:04:45,240 --> 01:04:50,520
are beta-myrcene, which is interesting.
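The step from absolute to relative concentration is just each terpene divided by the total. A minimal sketch follows. The terpene values are hypothetical and chosen only so that beta-myrcene lands near the "almost 60%" share mentioned here, and the function name is illustrative.

```python
def relative_concentrations(absolute):
    """Convert absolute terpene percentages into shares of total terpenes."""
    total = sum(absolute.values())
    if total == 0:
        return {name: 0.0 for name in absolute}
    return {name: value / total for name, value in absolute.items()}


# Hypothetical absolute concentrations (percent by weight), not lab data.
landrace_durban = {
    "beta_myrcene": 2.0,
    "beta_caryophyllene": 0.5,
    "alpha_pinene": 0.4,
    "alpha_humulene": 0.3,
    "d_limonene": 0.2,
}

shares = relative_concentrations(landrace_durban)
for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name}: {share:.1%}")
```

Relative shares only compare cleanly across samples tested for the same panel of terpenes, since a larger panel spreads the total over more compounds.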

1029
01:04:50,520 --> 01:04:53,520
But we need a comparison.

1030
01:04:53,520 --> 01:04:56,960
So let's look at the Colombian profit.

1031
01:04:56,960 --> 01:05:04,080
So here are the terpenes of the Colombian profit.

1032
01:05:04,080 --> 01:05:06,960
One thing we notice is this sample

1033
01:05:06,960 --> 01:05:09,320
was tested for more terpenes.

1034
01:05:09,320 --> 01:05:17,940
So you see here, we're testing for about a dozen terpenes.

1035
01:05:17,940 --> 01:05:22,480
And then the Colombian, they tested for about two dozen

1036
01:05:22,480 --> 01:05:23,560
terpenes.

1037
01:05:23,560 --> 01:05:28,640
So you have a more granular strain footprint here.

1038
01:05:28,640 --> 01:05:30,040
But check this out.

1039
01:05:30,040 --> 01:05:34,480
Right off the bat, we can start doing some comparison.

1040
01:05:34,480 --> 01:05:37,480
So I'm going to put this one up side by side.

1041
01:05:37,480 --> 01:05:38,360
So check this out.

1042
01:05:41,240 --> 01:05:42,880
Someone have a thought, comment, question?

1043
01:05:48,280 --> 01:05:49,760
Feel free to chime in.

1044
01:05:49,760 --> 01:05:53,640
Or someone had to go.

1045
01:05:53,640 --> 01:05:54,240
That's OK.

1046
01:05:54,240 --> 01:05:58,080
I'll wrap this up since we're well over time here.

1047
01:05:58,080 --> 01:06:01,760
But long story short is, and in fact, let

1048
01:06:01,760 --> 01:06:07,960
me pull back up the absolute concentration here.

1049
01:06:07,960 --> 01:06:09,640
I thought the relative concentration

1050
01:06:09,640 --> 01:06:10,680
would be interesting.

1051
01:06:10,680 --> 01:06:14,840
But let's just stick with the absolute concentration

1052
01:06:14,840 --> 01:06:18,360
just to not get confused.

1053
01:06:18,360 --> 01:06:22,560
So to this one, and remember, if we actually

1054
01:06:22,560 --> 01:06:33,480
look at this landrace Durban, so if we look at this one,

1055
01:06:33,480 --> 01:06:35,440
do we still have total THC for this?

1056
01:06:41,640 --> 01:06:48,160
So this landrace Durban has kind of a low THC percentage.

1057
01:06:48,160 --> 01:06:49,600
In my opinion.

1058
01:06:49,600 --> 01:06:56,720
So its THC percentage is only 14%.

1059
01:06:56,720 --> 01:07:01,280
But it's got almost 2% beta-myrcene.

1060
01:07:01,280 --> 01:07:05,760
So that just seems like a staggering amount

1061
01:07:05,760 --> 01:07:11,360
of beta-myrcene relative to its THC levels.

1062
01:07:11,360 --> 01:07:19,920
And so here you see the Colombian has less than 0.1%

1063
01:07:19,920 --> 01:07:21,560
beta-myrcene.

1064
01:07:21,560 --> 01:07:26,080
So is that meaningful?

1065
01:07:26,080 --> 01:07:28,520
Not super certain.

1066
01:07:28,520 --> 01:07:31,120
But the other thing that was really kind of jumping out

1067
01:07:31,120 --> 01:07:34,720
at these, and that's what I was saying about statistics.

1068
01:07:34,720 --> 01:07:37,320
We were looking at the THC and CBD,

1069
01:07:37,320 --> 01:07:40,800
and nothing was jumping out at us.

1070
01:07:40,800 --> 01:07:44,200
When you look at statistics, you want something

1071
01:07:44,200 --> 01:07:48,720
to jump out at you almost immediately,

1072
01:07:48,720 --> 01:07:52,680
and then you can pursue it further.

1073
01:07:52,680 --> 01:07:59,440
Otherwise, you have to be super cautious about spurious

1074
01:07:59,440 --> 01:08:04,120
causation and bias.

1075
01:08:04,120 --> 01:08:09,240
But here, once again, this is just two observations.

1076
01:08:09,240 --> 01:08:12,080
One observation of landrace Durban,

1077
01:08:12,080 --> 01:08:15,200
and one observation of Colombian profit.

1078
01:08:15,200 --> 01:08:20,080
So we've got a sample size of one for each.

1079
01:08:20,080 --> 01:08:24,280
So we'd really like to get our sample sizes a lot higher.

1080
01:08:28,080 --> 01:08:30,560
They look like different strains to me.

1081
01:08:30,560 --> 01:08:37,040
So it looks like this one maybe just has small or fewer

1082
01:08:37,040 --> 01:08:40,440
amounts of terpenes except for beta-myrcene.

1083
01:08:40,440 --> 01:08:43,920
And then, as I said, what's jumping out at the Colombian,

1084
01:08:43,920 --> 01:08:48,680
we kind of noted earlier, is it has a high concentration

1085
01:08:48,680 --> 01:08:56,480
of d-limonene, whereas the landrace Durban does not.

1086
01:08:56,480 --> 01:09:02,560
So I don't know if that's meaningful or not to you,

1087
01:09:02,560 --> 01:09:09,440
but if you were, say, interested in what makes a Durban

1088
01:09:09,440 --> 01:09:12,240
a Durban, what makes a Colombian a Colombian,

1089
01:09:12,240 --> 01:09:19,680
well, we can start to think about the chemical profiles.

1090
01:09:19,680 --> 01:09:24,240
Maybe if you wanted a strain that was a sativa that also

1091
01:09:24,240 --> 01:09:29,480
had beta-myrcene, you could grow some Durban.

1092
01:09:29,480 --> 01:09:34,280
Or if you want a nice, maybe this is a nice hybrid,

1093
01:09:34,280 --> 01:09:36,880
you can try some Colombian out.

1094
01:09:36,880 --> 01:09:40,400
Or maybe you could get creative.

1095
01:09:40,400 --> 01:09:48,000
And what would happen if you mix a Colombian and a Durban

1096
01:09:48,000 --> 01:09:49,240
together?

1097
01:09:49,240 --> 01:09:54,680
And so I think that would be wildly interesting

1098
01:09:54,680 --> 01:09:56,480
predictive analytics.

1099
01:09:56,480 --> 01:10:00,400
Rick, you talked about what could AI models

1100
01:10:00,400 --> 01:10:03,240
do that we couldn't do.

1101
01:10:03,240 --> 01:10:06,320
This is about as far as I can go.

1102
01:10:06,320 --> 01:10:08,920
All I can really do is just calculate

1103
01:10:08,920 --> 01:10:14,680
a bunch of conditional statistics

1104
01:10:14,680 --> 01:10:19,880
and to a very limited extent.

1105
01:10:19,880 --> 01:10:24,080
I can just calculate the concentrate.

1106
01:10:24,080 --> 01:10:26,840
This is just the raw data.

1107
01:10:26,840 --> 01:10:29,600
So this is just the raw concentration.

1108
01:10:29,600 --> 01:10:31,600
Just the raw concentration.

1109
01:10:31,600 --> 01:10:35,120
But what if some AI model could start

1110
01:10:35,120 --> 01:10:46,000
thinking about what would a terpene profile of the offspring

1111
01:10:46,000 --> 01:10:48,600
of these two plants look like?

1112
01:10:48,600 --> 01:10:53,960
And so then you could almost do AI-powered breeding.

1113
01:10:53,960 --> 01:10:59,600
And try to figure out for statistical or statistics

1114
01:10:59,600 --> 01:11:01,280
powered breeding.

1115
01:11:01,280 --> 01:11:05,600
Where instead of just thinking, oh, maybe today

1116
01:11:05,600 --> 01:11:09,640
I'll breed the Colombian with the Thai.

1117
01:11:09,640 --> 01:11:12,640
You could actually do predictive analytics

1118
01:11:12,640 --> 01:11:15,360
and try to figure out what would be the terpene

1119
01:11:15,360 --> 01:11:18,760
profile of a Colombian Thai.

1120
01:11:18,760 --> 01:11:24,600
And what would be the terpene profile of a Colombian Durban

1121
01:11:24,600 --> 01:11:27,720
or a profile of a Thai Durban.

1122
01:11:27,720 --> 01:11:34,840
And you can mix and match to get those ideal terpene profiles

1123
01:11:34,840 --> 01:11:38,760
that maybe you've done other predictive analytics.

1124
01:11:38,760 --> 01:11:41,520
And we were talking about matching terpene profiles

1125
01:11:41,520 --> 01:11:42,920
to consumers.

1126
01:11:42,920 --> 01:11:49,040
So maybe there's a consumer or a big consumer segment that

1127
01:11:49,040 --> 01:11:51,360
likes a certain profile.

1128
01:11:51,360 --> 01:11:54,720
And you can laser focus in your breeding

1129
01:11:54,720 --> 01:11:57,440
and give the consumer just what they want.

1130
01:12:00,640 --> 01:12:05,920
That's sort of what I see sort of the future of all of this.

1131
01:12:05,920 --> 01:12:07,320
What am I doing?

1132
01:12:07,320 --> 01:12:10,120
I'm just doing basic statistics.

1133
01:12:10,120 --> 01:12:13,840
But where can you take this?

1134
01:12:13,840 --> 01:12:16,440
The sky is the limit.

1135
01:12:16,440 --> 01:12:20,240
So any thoughts, comments, questions

1136
01:12:20,240 --> 01:12:23,960
before I let you all out of here to go seize the day?

1137
01:12:28,240 --> 01:12:30,760
Since this is my second meeting, I just

1138
01:12:30,760 --> 01:12:35,040
wanted to briefly introduce myself to the group.

1139
01:12:35,040 --> 01:12:40,040
Similar to Edwin, I'm also in a data science

1140
01:12:40,040 --> 01:12:45,560
program and interested in natural language processing

1141
01:12:45,560 --> 01:12:48,960
and just finished up a classification project.

1142
01:12:48,960 --> 01:12:53,440
And this, to me, is very fascinating data.

1143
01:12:53,440 --> 01:12:58,400
Like you said, looking at terpene profiles

1144
01:12:58,400 --> 01:13:02,080
and just some of the raw data that's out there.

1145
01:13:02,080 --> 01:13:03,480
So I just wanted to say hi.

1146
01:13:03,480 --> 01:13:07,040
I don't want to keep the group too much longer.

1147
01:13:07,040 --> 01:13:13,840
I guess the user interface had the speaker off earlier.

1148
01:13:13,840 --> 01:13:19,320
So I couldn't chime in earlier, but just wanted to say hello.

1149
01:13:19,320 --> 01:13:20,560
Phenomenal, Robert.

1150
01:13:20,560 --> 01:13:26,160
And that's where, as I said, ambitious data scientists

1151
01:13:26,160 --> 01:13:30,160
like yourself can just take this one step further.

1152
01:13:30,160 --> 01:13:33,480
Today we're just doing summary statistics.

1153
01:13:33,480 --> 01:13:35,960
How many Thai strains have we observed?

1154
01:13:35,960 --> 01:13:38,520
How many Durban strains have we observed?

1155
01:13:38,520 --> 01:13:40,680
What's the average concentration?

1156
01:13:40,680 --> 01:13:42,800
So these are just low-hanging fruit.

1157
01:13:42,800 --> 01:13:48,600
This is just either raw data, simple statistics,

1158
01:13:48,600 --> 01:13:51,480
or conditional statistics.

1159
01:13:51,480 --> 01:13:54,000
Nothing super fancy.

1160
01:13:54,000 --> 01:13:56,560
But you can go much further.

1161
01:13:56,560 --> 01:13:58,680
So classification.

1162
01:13:58,680 --> 01:14:00,880
We just looked at a couple strains today.

1163
01:14:00,880 --> 01:14:03,760
But I think that may be the name of the game in the future,

1164
01:14:03,760 --> 01:14:07,920
is trying to put these into buckets.

1165
01:14:07,920 --> 01:14:10,800
So there's the Colombian.

1166
01:14:10,800 --> 01:14:13,280
There's Thai.

1167
01:14:13,280 --> 01:14:14,280
There's Durban.

1168
01:14:14,280 --> 01:14:17,920
In the past, we've looked at cheese.

1169
01:14:17,920 --> 01:14:26,000
We've looked at Kush, OG, cape, skunk.

1170
01:14:26,000 --> 01:14:30,040
So the strain names is sort of a fun.

1171
01:14:30,040 --> 01:14:33,720
It's fun because people come up with all these wild

1172
01:14:33,720 --> 01:14:35,520
and wacky strain names.

1173
01:14:35,520 --> 01:14:38,920
And there's been some papers that say, oh, this

1174
01:14:38,920 --> 01:14:42,000
may not be very consistent.

1175
01:14:42,000 --> 01:14:46,120
But I feel like there may be something to it.

1176
01:14:46,120 --> 01:14:49,520
So as I said today, there may be absolutely no difference

1177
01:14:49,520 --> 01:14:51,200
between all these strains.

1178
01:14:51,200 --> 01:14:51,880
But I don't know.

1179
01:14:51,880 --> 01:14:53,640
I just feel like there's just something

1180
01:14:53,640 --> 01:14:58,520
different between a plant that originated out of South Africa

1181
01:14:58,520 --> 01:15:04,720
and a plant that originated out of Colombia.

1182
01:15:04,720 --> 01:15:07,320
Who knows where the skunks originated?

1183
01:15:07,320 --> 01:15:10,240
And as I said, there's been a lot of breeding in California.

1184
01:15:10,240 --> 01:15:16,200
So I think there's ripe ground for classification.

1185
01:15:16,200 --> 01:15:19,360
But it's going to take your brilliant thinking.

1186
01:15:19,360 --> 01:15:22,720
So Rick, any thoughts, comments, questions real quick?

1187
01:15:22,720 --> 01:15:23,720
Yeah, real quick.

1188
01:15:23,720 --> 01:15:26,480
And I didn't want to take us over time more than we already

1189
01:15:26,480 --> 01:15:26,980
are.

1190
01:15:26,980 --> 01:15:29,600
But I just wanted to say thank you.

1191
01:15:29,600 --> 01:15:31,960
You were spot on with a lot of stuff

1192
01:15:31,960 --> 01:15:34,440
that I've been working on personally

1193
01:15:34,440 --> 01:15:37,760
in regards to breeding and tracking,

1194
01:15:37,760 --> 01:15:42,200
essentially, landraces through different data sets.

1195
01:15:42,200 --> 01:15:44,520
One you might be interested in and find helpful

1196
01:15:44,520 --> 01:15:49,120
is seedfinder.eu has an API where you can basically

1197
01:15:49,120 --> 01:15:51,480
track genetics through strain names.

1198
01:15:51,480 --> 01:15:53,020
So you type a strain name.

1199
01:15:53,020 --> 01:15:55,360
And then there's basically a family tree

1200
01:15:55,360 --> 01:15:58,440
of all the different crosses, if it was cubed,

1201
01:15:58,440 --> 01:16:02,120
or the entire genetic process behind it.

1202
01:16:02,120 --> 01:16:04,040
Usually a lot of backstory, too.

1203
01:16:04,040 --> 01:16:06,000
So it's helpful as a breeder to be

1204
01:16:06,000 --> 01:16:07,940
able to track the lineage of the strains

1205
01:16:07,940 --> 01:16:09,280
that you're working with.

1206
01:16:09,280 --> 01:16:13,840
Now you've helped me make a connection to lab results.

1207
01:16:13,840 --> 01:16:17,680
So now I can tie the genetic data

1208
01:16:17,680 --> 01:16:20,720
that I've been working with now with lab results

1209
01:16:20,720 --> 01:16:25,640
to do predictive analytics for crosses or see

1210
01:16:25,640 --> 01:16:29,120
what's a hybrid or a polyhybrid and try to track a landrace

1211
01:16:29,120 --> 01:16:30,280
to get a cut.

1212
01:16:30,280 --> 01:16:31,640
There's a lot of interesting work

1213
01:16:31,640 --> 01:16:34,240
being done with Node Labs, Jungle Boys,

1214
01:16:34,240 --> 01:16:38,140
with genetic preservation and tracking that aspect of things

1215
01:16:38,140 --> 01:16:41,560
as well, which is another good avenue if we could get

1216
01:16:41,560 --> 01:16:45,720
that data, obviously, later, but to track

1217
01:16:45,720 --> 01:16:47,720
the environmental variables and how.

1218
01:16:47,720 --> 01:16:51,280
Because a clone should be a standardized test

1219
01:16:51,280 --> 01:16:55,640
across different variables for a genetic sample.

1220
01:16:55,640 --> 01:16:59,680
So just want to say hi and lots of interesting things

1221
01:16:59,680 --> 01:17:00,560
that you're doing.

1222
01:17:00,560 --> 01:17:02,220
I really appreciate all the work that you've

1223
01:17:02,220 --> 01:17:03,520
done to tie everything together.

1224
01:17:03,520 --> 01:17:06,160
And it's been extremely valuable to me just

1225
01:17:06,160 --> 01:17:08,360
by discovering it today.

1226
01:17:08,360 --> 01:17:10,160
So thanks.

1227
01:17:10,160 --> 01:17:11,120
Phenomenal, Rick.

1228
01:17:11,120 --> 01:17:13,680
And let's keep the conversation going.

1229
01:17:13,680 --> 01:17:17,760
This seed finder has been brought up before.

1230
01:17:17,760 --> 01:17:20,720
And you're spot on.

1231
01:17:20,720 --> 01:17:23,600
And that's where a lot of the value is added,

1232
01:17:23,600 --> 01:17:26,560
is tying all these rich data sets together.

1233
01:17:26,560 --> 01:17:28,480
And I didn't know they had an API.

1234
01:17:28,480 --> 01:17:32,880
So I'm definitely going to have to explore this now.

1235
01:17:32,880 --> 01:17:34,640
Because I just want to see.

1236
01:17:34,640 --> 01:17:36,920
So you should definitely do it.

1237
01:17:36,920 --> 01:17:39,200
Oh, I will today.

1238
01:17:39,200 --> 01:17:43,960
And I'll share everything with you all.

1239
01:17:43,960 --> 01:17:46,720
Because as you said, all you need is a name.

1240
01:17:46,720 --> 01:17:48,880
And then you can start tying these together.

1241
01:17:48,880 --> 01:17:52,280
And that's where a lot of the beauty in data science

1242
01:17:52,280 --> 01:17:53,200
comes from.

1243
01:17:53,200 --> 01:17:56,240
It's just tying together interesting data

1244
01:17:56,240 --> 01:17:58,680
sets in interesting ways that maybe people

1245
01:17:58,680 --> 01:17:59,720
haven't thought of before.
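Tying lineage records to lab results by strain name mostly comes down to normalizing the names before joining. The sketch below assumes two small in-memory data sets. The records, field names, and the `normalize_name` helper are all hypothetical, and nothing here calls or reflects the actual seedfinder.eu API.

```python
import re


def normalize_name(name):
    """Lowercase a strain name and collapse punctuation and spacing."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


# Hypothetical lab results with inconsistent strain naming.
lab_results = [
    {"strain": "Durban Poison", "total_thc": 21.0},
    {"strain": "durban-poison ", "total_thc": 19.5},
    {"strain": "Colombian Gold", "total_thc": 18.2},
    {"strain": "Mystery OG", "total_thc": 24.1},
]

# Hypothetical lineage records keyed by strain name.
lineage = {
    "Durban Poison": {"origin": "South Africa"},
    "Colombian Gold": {"origin": "Colombia"},
}

lineage_by_key = {normalize_name(k): v for k, v in lineage.items()}

# Left join: keep every lab result, add lineage fields when a name matches.
joined = [
    {**row, **lineage_by_key.get(normalize_name(row["strain"]), {})}
    for row in lab_results
]

for row in joined:
    print(row["strain"].strip(), "->", row.get("origin", "no lineage match"))
```

Exact matching on normalized names will miss abbreviations and misspellings, so unmatched rows are kept rather than dropped and can be reviewed by hand.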

1246
01:18:05,040 --> 01:18:07,440
I love every bit of it, Rick.

1247
01:18:07,440 --> 01:18:11,960
And why don't we maybe pick up with the conversation there

1248
01:18:11,960 --> 01:18:12,560
next week?

1249
01:18:12,560 --> 01:18:15,240
Because I'll start exploring seed finder.

1250
01:18:15,240 --> 01:18:20,400
And as you pointed out, breeders have their own set of data

1251
01:18:20,400 --> 01:18:21,920
points that they're interested in.

1252
01:18:21,920 --> 01:18:24,960
Because we're talking a lot about, oh, Indica Sativa

1253
01:18:24,960 --> 01:18:27,320
from the consumer point of view.

1254
01:18:27,320 --> 01:18:30,560
Well, from the producer point of view,

1255
01:18:30,560 --> 01:18:34,520
Indica and Sativa have their whole other wide variety

1256
01:18:34,520 --> 01:18:36,240
of associations.

1257
01:18:36,240 --> 01:18:42,600
For example, breeders are super interested in how many days

1258
01:18:42,600 --> 01:18:44,880
is it going to take to flower?

1259
01:18:44,880 --> 01:18:48,760
How tall is the plant going to be?

1260
01:18:48,760 --> 01:18:53,320
So they're super concerned about the morphology of the plant.

1261
01:18:53,320 --> 01:18:55,240
How big are the leaves?

1262
01:18:55,240 --> 01:18:58,800
How far apart are the buds going to be?

1263
01:18:58,800 --> 01:19:02,520
So these are things that are of utmost importance to breeders.

1264
01:19:02,520 --> 01:19:07,000
Because if you're growing a Sativa,

1265
01:19:07,000 --> 01:19:08,800
it's going to be 8 feet tall.

1266
01:19:08,800 --> 01:19:11,640
Or an Indica that's 5 feet tall, that matters

1267
01:19:11,640 --> 01:19:17,360
for where you hang your lights.

1268
01:19:17,360 --> 01:19:20,600
And then, of course, the days to flower.

1269
01:19:20,600 --> 01:19:24,720
I've heard everybody wants that real quick turnaround time.

1270
01:19:24,720 --> 01:19:27,120
And that's typically the Indica type strains

1271
01:19:27,120 --> 01:19:30,440
with the shorter flowering periods.

1272
01:19:30,440 --> 01:19:33,040
And so that's a whole other reason why maybe you

1273
01:19:33,040 --> 01:19:36,680
don't see a bunch of Sativas out there on the market.

1274
01:19:36,680 --> 01:19:39,120
Because generally, from my understanding,

1275
01:19:39,120 --> 01:19:43,160
they have a much longer time to flower.

1276
01:19:46,240 --> 01:19:49,200
Yeah, and that's unfortunate with how things are going

1277
01:19:49,200 --> 01:19:54,720
with THC percentage being the primary goal

1278
01:19:54,720 --> 01:19:56,000
to be the highest.

1279
01:19:56,000 --> 01:19:58,480
And I think a lot of strains and cultivators

1280
01:19:58,480 --> 01:19:59,840
are looking for that.

1281
01:19:59,840 --> 01:20:01,120
Or it's difficult to grow.

1282
01:20:01,120 --> 01:20:03,600
That particular landrace can be squirrely.

1283
01:20:03,600 --> 01:20:05,320
And it's unfortunate because you miss out

1284
01:20:05,320 --> 01:20:08,080
on some of the outliers like you had identified

1285
01:20:08,080 --> 01:20:09,880
with the terpenes.

1286
01:20:09,880 --> 01:20:13,280
It had a 13% or 14% THC value, which a lot of people

1287
01:20:13,280 --> 01:20:14,040
would overlook.

1288
01:20:14,040 --> 01:20:18,320
But there could be some entourage effect

1289
01:20:18,320 --> 01:20:21,520
or that double amount of whatever that terpene value

1290
01:20:21,520 --> 01:20:24,880
was to put you on your butt or heal you or meet

1291
01:20:24,880 --> 01:20:29,080
a specific need or something.

1292
01:20:29,080 --> 01:20:31,720
So yeah.

1293
01:20:31,720 --> 01:20:34,640
You've got my mind jogging.

1294
01:20:34,640 --> 01:20:37,960
So I'll be thinking about this for next week.

1295
01:20:37,960 --> 01:20:39,960
A potential thing is let's just maybe

1296
01:20:39,960 --> 01:20:42,880
just try to augment some of these variables

1297
01:20:42,880 --> 01:20:46,800
that the producers may care more about.

1298
01:20:46,800 --> 01:20:49,600
So say go to Seed Finder and get things

1299
01:20:49,600 --> 01:20:53,960
like what's the expected flowering date?

1300
01:20:53,960 --> 01:20:57,960
Or maybe they have the expected height or expected yield.

1301
01:20:57,960 --> 01:21:01,320
So what if we got some of those interesting data points

1302
01:21:01,320 --> 01:21:05,240
and then once again tied them in and started to see

1303
01:21:05,240 --> 01:21:08,080
what insights we can draw?

1304
01:21:08,080 --> 01:21:11,560
Maybe there is some predictive factor

1305
01:21:11,560 --> 01:21:15,480
about how long something is going to be growing

1306
01:21:15,480 --> 01:21:17,960
and how much is going to be sold.

1307
01:21:17,960 --> 01:21:20,800
And so for example, once again, my mind's

1308
01:21:20,800 --> 01:21:22,760
racing a million miles a minute.

1309
01:21:22,760 --> 01:21:25,680
But we were looking at factors that

1310
01:21:25,680 --> 01:21:28,560
make the cultivator successful.

1311
01:21:28,560 --> 01:21:32,280
So we may want to look at all the successful cultivators

1312
01:21:32,280 --> 01:21:36,760
and look at what varieties are they growing?

1313
01:21:36,760 --> 01:21:41,280
Are they growing quick flowering strains?

1314
01:21:41,280 --> 01:21:42,760
Or who knows?

1315
01:21:42,760 --> 01:21:44,920
Maybe some of the successful cultivators

1316
01:21:44,920 --> 01:21:48,400
are growing some of the slow flowering strains.

1317
01:21:48,400 --> 01:21:52,920
So I think there's a million and one things we could look at.

1318
01:21:52,920 --> 01:21:56,440
So I encourage you all to explore the data.

1319
01:21:59,600 --> 01:22:02,560
Well, let's get on this.

1320
01:22:02,560 --> 01:22:06,680
As I said, my mind's racing, land racing.

1321
01:22:06,680 --> 01:22:09,080
So let's keep this up.

1322
01:22:09,080 --> 01:22:11,680
Keep the conversation going until next week.

1323
01:22:11,680 --> 01:22:15,320
I'll make sure to get all the lab results published

1324
01:22:15,320 --> 01:22:17,400
and delivered to you today.

1325
01:22:17,400 --> 01:22:20,280
So thank you for coming.

1326
01:22:20,280 --> 01:22:22,640
One of the reasons you came was rich cannabis data.

1327
01:22:22,640 --> 01:22:25,440
And I'll make sure to get that to you.

1328
01:22:25,440 --> 01:22:28,640
Until next week, thank you for helping

1329
01:22:28,640 --> 01:22:30,520
advance cannabis science.

1330
01:22:30,520 --> 01:22:32,240
Thank you for coming.

1331
01:22:32,240 --> 01:22:34,000
Thank you for being awesome.

1332
01:22:34,000 --> 01:22:37,680
Now go on and go enjoy your day.

1333
01:22:37,680 --> 01:22:38,360
Way cool.

1334
01:22:38,360 --> 01:22:40,520
Thank you, Keegan.

1335
01:22:40,520 --> 01:22:42,040
Thank you all.

1336
01:22:42,040 --> 01:22:42,560
Thank you.

1337
01:22:42,560 --> 01:22:43,760
See you next week.

1338
01:22:43,760 --> 01:22:46,160
Until next week, everyone.

1339
01:22:46,160 --> 01:22:47,680
Keep your nose to the grindstone

1340
01:22:47,680 --> 01:22:50,080
and have fun while you do it.

1341
01:22:50,080 --> 01:22:51,360
And rock your day.

1342
01:22:51,360 --> 01:22:54,360
["The Star-Spangled Banner"]

