1
00:00:00,000 --> 00:00:17,720
Well, Heather, just to fill you in, so for the Snap-Fu yesterday, took Cannlytics to Las

2
00:00:17,720 --> 00:00:21,480
Vegas to attend the Cannabis Conference here.

3
00:00:21,480 --> 00:00:28,000
So doing a bit of networking and also trying to learn as much as I can, you know, that

4
00:00:28,000 --> 00:00:33,560
we can share with the group and hopefully everybody can learn from that.

5
00:00:33,560 --> 00:00:38,480
So it's good to see you.

6
00:00:38,480 --> 00:00:41,320
I see Keegan.

7
00:00:41,320 --> 00:00:42,600
Hopefully more people roll in.

8
00:00:42,600 --> 00:00:44,360
Should be good.

9
00:00:44,360 --> 00:00:55,080
Well I can go ahead and share a bit and then we can just take it from there.

10
00:00:55,080 --> 00:01:07,960
Maybe a shorter one this morning, but something's better than nothing.

11
00:01:07,960 --> 00:01:10,840
Yay.

12
00:01:10,840 --> 00:01:19,600
I heard a person.

13
00:01:19,600 --> 00:01:38,360
Great, so just to share a bit with you, so here at the Cannabis Conference here in Las

14
00:01:38,360 --> 00:01:43,240
Vegas, so I just thought I would share with you just some of the, you know, the topics

15
00:01:43,240 --> 00:01:50,120
that have come up and, you know, show where the data that's publicly available could be

16
00:01:50,120 --> 00:01:56,560
used to analyze some of the topics that were brought up.

17
00:01:56,560 --> 00:02:16,400
So it's a lot of cultivation, so it's a good place to really get motivated to start a cultivation.

18
00:02:16,400 --> 00:02:24,480
And so what I'll take away from this is, you know, now is as good a time as any to go ahead

19
00:02:24,480 --> 00:02:28,560
and apply for a license and get going.

20
00:02:28,560 --> 00:02:38,600
So there's, of course, you want to get a lot of planning in, and so, you know, Sun Tzu's

21
00:02:38,600 --> 00:02:42,360
got a motto that, you know, whoever plans the most tends to win.

22
00:02:42,360 --> 00:02:47,720
And so that's a key, but from what I'm seeing, the people who've been successful are the

23
00:02:47,720 --> 00:02:55,800
people who started, you know, they just started somehow, some way, and they've done quite

24
00:02:55,800 --> 00:02:57,320
well for themselves.

25
00:02:57,320 --> 00:03:08,200
So lesson learned is go ahead and start, and there may be a learning by doing, so there

26
00:03:08,200 --> 00:03:14,760
may be a learning curve as you go, but now is as good a time as ever to jump in.

27
00:03:14,760 --> 00:03:24,800
And so, you know, so the question is, you know, where and how.

28
00:03:24,800 --> 00:03:26,720
So that's, you know, the licensing.

29
00:03:26,720 --> 00:03:33,320
And then, of course, they talked about, okay, as a cultivator, you know, what are some of

30
00:03:33,320 --> 00:03:34,920
the things that you may be looking for?

31
00:03:34,920 --> 00:03:38,160
So what are some of these data points that you're looking for?

32
00:03:38,160 --> 00:03:49,560
And so one thing that they talked about that we can analyze here is basically we're trying

33
00:03:49,560 --> 00:03:54,640
to break down cannabis into, you know, four or five groups here.

34
00:03:54,640 --> 00:04:01,120
So that way, you know, you can kind of distinguish between, okay, what are you getting?

35
00:04:01,120 --> 00:04:04,040
What are going to be the effects?

36
00:04:04,040 --> 00:04:09,960
So they've identified essentially five main types.

37
00:04:09,960 --> 00:04:13,400
You've got your high THC to CBD.

38
00:04:13,400 --> 00:04:17,560
So this is going to be the intoxicating cannabis.

39
00:04:17,560 --> 00:04:24,720
And you know, you may not be on the couch.

40
00:04:24,720 --> 00:04:30,080
Next, there's the near unitary THC to CBD.

41
00:04:30,080 --> 00:04:31,960
And so these are going to be your hybrids.

42
00:04:31,960 --> 00:04:34,920
And so these are going to give you a mixed effect.

43
00:04:34,920 --> 00:04:39,560
So these, you know, you may wind up on the couch.

44
00:04:39,560 --> 00:04:45,240
And you know, they're going to be a bit more sedative, relaxing, if that's what you're

45
00:04:45,240 --> 00:04:47,120
looking for.

46
00:04:47,120 --> 00:04:52,920
Then there's type three, which is your high CBD to THC.

47
00:04:52,920 --> 00:04:58,560
And so, you know, of course, the higher the CBD to THC ratio, then the less intoxicating

48
00:04:58,560 --> 00:05:02,560
it's going to be, and these are going to be the people looking for the more medicinal

49
00:05:02,560 --> 00:05:09,560
effects, people looking for soothing relaxation, passive relaxation, and you know, they're

50
00:05:09,560 --> 00:05:12,520
still looking to be quite present.

51
00:05:12,520 --> 00:05:13,840
CBG.

52
00:05:13,840 --> 00:05:18,120
And so those are the three main types.

53
00:05:18,120 --> 00:05:20,080
So you can really stick to those.

54
00:05:20,080 --> 00:05:26,520
And then just if you were going to toss in two more types, you could toss in the CBG,

55
00:05:26,520 --> 00:05:32,240
where you know, CBG is present and ideally present at above average concentrations.

56
00:05:32,240 --> 00:05:38,040
And then the fifth type is, you know, no major cannabinoids.

57
00:05:38,040 --> 00:05:46,200
And then this will be a little trickier to parse out and identify.

58
00:05:46,200 --> 00:05:52,440
So I thought today, why don't we try to identify the first four types?

59
00:05:52,440 --> 00:06:00,080
And we can do this with Washington state data so we can get quite granular data.

60
00:06:00,080 --> 00:06:04,520
We can actually get every data point through the Freedom of Information Act.

61
00:06:04,520 --> 00:06:13,760
So we can get the whole market snapshot and we can look at these four types here.

62
00:06:13,760 --> 00:06:19,600
So without further ado, just going to go ahead and read in the data.

63
00:06:19,600 --> 00:06:23,160
And we'll be off to the races.

64
00:06:23,160 --> 00:06:33,560
Okay, so we've got a handful of data points.

65
00:06:33,560 --> 00:06:39,360
So let's give it 30 seconds or so.

66
00:06:39,360 --> 00:06:44,840
And then the first thing I'm going to do is just look at all the flower data.

67
00:06:44,840 --> 00:06:48,400
We could look at concentrates and we may actually do that next.

68
00:06:48,400 --> 00:06:52,680
But just for simplicity's sake, just so we're looking at apples to apples, we'll be looking

69
00:06:52,680 --> 00:06:59,200
at flower to flower.

70
00:06:59,200 --> 00:07:13,240
So almost two million observations, about a million flower samples of the, you know,

71
00:07:13,240 --> 00:07:27,720
approximately 240,000 flower samples, about 17,000 or almost 18,000

72
00:07:27,720 --> 00:07:30,360
had CBG.

73
00:07:30,360 --> 00:07:35,680
So that's about seven and a half percent of all the samples.

74
00:07:35,680 --> 00:07:46,640
So you may even classify this whole group here as your type four, your cannabis with

75
00:07:46,640 --> 00:07:50,200
CBG.

76
00:07:50,200 --> 00:07:56,560
What I technically defined it as was CBG at above-average concentrations.

77
00:07:56,560 --> 00:08:06,600
So what we can do is, well, we can just take the average CBG present in the flower with

78
00:08:06,600 --> 00:08:11,600
CBG, which is about 0.8%.

79
00:08:11,600 --> 00:08:27,640
So then we could identify all the samples with above average CBG.
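A minimal sketch of this step, using only the Python standard library. The sample records and the field name `cbg` (percent by weight) are hypothetical, chosen for illustration; the session's actual code is not shown in the transcript.

```python
from statistics import mean

# Hypothetical flower samples; "cbg" is an assumed field name (percent by weight).
samples = [
    {"id": "a", "cbg": 0.0},
    {"id": "b", "cbg": 0.5},
    {"id": "c", "cbg": 1.0},
    {"id": "d", "cbg": 1.5},
]

# Keep only the samples where CBG was detected at all.
with_cbg = [s for s in samples if s["cbg"] > 0]

# Average CBG among the samples that contain CBG.
average_cbg = mean(s["cbg"] for s in with_cbg)

# Candidate "type 4": CBG strictly above that average.
type_4 = [s for s in with_cbg if s["cbg"] > average_cbg]
```

With the real data, the average would be the roughly 0.8% figure mentioned above rather than the toy value here.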

80
00:08:27,640 --> 00:08:43,880
Let's see if we can look at that data.

81
00:08:43,880 --> 00:08:47,800
So we have no CBG.

82
00:08:47,800 --> 00:08:50,720
Let's see what columns we do have here.

83
00:08:50,720 --> 00:08:59,960
Oh, that makes a lot of sense here.

84
00:08:59,960 --> 00:09:19,960
Now let's just look at a histogram just to see what it may look like.

85
00:09:19,960 --> 00:09:21,720
So this is a factor.

86
00:09:21,720 --> 00:09:24,760
So we may have some with miscoding.

87
00:09:24,760 --> 00:09:29,520
So we're going to go ahead and drop the outliers.

88
00:09:29,520 --> 00:09:34,160
That way we may be able to get a better plot here.

89
00:09:34,160 --> 00:09:48,360
So real quick, let's just restrict this to the bottom 95%, below the 95th percentile.
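One way this restriction could look, as a sketch under stated assumptions: the readings are made up, and `below_percentile` is a hypothetical helper, not a function from the session. It uses `statistics.quantiles` from the standard library to find the cutoff.

```python
from statistics import quantiles

def below_percentile(values, percent=95):
    """Return the values at or below the given percentile cutoff."""
    # quantiles(..., n=100) returns the 99 cut points between percentiles,
    # so index percent - 1 is the cut point for that percentile.
    cutoff = quantiles(values, n=100, method="inclusive")[percent - 1]
    return [v for v in values if v <= cutoff]

# Hypothetical CBG readings with one value (80.0) that looks miscoded.
readings = [0.2, 0.4, 0.5, 0.6, 0.8, 0.9, 1.0, 1.1, 1.3, 80.0]
trimmed = below_percentile(readings)
```

The trimmed list is only meant for plotting; the dropped values stay in the full data set for any later analysis.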

90
00:09:48,360 --> 00:10:00,280
So bear with me here while we get this nice and plotted.

91
00:10:00,280 --> 00:10:09,800
So we're just restricting this.

92
00:10:09,800 --> 00:10:26,120
And then we'll see if we can't get a decent histogram once we exclude those outliers.

93
00:10:26,120 --> 00:10:30,400
So here's a better histogram here.

94
00:10:30,400 --> 00:10:39,060
And so these are going to be the presence of CBG in our type 4.

95
00:10:39,060 --> 00:10:49,560
So we're basically saying, OK, type 4 is basically going to be about 1%.

96
00:10:49,560 --> 00:10:51,400
So we actually defined it earlier as 0.8%.

97
00:10:51,400 --> 00:11:04,200
So it's going to be about 0.8%, about 1% or above CBG content with a logarithmic distribution

98
00:11:04,200 --> 00:11:19,520
here, where you've got the top end of the distribution just shy of 5%, so about 4.5%.

99
00:11:19,520 --> 00:11:36,480
So let's just go ahead and look at the whole distribution of CBG.

100
00:11:36,480 --> 00:11:37,480
That's right.

101
00:11:37,480 --> 00:11:42,180
And let's exclude these for outliers as well.

102
00:11:42,180 --> 00:11:47,400
So bear with me while we just look at it.

103
00:11:47,400 --> 00:12:11,080
Let's look at CBG with the outliers as well.

104
00:12:11,080 --> 00:12:25,060
So once again, we actually see a logarithmic scale for CBG with the mean at about 0.8%.

105
00:12:25,060 --> 00:12:32,200
So I think that's the concentration you're looking for if you're trying to grow a CBG

106
00:12:32,200 --> 00:12:37,520
type 4 strain, is you're looking for 0.8%, 1% or above.

107
00:12:37,520 --> 00:12:42,240
OK, so we've drummed that horse into the ground.

108
00:12:42,240 --> 00:12:50,100
So now let's look at type 1, 2, and 3.

109
00:12:50,100 --> 00:12:57,320
So we can probably make short work of this, believe it or not.

110
00:12:57,320 --> 00:13:22,640
So we basically need to define the THC to CBD ratio here.

111
00:13:22,640 --> 00:13:35,640
So here we're just taking, actually we actually want a CBD to THC ratio.

112
00:13:35,640 --> 00:13:49,680
So let's reverse this real quick.

113
00:13:49,680 --> 00:13:56,960
Now we're going to calculate a CBD to THC ratio in the flower.

114
00:13:56,960 --> 00:14:00,960
Let's see if we can do that without too much trouble here.

115
00:14:00,960 --> 00:14:06,240
All right.

116
00:14:06,240 --> 00:14:14,280
We can, once again, we may need to drop outliers for plotting, but let's see if we can't get

117
00:14:14,280 --> 00:14:23,200
a histogram nonetheless.

118
00:14:23,200 --> 00:14:29,440
All right, so let's go ahead and identify our three types here before we drop the outliers.

119
00:14:29,440 --> 00:14:34,480
So simple enough.

120
00:14:34,480 --> 00:14:38,480
Define type 1 as high THC to CBD.

121
00:14:38,480 --> 00:14:43,000
In other words, low CBD to THC.

122
00:14:43,000 --> 00:14:50,600
So we're going to define low as less than 0.9 CBD to THC ratio.

123
00:14:50,600 --> 00:14:56,240
So this is just an arbitrary number that I just pulled out of my hat.

124
00:14:56,240 --> 00:15:03,200
So this may need more rigorous definitions here.

125
00:15:03,200 --> 00:15:14,480
So just starting somewhere just to show you proof of concept.

126
00:15:14,480 --> 00:15:18,520
So we're going to locate all the data.

127
00:15:18,520 --> 00:15:27,400
So we want basically all of the flower data where the THC, or actually where the CBD to

128
00:15:27,400 --> 00:15:33,680
THC ratio is less than or equal to 0.9.

129
00:15:33,680 --> 00:15:41,400
So simple enough.

130
00:15:41,400 --> 00:15:45,800
Let's count how many there are.

131
00:15:45,800 --> 00:15:48,280
Great.

132
00:15:48,280 --> 00:16:01,880
And so let's calculate, OK, what is that out of all of the flower samples?

133
00:16:01,880 --> 00:16:14,560
Well, we may need, this could either just be the breakdown of the market or we may need

134
00:16:14,560 --> 00:16:18,880
to change our definitions of CBD to THC ratio.

135
00:16:18,880 --> 00:16:29,240
So it's looking here like about 92.5% of all of the flower sold is type 1.

136
00:16:29,240 --> 00:16:34,280
And that may very well be the case.

137
00:16:34,280 --> 00:16:42,120
You may, type 1 flower may be in high demand.

138
00:16:42,120 --> 00:16:53,760
So let's go ahead and identify these other two types just for completeness' sake.

139
00:16:53,760 --> 00:16:58,000
Bear with me as I code this up in front of you.

140
00:16:58,000 --> 00:17:18,480
So basically we want the CBD THC ratio to be greater than 0.9 and we want it to be less

141
00:17:18,480 --> 00:17:21,480
than 1.1.

142
00:17:21,480 --> 00:17:30,800
And that should be our type 2.

143
00:17:30,800 --> 00:17:37,000
And let's calculate that as a proportion of the total.

144
00:17:37,000 --> 00:17:39,680
About 2%.

145
00:17:39,680 --> 00:17:47,880
Like I said, we may need to widen these bounds a bit for the type 2 identifier.

146
00:17:47,880 --> 00:17:57,160
So that's why it may be nice to actually look at a scatter plot here.

147
00:17:57,160 --> 00:17:59,280
That would actually probably be a good idea here.

148
00:17:59,280 --> 00:18:04,440
So let's identify type 3 and then let's look at some scatter plots to begin to visualize

149
00:18:04,440 --> 00:18:06,640
the data.

150
00:18:06,640 --> 00:18:09,840
So real quick, type 3.

151
00:18:09,840 --> 00:18:21,760
It's going to be everything with the CBD THC ratio greater than or equal to 1.

152
00:18:21,760 --> 00:18:25,760
Simple enough.

153
00:18:25,760 --> 00:18:34,720
Calculate that as a percentage of the total for completeness' sake.

154
00:18:34,720 --> 00:18:43,560
About half a percent.
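The three ratio-based types and their market shares can be sketched as follows. This is an illustration, not the session's code: the sample values and the `classify` helper are hypothetical. It also assumes type 3 begins where type 2 ends (1.1) so that the groups do not overlap, whereas the spoken cutoff for type 3 was 1, and it assumes a sample with zero THC counts as type 3 when any CBD is present.

```python
from collections import Counter

def classify(cbd, thc, low=0.9, high=1.1):
    """Label a sample by its CBD to THC ratio (cutoffs are ad hoc)."""
    if thc == 0:
        # Assumption: with no THC to divide by, any CBD makes it CBD-dominant.
        return "type_3" if cbd > 0 else "unclassified"
    ratio = cbd / thc
    if ratio <= low:
        return "type_1"  # high THC to CBD
    if ratio < high:
        return "type_2"  # near 1:1
    return "type_3"      # high CBD to THC

# Hypothetical (cbd, thc) pairs in percent by weight.
samples = [(0.1, 20.0), (0.2, 18.0), (8.0, 8.0), (12.0, 0.5)]

labels = [classify(cbd, thc) for cbd, thc in samples]
counts = Counter(labels)

# Share of each type out of all flower samples.
shares = {label: count / len(samples) for label, count in counts.items()}
```

Keeping the cutoffs as parameters makes it easy to widen the type 2 band later without touching the rest of the logic.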

155
00:18:43,560 --> 00:18:46,840
So that may just be how the market shakes out.

156
00:18:46,840 --> 00:18:55,400
The market may just be looking for a low proportion of the high CBD THC.

157
00:18:55,400 --> 00:18:58,640
And so that could be a quite interesting data point.

158
00:18:58,640 --> 00:19:09,920
One second. I apologize for the hot rods.

159
00:19:09,920 --> 00:19:12,440
Okay.

160
00:19:12,440 --> 00:19:19,120
So that may be saying, okay, the market may actually not be demanding very much high CBD

161
00:19:19,120 --> 00:19:20,840
cannabis.

162
00:19:20,840 --> 00:19:28,420
So if you're in Washington, you may want to look at these numbers and try to position

163
00:19:28,420 --> 00:19:29,420
yourself well.

164
00:19:29,420 --> 00:19:30,420
Right?

165
00:19:30,420 --> 00:19:37,140
You may, there may not be much competition for the high CBD flower if there's only a

166
00:19:37,140 --> 00:19:38,860
small amount being grown.

167
00:19:38,860 --> 00:19:43,040
So it could be a place to niche yourself.

168
00:19:43,040 --> 00:19:49,760
And there was a speaker here at the cannabis conference who did just that in Oregon.

169
00:19:49,760 --> 00:19:53,380
And so it can be done.

170
00:19:53,380 --> 00:19:56,920
So maybe there's more room for type 3.

171
00:19:56,920 --> 00:20:00,200
Maybe type 1 is quite saturated in the market.

172
00:20:00,200 --> 00:20:04,400
So possibility.

173
00:20:04,400 --> 00:20:07,240
Enough of just looking at the raw data.

174
00:20:07,240 --> 00:20:10,240
Let's look at a couple of plots here.

175
00:20:10,240 --> 00:20:19,040
So let's see if we can make a few scatter plots here.

176
00:20:19,040 --> 00:20:27,560
So first things first, let's see if we can't just look at a scatter plot of just the flower

177
00:20:27,560 --> 00:20:34,960
data as a whole.

178
00:20:34,960 --> 00:20:47,480
So here we just want to look at flower data.

179
00:20:47,480 --> 00:20:55,520
Well we will need this plotting library.

180
00:20:55,520 --> 00:20:57,740
And yes, it's getting a little rowdy here.

181
00:20:57,740 --> 00:21:08,520
So hopefully we can finish our analysis without too much more noise for you guys here.

182
00:21:08,520 --> 00:21:11,260
We've got a lot of data that we're plotting.

183
00:21:11,260 --> 00:21:20,520
So we may, it looks like, want to exclude outliers.

184
00:21:20,520 --> 00:21:28,600
So let's do that and see if we can't get a better visualization of this data.

185
00:21:28,600 --> 00:21:39,040
So as you've noticed, excluding outliers is a go-to strategy of mine, at least for creating

186
00:21:39,040 --> 00:21:41,520
visualizations.

187
00:21:41,520 --> 00:21:48,400
It just, as you see these outliers, they ruin the scale.

188
00:21:48,400 --> 00:21:51,920
So it's hard to visualize the data.

189
00:21:51,920 --> 00:21:55,920
So simple enough.

190
00:21:55,920 --> 00:22:02,680
We'll restrict to the bottom 95%, below the 95th percentile.

191
00:22:02,680 --> 00:22:07,160
Because we're basically saying, okay, that top 5% they're outliers.

192
00:22:07,160 --> 00:22:15,000
We're not sure if that's miscoded data or what have you.

193
00:22:15,000 --> 00:22:18,400
Albeit that can be an interesting segment of the data set.

194
00:22:18,400 --> 00:22:27,480
So you don't want to just throw away the top 5% in your analysis because they're often

195
00:22:27,480 --> 00:22:33,360
some of the most dynamic observations.

196
00:22:33,360 --> 00:22:41,120
So let's create a quick plot here.

197
00:22:41,120 --> 00:22:49,680
Here's an upper bound where...

198
00:22:49,680 --> 00:23:04,760
So we may actually need to restrict this twice, once on CBD and once on THC.
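A sketch of restricting on both columns at once, again with made-up data. The field names `cbd` and `thc` and the helper `upper_bound` are assumptions for illustration only.

```python
from statistics import quantiles

def upper_bound(values, percent=95):
    """Return the cut point for the given percentile."""
    return quantiles(values, n=100, method="inclusive")[percent - 1]

# Hypothetical samples: one has an implausible CBD value, one an implausible THC value.
thc_values = [15.0, 16.0, 17.0, 18.0, 19.0, 20.0, 21.0, 22.0, 23.0, 95.0]
cbd_values = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 60.0, 0.9]
samples = [{"cbd": c, "thc": t} for c, t in zip(cbd_values, thc_values)]

# Compute one cutoff per column from the full data set.
cbd_max = upper_bound([s["cbd"] for s in samples])
thc_max = upper_bound([s["thc"] for s in samples])

# Keep a sample only if it is within bounds on both CBD and THC.
restricted = [
    s for s in samples
    if s["cbd"] <= cbd_max and s["thc"] <= thc_max
]
```

Computing both cutoffs before filtering keeps the two restrictions independent of the order in which they are applied.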

199
00:23:04,760 --> 00:23:08,440
Okay.

200
00:23:08,440 --> 00:23:19,720
So bear with me here.

201
00:23:19,720 --> 00:23:26,600
Bear with me.

202
00:23:26,600 --> 00:23:28,080
So this is just sort of ad hoc.

203
00:23:28,080 --> 00:23:30,240
We could clean this up later.

204
00:23:30,240 --> 00:23:41,600
We're just trying to get this plot made real quick.

205
00:23:41,600 --> 00:23:43,600
Great.

206
00:23:43,600 --> 00:24:05,480
Do we now have...

207
00:24:05,480 --> 00:24:10,440
That is quite the scatter plot, isn't it?

208
00:24:10,440 --> 00:24:13,880
It looks like there's something just all across the board here.

209
00:24:13,880 --> 00:24:25,080
So that...

210
00:24:25,080 --> 00:24:43,680
Let's put CBD on the y-axis here because that would make more sense.

211
00:24:43,680 --> 00:24:47,680
Let's try this one more time.

212
00:24:47,680 --> 00:24:58,800
So this is just going to be a plot of CBD to THC of all flower data in Washington State.

213
00:24:58,800 --> 00:25:11,560
So let's go ahead and put some labels on this just so we can see what we're looking at here.

214
00:25:11,560 --> 00:25:21,880
So CBD to THC.

215
00:25:21,880 --> 00:25:37,280
Like I said, hot rods.

216
00:25:37,280 --> 00:25:38,280
Okay.

217
00:25:38,280 --> 00:25:45,720
So here we have a plot of CBD to THC.

218
00:25:45,720 --> 00:25:50,560
And so as you can see, we basically have something all across the map.

219
00:25:50,560 --> 00:25:55,000
There's strains with high CBD, low THC.

220
00:25:55,000 --> 00:25:59,760
There's strains with high THC, high CBD.

221
00:25:59,760 --> 00:26:01,860
So there's something all over the map.

222
00:26:01,860 --> 00:26:06,480
So that's why we're trying to break this, hopefully, into four groups.

223
00:26:06,480 --> 00:26:15,360
So we basically have quadrant one.

224
00:26:15,360 --> 00:26:16,840
So we'll start with quadrant one.

225
00:26:16,840 --> 00:26:23,880
So quadrant one is high THC, high CBD.

226
00:26:23,880 --> 00:26:33,440
So those are going to be your near unitary, perhaps, your near unitary cannabis.

227
00:26:33,440 --> 00:26:40,120
Then you're going to have quadrant two, low THC, high CBD.

228
00:26:40,120 --> 00:26:44,600
So that's going to be your high CBD strains.

229
00:26:44,600 --> 00:26:55,400
And then it looks like the quadrant three and quadrant four are type one, where it's

230
00:26:55,400 --> 00:27:03,020
basically high THC, low CBD.

231
00:27:03,020 --> 00:27:13,240
So let's see if we can't plot the four different types and perhaps be able to have a bit more

232
00:27:13,240 --> 00:27:14,240
definition.

233
00:27:14,240 --> 00:27:24,400
Once again, there may have to be some fancy footwork done to handle outliers, but we'll

234
00:27:24,400 --> 00:27:31,440
deal with that as the need arises.

235
00:27:31,440 --> 00:27:37,240
So bear with me putting CBD on the y-axis.

236
00:27:37,240 --> 00:27:38,840
Right?

237
00:27:38,840 --> 00:27:54,280
So now let's see if we can't plot the four types.

238
00:27:54,280 --> 00:27:57,920
And this could be interesting here.

239
00:27:57,920 --> 00:28:10,840
Right, so let's do, we've done it, but now let's repeat it, restricting the outliers,

240
00:28:10,840 --> 00:28:15,560
just so we can visualize the majority of the data better.

241
00:28:15,560 --> 00:28:22,880
So what we can simply do, toss the restriction.

242
00:28:22,880 --> 00:28:33,840
We'll just define the restriction before we define the types, simple enough.

243
00:28:33,840 --> 00:28:46,320
And we'll simply define our types with the restricted data set here.

244
00:28:46,320 --> 00:28:53,320
And hopefully we get a decent looking figure.

245
00:28:53,320 --> 00:29:01,160
This is the first time I've, type four will remain the same.

246
00:29:01,160 --> 00:29:05,640
We may just want to use type four data where we've restricted the outliers.

247
00:29:05,640 --> 00:29:12,760
So let's recalculate these three types here, simple enough.

248
00:29:12,760 --> 00:29:19,920
And we're going to use type four data, which has been restricted.

249
00:29:19,920 --> 00:29:22,040
And let's make this figure one more time.

250
00:29:22,040 --> 00:29:30,480
See if we have a bit more definition.

251
00:29:30,480 --> 00:29:39,320
Interesting.

252
00:29:39,320 --> 00:29:59,200
So something didn't quite take here.

253
00:29:59,200 --> 00:30:10,040
Somehow that didn't work quite well.

254
00:30:10,040 --> 00:30:23,080
Okay, so another strategy that we're going to do is let's just calculate types with the

255
00:30:23,080 --> 00:30:28,120
full data set.

256
00:30:28,120 --> 00:30:38,120
So going back and we'll just basically going to do what we did here.

257
00:30:38,120 --> 00:30:42,840
And like I said, I'm just doing this kind of quick and dirty, just so we can get these

258
00:30:42,840 --> 00:30:44,840
figures up.

259
00:30:44,840 --> 00:31:03,040
Let's go ahead and restrict these types to the 95th percentile.

260
00:31:03,040 --> 00:31:10,360
Okay.

261
00:31:10,360 --> 00:31:20,640
So this may or may not work, but it's worth a shot here because I think we could get a

262
00:31:20,640 --> 00:31:26,920
really, really interesting figure.

263
00:31:26,920 --> 00:31:35,280
So just bear with me as I put this up and 30 likes.

264
00:31:35,280 --> 00:31:42,320
Okay, so three.

265
00:31:42,320 --> 00:31:46,960
Simple enough.

266
00:31:46,960 --> 00:32:00,680
We may have removed the outliers from the data to the extent that we may now be able to plot

267
00:32:00,680 --> 00:32:11,720
it in a decent manner here.

268
00:32:11,720 --> 00:32:23,040
And let's see what we have.

269
00:32:23,040 --> 00:32:24,040
Okay.

270
00:32:24,040 --> 00:32:50,440
We still have not, I still have not successfully removed

271
00:32:50,440 --> 00:32:52,400
these outliers here.

272
00:32:52,400 --> 00:32:54,320
That's okay.

273
00:32:54,320 --> 00:33:02,040
We'll do what we did with the restricted flower data essentially.

274
00:33:02,040 --> 00:33:10,120
So bear with me five more minutes or maybe hopefully less than that.

275
00:33:10,120 --> 00:33:12,680
And then we can get this sorted away.

276
00:33:12,680 --> 00:33:15,680
All right.

277
00:33:15,680 --> 00:33:23,080
Thank you.

278
00:33:23,080 --> 00:33:33,280
The traffic's picking up a bit, so we'll get this plot made and then we may go ahead and

279
00:33:33,280 --> 00:33:34,920
conclude a bit early for today.

280
00:33:34,920 --> 00:33:40,080
So my apologies for sort of this impromptu meetup.

281
00:33:40,080 --> 00:33:46,920
So next week we'll be back to the regular crunching numbers back in Oklahoma actually.

282
00:33:46,920 --> 00:33:49,160
So that will be fun.

283
00:33:49,160 --> 00:33:55,800
So thanks for bearing with me for this sort of impromptu meetup and hopefully we can get

284
00:33:55,800 --> 00:33:59,160
back on track for next week.

285
00:33:59,160 --> 00:34:09,960
So just to go ahead and see this through.

286
00:34:09,960 --> 00:34:21,960
We can finish this really quick without too much trouble.

287
00:34:21,960 --> 00:34:28,280
And the nice thing about coding, reuse, reuse, reuse.

288
00:34:28,280 --> 00:34:48,680
So bear with me as we get this sorted up.

289
00:34:48,680 --> 00:35:04,720
And the third time's the charm.

290
00:35:04,720 --> 00:35:20,240
This maybe will have excluded the outliers sufficiently that we can get a decent plot.

291
00:35:20,240 --> 00:35:37,800
By the way, that's a T. What type of T?

292
00:35:37,800 --> 00:35:40,200
Well this is bizarre.

293
00:35:40,200 --> 00:35:44,320
So this is our plot.

294
00:35:44,320 --> 00:35:54,680
And so apparently our categories may not have quite classified the cannabis as we may have

295
00:35:54,680 --> 00:35:56,840
desired.

296
00:35:56,840 --> 00:36:04,960
Because we were thinking, okay, this may be quadrant one, two, three, and four here.

297
00:36:04,960 --> 00:36:10,440
And here what we're seeing is basically we're defining everything.

298
00:36:10,440 --> 00:36:15,480
And we'll actually want to toss some legends on here.

299
00:36:15,480 --> 00:36:35,960
So we'll need to see if we can do a legend on this real quick.

300
00:36:35,960 --> 00:36:43,120
I think hopefully we can just toss a label on this

301
00:36:43,120 --> 00:36:52,160
data here.

302
00:36:52,160 --> 00:36:53,920
I'm not certain.

303
00:36:53,920 --> 00:36:58,160
So let's just try this real quick.

304
00:36:58,160 --> 00:37:05,920
It would be nice to know which one is type two and which is type three.

305
00:37:05,920 --> 00:37:14,040
Which one's the green exactly?

306
00:37:14,040 --> 00:37:21,480
So bear with me while we get this legend.

307
00:37:21,480 --> 00:37:44,520
Okay, if we toss this legend in here.

308
00:37:44,520 --> 00:37:48,400
Let's see if.

309
00:37:48,400 --> 00:37:53,160
Okay this didn't quite do the trick.

310
00:37:53,160 --> 00:37:59,200
It doesn't work.

311
00:37:59,200 --> 00:38:07,840
It's not the end of the world.

312
00:38:07,840 --> 00:38:09,680
We just kind of.

313
00:38:09,680 --> 00:38:17,700
Okay so there we are.

314
00:38:17,700 --> 00:38:24,600
So as I was thinking, with type one, you know, what we're

315
00:38:24,600 --> 00:38:27,840
defining is 90% of all cannabis.

316
00:38:27,840 --> 00:38:33,600
Interestingly, so the type two almost non-existent.

317
00:38:33,600 --> 00:38:41,000
At least with our definition, we may need to broaden our cutoff here.

318
00:38:41,000 --> 00:38:43,120
And we'll do that here in one second.

319
00:38:43,120 --> 00:38:51,520
And then we're basically saying okay the type three is you know this bottom segment.

320
00:38:51,520 --> 00:38:52,920
So quite interesting.

321
00:38:52,920 --> 00:38:55,040
Let's add in type four.

322
00:38:55,040 --> 00:39:04,160
See if the visualization stays decent.

323
00:39:04,160 --> 00:39:13,720
Heavy outliers one hopes.

324
00:39:13,720 --> 00:39:19,560
Okay it looks like type four has outliers as well.

325
00:39:19,560 --> 00:39:28,080
So just for completeness' sake, and I'm curious to see type four plotted.

326
00:39:28,080 --> 00:39:37,240
So why don't we just go ahead and basically we'll just restrict the type four data removing

327
00:39:37,240 --> 00:39:48,480
the outliers the same way we did with type one two and three.

328
00:39:48,480 --> 00:40:04,960
Okay and so now we're treating all the data the same which is.

329
00:40:04,960 --> 00:40:27,800
Interesting.

330
00:40:27,800 --> 00:40:41,320
So honestly, what I make of this is it looks like these high CBG strains

331
00:40:41,320 --> 00:40:50,440
also appear to have high CBD.

332
00:40:50,440 --> 00:40:59,520
Let's just make sure I did this correctly.

333
00:40:59,520 --> 00:41:05,600
Everything yeah everything looks about right.

334
00:41:05,600 --> 00:41:07,800
So that's just quite odd.

335
00:41:07,800 --> 00:41:13,480
And so that's why maybe the type four don't fit so nicely into this box here is because

336
00:41:13,480 --> 00:41:18,280
they are in fact you know a whole nother type of their own.

337
00:41:18,280 --> 00:41:23,760
So that's maybe maybe this plot.

338
00:41:23,760 --> 00:41:28,840
It may make the most sense just type one two and three.

339
00:41:28,840 --> 00:41:35,840
So yes yes so that's an interesting interesting observation.

340
00:41:35,840 --> 00:41:40,680
And just to round this out finish it up.

341
00:41:40,680 --> 00:41:50,520
Let's just go ahead and remember our limits here were sort of ad hoc.

342
00:41:50,520 --> 00:41:59,760
So why don't we just sort of define those and then we can you know see if we can't.

343
00:41:59,760 --> 00:42:05,940
Parameter hunt, essentially, to see if we can, you know, define these into quadrants.

344
00:42:05,940 --> 00:42:08,360
This isn't really how I'd recommend doing things.

345
00:42:08,360 --> 00:42:11,360
It may be nice to have a bit more of a theoretical approach.

346
00:42:11,360 --> 00:42:17,840
But for now let's just let's just crunch some numbers with the time we have here.

347
00:42:17,840 --> 00:42:22,640
So let's just define these limits here.

348
00:42:22,640 --> 00:42:28,360
Maybe the high THC limit.

349
00:42:28,360 --> 00:42:40,040
Maybe we'll lower that to 0.5 and then the high CBD limit.

350
00:42:40,040 --> 00:42:44,680
Maybe we'll raise that to 1.5.

351
00:42:44,680 --> 00:42:55,520
So now we can substitute these into our analysis to make it dynamic.
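A sketch of making the limits dynamic, so the same analysis can be rerun under different cutoffs. The narrow (0.9, 1.1) and medium (0.5, 1.5) limits come from the session; the wide limits, the sample values, and the `type_shares` helper are hypothetical. The sketch assumes every sample has nonzero THC.

```python
from collections import Counter

def type_shares(samples, low, high):
    """Return each type's share of samples for the given ratio limits."""
    counts = Counter()
    for cbd, thc in samples:
        ratio = cbd / thc  # assumes thc > 0 for every sample
        if ratio <= low:
            counts["type_1"] += 1
        elif ratio < high:
            counts["type_2"] += 1
        else:
            counts["type_3"] += 1
    return {label: count / len(samples) for label, count in counts.items()}

# Hypothetical (cbd, thc) pairs in percent by weight.
samples = [(0.1, 20.0), (4.0, 10.0), (7.0, 10.0), (8.0, 8.0), (9.0, 7.0), (12.0, 1.0)]

# (low, high) limits per scenario; the "wide" values are illustrative only.
scenarios = {"narrow": (0.9, 1.1), "medium": (0.5, 1.5), "wide": (0.25, 4.0)}

results = {
    name: type_shares(samples, low, high)
    for name, (low, high) in scenarios.items()
}
```

Comparing `results` across scenarios shows how sensitive the type 2 share is to the choice of limits, which is the point of the parameter hunt.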

352
00:42:55,520 --> 00:43:10,120
So I can find type two and then we'll identify type three.

353
00:43:10,120 --> 00:43:12,840
All right.

354
00:43:12,840 --> 00:43:20,460
Let's basically just rerun this chunk of code and just redo our analysis.

355
00:43:20,460 --> 00:43:22,640
This may take a hot minute.

356
00:43:22,640 --> 00:43:30,960
Maybe 10 seconds and then.

357
00:43:30,960 --> 00:43:40,480
Well, we have a bit more distinction here, but it's still this quite similar

358
00:43:40,480 --> 00:43:51,800
look here where you've got real small segment of the market as type two near unitary.

359
00:43:51,800 --> 00:43:59,240
Then you have a small amount of the market type three.

360
00:43:59,240 --> 00:44:05,800
The majority of the market, type one.

361
00:44:05,800 --> 00:44:14,680
So let's just lower these limits real low.

362
00:44:14,680 --> 00:44:30,240
And see what this may look like.

363
00:44:30,240 --> 00:44:38,200
Basically did three scenarios: narrow bounds, medium bounds, wide bounds. Oh, the wide bounds

364
00:44:38,200 --> 00:44:40,940
is quite interesting here.

365
00:44:40,940 --> 00:44:48,160
So if we do wide bounds, then we actually get a bit better distinction.

366
00:44:48,160 --> 00:44:53,600
OK what's type two what's type three and what's type one.

367
00:44:53,600 --> 00:45:08,000
And so I think this is an interesting plot here because it does begin to show how

368
00:45:08,000 --> 00:45:14,920
flower appears to have at least two different types.

369
00:45:14,920 --> 00:45:22,600
It looks like type one and type, I mean, type two and type three may be similar.

370
00:45:22,600 --> 00:45:26,320
They're kind of clustering together.

371
00:45:26,320 --> 00:45:27,320
They may be distinct.

372
00:45:27,320 --> 00:45:33,480
However, they're definitely different than type one.

373
00:45:33,480 --> 00:45:37,760
So quite an interesting observation.

374
00:45:37,760 --> 00:45:41,320
And then what's going on with type four?

375
00:45:41,320 --> 00:45:47,400
Is this an outlier issue, or are they a special group of their own?

376
00:45:47,400 --> 00:45:51,600
And so the type four remember we defined as.

377
00:45:51,600 --> 00:46:04,480
And so here let's look at just type four in isolation.

378
00:46:04,480 --> 00:46:10,740
So type one, two, and three, and then type four

379
00:46:10,740 --> 00:46:21,160
is just really peculiar, where you have some with just staggeringly high levels of CBD.

380
00:46:21,160 --> 00:46:27,240
So just so you know like 16 percent CBD is staggeringly high.

381
00:46:27,240 --> 00:46:33,320
I think that's perfectly possible.

382
00:46:33,320 --> 00:46:34,320
It is.

383
00:46:34,320 --> 00:46:35,320
It is.

384
00:46:35,320 --> 00:46:36,320
Yes.

385
00:46:36,320 --> 00:46:40,960
I've definitely seen CBD strains testing that high.

386
00:46:40,960 --> 00:46:43,480
It's rare.

387
00:46:43,480 --> 00:46:55,000
Right. This is, you know, this rare group of strains of cannabis that

388
00:46:55,000 --> 00:47:03,680
produce CBG, which is uncommon, and some of them produce staggeringly high levels of CBD.

389
00:47:03,680 --> 00:47:10,760
And so this may be something to keep an eye out for in the future.

390
00:47:10,760 --> 00:47:16,120
And then type five no major cannabinoids.

391
00:47:16,120 --> 00:47:20,400
We may be able to say that.

392
00:47:20,400 --> 00:47:22,520
Maybe this is type five.

393
00:47:22,520 --> 00:47:23,520
Right.

394
00:47:23,520 --> 00:47:24,520
Hold on.

395
00:47:24,520 --> 00:47:27,000
Let me replot this.

396
00:47:27,000 --> 00:47:29,800
Something's going.

397
00:47:29,800 --> 00:47:36,800
So my.

398
00:47:36,800 --> 00:47:40,680
My console is not behaving super well.

399
00:47:40,680 --> 00:47:44,240
So let's just plot this one more time.

400
00:47:44,240 --> 00:47:55,040
Just so I can point at it while I'm talking about it.

401
00:47:55,040 --> 00:48:03,680
So basically what I was saying was this may actually be type five, where, you know, the delta...

402
00:48:03,680 --> 00:48:15,880
And also, now I'm starting to wonder: did we exclude the concentrates successfully

403
00:48:15,880 --> 00:48:21,280
here? Because cannabis shouldn't have that high a level of THC.

404
00:48:21,280 --> 00:48:37,820
So we may not have successfully isolated the flower data.

405
00:48:37,820 --> 00:48:47,680
So I want to revisit that, because the delta-9 really should not be above, you

406
00:48:47,680 --> 00:48:50,200
know, 30 to 35 percent.
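
One way to revisit this check is to split the records on a THC ceiling and look at what falls above it. The sketch below is only an assumption-laden illustration: the function name `split_by_thc_ceiling`, the record keys `sample_id` and `delta_9_thc`, and the lab values are hypothetical, and the 35 percent cutoff is simply the rough upper limit mentioned above.

```python
def split_by_thc_ceiling(records, max_flower_thc=35.0):
    """Separate records into plausible flower and suspected concentrates.

    `records` is a list of dicts with hypothetical keys "sample_id" and
    "delta_9_thc" (percent). The 35 percent ceiling is the rough upper
    limit mentioned in the session, used here as an assumed cutoff.
    """
    flower, suspect = [], []
    for record in records:
        if record["delta_9_thc"] > max_flower_thc:
            suspect.append(record)
        else:
            flower.append(record)
    return flower, suspect


# Made-up lab results for illustration only.
records = [
    {"sample_id": "A", "delta_9_thc": 21.4},
    {"sample_id": "B", "delta_9_thc": 78.9},
    {"sample_id": "C", "delta_9_thc": 33.0},
    {"sample_id": "D", "delta_9_thc": 64.2},
]
flower, suspect = split_by_thc_ceiling(records)
print("flower:", [r["sample_id"] for r in flower])
print("suspected concentrates:", [r["sample_id"] for r in suspect])
```

Reviewing the suspected records by hand, rather than dropping them outright, would help show whether they are mislabeled concentrates or something else.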

407
00:48:50,200 --> 00:48:55,880
So what I'm thinking here is we've got type one, type two, type three,

408
00:48:55,880 --> 00:49:00,120
and then I'm thinking these must be concentrates.

409
00:49:00,120 --> 00:49:04,280
So how they got mixed in,

410
00:49:04,280 --> 00:49:08,480
I'm not 100 percent certain, but I want to

411
00:49:08,480 --> 00:49:10,560
re-look at this analysis here.

412
00:49:10,560 --> 00:49:18,280
So now after casting complete uncertainty over the entire analysis here I'll go ahead and

413
00:49:18,280 --> 00:49:21,320
be wrapping it up.

414
00:49:21,320 --> 00:49:29,920
So long story short there's something going on here with these different types of cannabis

415
00:49:29,920 --> 00:49:36,600
but I think it clearly requires further analysis at least you know here in the Washington state

416
00:49:36,600 --> 00:49:37,600
data.

417
00:49:37,600 --> 00:49:39,760
So there's plenty to be done here.

418
00:49:39,760 --> 00:49:49,440
So this was just a crude, you know, proof of concept that you can take this data, you can

419
00:49:49,440 --> 00:49:56,520
calculate these statistics, and you can create some quite interesting visualizations.

420
00:49:56,520 --> 00:50:04,720
So that's, you know, what's been going on here at the cannabis conference.

421
00:50:04,720 --> 00:50:10,760
You know, I'll try to share my notes with you and just share with you what's

422
00:50:10,760 --> 00:50:11,760
going on.

423
00:50:11,760 --> 00:50:20,560
But you know a lot of it's just it's just kind of time to look at the data especially

424
00:50:20,560 --> 00:50:22,240
if you're on the cultivation side right.

425
00:50:22,240 --> 00:50:30,000
So data and planning go hand in hand.

426
00:50:30,000 --> 00:50:39,280
All righty then. Well, Heather, thank you for coming back today for an impromptu meetup

427
00:50:39,280 --> 00:50:41,480
of the Cannabis Data Science group.

428
00:50:41,480 --> 00:50:48,520
So thank you for coming back, and we'll try to get the rest of the crowd back

429
00:50:48,520 --> 00:50:49,940
for next week.

430
00:50:49,940 --> 00:50:55,280
So it was just it was a snafu yesterday.

431
00:50:55,280 --> 00:51:02,480
I meant to be here, but, you know, you live and you learn, and you can only, you know,

432
00:51:02,480 --> 00:51:09,720
approach the... And this actually may be a good example of sort of the lessons learned in

433
00:51:09,720 --> 00:51:11,040
the cannabis industry, right?

434
00:51:11,040 --> 00:51:17,960
So nothing ever goes to plan; your contingency plan needs a contingency plan.

435
00:51:17,960 --> 00:51:23,200
And that's sort of the lesson that people are learning in the cannabis industry: no matter

436
00:51:23,200 --> 00:51:27,000
how much you plan, nothing ever goes to plan.

437
00:51:27,000 --> 00:51:32,480
You're going to have to think, you're going to have to move quickly on your feet, and there

438
00:51:32,480 --> 00:51:38,080
could be some disasters; there could be some fires that need to be put out.

439
00:51:38,080 --> 00:51:49,640
And so I think that's just what to keep in mind: move quick, get going.

440
00:51:49,640 --> 00:51:55,360
Now is better than never, and take your adversities as they come.

441
00:51:55,360 --> 00:51:58,080
And you'll be successful.

442
00:51:58,080 --> 00:52:04,560
So those I think are the major takeaways and you know learn from smart people.

443
00:52:04,560 --> 00:52:10,800
So that's what I do: try to find people who are smarter than yourself and immerse

444
00:52:10,800 --> 00:52:11,800
yourself with them.

445
00:52:11,800 --> 00:52:12,800
So that's why I'm here.

446
00:52:12,800 --> 00:52:16,640
There's a bunch of people here that are much more knowledgeable in the cannabis industry

447
00:52:16,640 --> 00:52:17,860
than myself.

448
00:52:17,860 --> 00:52:24,940
So just here to learn and then try to convey some of that knowledge the best I can to everyone

449
00:52:24,940 --> 00:52:28,680
else in the group.

450
00:52:28,680 --> 00:52:35,920
So I think I'll go ahead and wrap it up a little early today. I'm taking up a seat here at

451
00:52:35,920 --> 00:52:41,320
the cafe, and I'm sure the waiter's looking to get, yeah, someone else here.

452
00:52:41,320 --> 00:52:43,400
So I'll quit taking up space.

453
00:52:43,400 --> 00:52:50,360
So thank you for attending, and next week we'll be back to normal, back in

454
00:52:50,360 --> 00:52:58,600
Tulsa, Oklahoma, crunching numbers, and we'll be back next week to doing

455
00:52:58,600 --> 00:53:06,800
our Cannabis Data Science meetup group estimate of the size of the entire industry.

456
00:53:06,800 --> 00:53:15,520
So I'm not saying we'll conclude next week, but we'll be back at it.

457
00:53:15,520 --> 00:53:17,480
Thanks. Awesome, Heather.

458
00:53:17,480 --> 00:53:19,560
All right.

459
00:53:19,560 --> 00:53:24,440
Thanks for bearing with us and thank you for attending and keep your nose to the grindstone.

460
00:53:24,440 --> 00:53:25,440
Thank you.

461
00:53:25,440 --> 00:53:26,440
Bye.

462
00:53:26,440 --> 00:53:36,600
Bye now.

