1
00:00:00,000 --> 00:00:07,660
Welcome to Cannabis Data Science.

2
00:00:07,660 --> 00:00:10,560
Thanks for holding tight.

3
00:00:10,560 --> 00:00:12,660
Promise it's going to be worth your while today.

4
00:00:12,660 --> 00:00:17,680
So I have yet another new idea to share with you today.

5
00:00:17,680 --> 00:00:21,880
And I also want to share a lot of ideas about the group and kind of get everyone on the

6
00:00:21,880 --> 00:00:22,880
same page.

7
00:00:22,880 --> 00:00:25,720
Kind of talk about what we're doing here.

8
00:00:25,720 --> 00:00:29,820
Like what's the point of cannabis data at the end of the day?

9
00:00:29,820 --> 00:00:32,040
So I'll get into that momentarily.

10
00:00:32,040 --> 00:00:38,920
But before we formally kick off, for those of you who are new, my name is Keegan.

11
00:00:38,920 --> 00:00:40,240
I started Cannlytics.

12
00:00:40,240 --> 00:00:45,880
So anyone can help out with statistics in the cannabis space.

13
00:00:45,880 --> 00:00:50,420
I think the cannabis space is a place for everyone.

14
00:00:50,420 --> 00:00:59,260
So grab your skills, grab any know-how you know, and come and join in on the fun.

15
00:00:59,260 --> 00:01:05,040
So the idea is it's not an exclusive club.

16
00:01:05,040 --> 00:01:10,560
Some industries are really hard to break into.

17
00:01:10,560 --> 00:01:20,000
And I think it would be real cool to see the cannabis industry be a place that's open to

18
00:01:20,000 --> 00:01:21,000
everybody.

19
00:01:21,000 --> 00:01:24,560
And so Cannlytics is leading by doing.

20
00:01:24,560 --> 00:01:28,500
And so Cannlytics is open to everyone.

21
00:01:28,500 --> 00:01:29,560
Any skill set.

22
00:01:29,560 --> 00:01:39,200
And so the idea is, I mean, myself, I'm far from the greatest of whatever you

23
00:01:39,200 --> 00:01:48,880
may want to call one's self, whether it's a statistician, economist, data scientist.

24
00:01:48,880 --> 00:01:54,720
I may use those words, but people may disagree, you know, if I am any of those things.

25
00:01:54,720 --> 00:02:00,460
But the idea is I think that we can put all of our minds together with these awesome tools

26
00:02:00,460 --> 00:02:01,800
that we have here, right?

27
00:02:01,800 --> 00:02:04,080
The internet, computers.

28
00:02:04,080 --> 00:02:07,920
And I think we can make a positive impact to help a lot of people.

29
00:02:07,920 --> 00:02:15,040
So the idea is, you know, why should we just, you know, stand by and let all these golden

30
00:02:15,040 --> 00:02:17,240
opportunities get snatched up?

31
00:02:17,240 --> 00:02:22,280
So this is a place that I think should be open.

32
00:02:22,280 --> 00:02:27,120
And you know, anybody is welcome to come here, share your ideas, share your research.

33
00:02:27,120 --> 00:02:34,440
And the idea is by bouncing off our ideas together in an open manner where anyone can

34
00:02:34,440 --> 00:02:43,120
participate, value's open to everybody, that we here can generate more value than you can

35
00:02:43,120 --> 00:02:47,920
with a treasure chest of six hundred million dollars.

36
00:02:47,920 --> 00:02:51,400
And I think we've already demonstrated that.

37
00:02:51,400 --> 00:02:54,560
And I would love to continue demonstrating that.

38
00:02:54,560 --> 00:02:59,320
And so I'm going to try to drive that point home here today.

39
00:02:59,320 --> 00:03:02,120
But that's what the group's about.

40
00:03:02,120 --> 00:03:09,400
And so without stealing the show, we'd like to spend today as a pretty open discussion

41
00:03:09,400 --> 00:03:11,480
so everyone has a chance to speak.

42
00:03:11,480 --> 00:03:13,640
So you don't have to.

43
00:03:13,640 --> 00:03:16,880
You're welcome to be a fly on the wall and listen.

44
00:03:16,880 --> 00:03:20,240
But if you want to speak up, speak up as much as you wish.

45
00:03:20,240 --> 00:03:26,680
So in no particular order, just starting in my top corner, Jacob, welcome to the group.

46
00:03:26,680 --> 00:03:32,120
Would you be interested in introducing yourself and what you hope to get out of the group

47
00:03:32,120 --> 00:03:43,680
or do in the cannabis space?

48
00:03:43,680 --> 00:03:46,260
And no, no pressure, as I said.

49
00:03:46,260 --> 00:03:50,880
Or if you're taking a second to toggle your dials, I know it took me a second too, then

50
00:03:50,880 --> 00:03:54,440
just chime in when you get those set.

51
00:03:54,440 --> 00:03:56,200
How about you, Candice?

52
00:03:56,200 --> 00:03:57,520
Anything on your mind?

53
00:03:57,520 --> 00:04:01,160
Anything that you're hoping to achieve this week?

54
00:04:01,160 --> 00:04:02,160
Thank you again.

55
00:04:02,160 --> 00:04:03,160
Let's see.

56
00:04:03,160 --> 00:04:10,800
I'm still working on the ERD, the markdown, and I did get the pandas profiling reports for

57
00:04:10,800 --> 00:04:17,240
all the Washington State 2021 data set, and I'm just going to stuff it into a little PPT

58
00:04:17,240 --> 00:04:22,320
and we can start, you know, just kind of keep chunking away on that Washington State data

59
00:04:22,320 --> 00:04:25,160
set and who knows, you know, might be pretty cool.

60
00:04:25,160 --> 00:04:27,920
And then there's some other things.

61
00:04:27,920 --> 00:04:33,600
And then once we do some data analysis on it or get the data better, you know, we can

62
00:04:33,600 --> 00:04:40,000
start coming up with a bit of a plan and then I can, you know, put it up for everybody.

63
00:04:40,000 --> 00:04:42,280
And this is phenomenal.

64
00:04:42,280 --> 00:04:48,040
So a big complaint is that people trying to learn data science, they don't have good data

65
00:04:48,040 --> 00:04:52,440
sets to work on or they're all working on the same data sets, right?

66
00:04:52,440 --> 00:04:57,560
You all worked on the same Twitter posts and it's like, you know, how much can we scrape

67
00:04:57,560 --> 00:04:59,760
these Twitter posts?

68
00:04:59,760 --> 00:05:05,440
And I think that's one of the fun things about the Cannabis Data Science Group is that we

69
00:05:05,440 --> 00:05:08,000
have novel data sets.

70
00:05:08,000 --> 00:05:14,520
And we have to scrape them out of PDFs or APIs or what have you.

71
00:05:14,520 --> 00:05:15,520
But they're fun.

72
00:05:15,520 --> 00:05:16,520
It's real data.

73
00:05:16,520 --> 00:05:18,200
So we get real challenges.

74
00:05:18,200 --> 00:05:24,200
The Washington State data is one of the best examples of messy, messy data that a good

75
00:05:24,200 --> 00:05:30,000
data scientist needs, is needed to clean up and analyze.

76
00:05:30,000 --> 00:05:39,960
So your work on that is going to go light years because once you get that data set approachable,

77
00:05:39,960 --> 00:05:48,520
then people all across the world can use this, even if it's just as a test, right?

78
00:05:48,520 --> 00:05:54,360
So instead of testing your machine learning models on Twitter posts, you can test your

79
00:05:54,360 --> 00:06:00,840
machine learning models on Washington State traceability data, which is kind of cool in

80
00:06:00,840 --> 00:06:01,880
my opinion.

81
00:06:01,880 --> 00:06:04,540
So fantastic work, Candice.

82
00:06:04,540 --> 00:06:06,720
This will pay off here in the coming weeks.

83
00:06:06,720 --> 00:06:07,720
So thank you.

84
00:06:07,720 --> 00:06:12,360
Oh, and yeah, Ryan had put in a note and he said Canalytics.

85
00:06:12,360 --> 00:06:17,480
This one is C-A-N-N-L-Y-T-I-C-S, Ryan.

86
00:06:17,480 --> 00:06:20,080
But you can take it.

87
00:06:20,080 --> 00:06:21,080
Sorry, Keegan.

88
00:06:21,080 --> 00:06:22,080
Ah, so Cannlytics.

89
00:06:22,080 --> 00:06:25,080
Yeah, Cannlytics.ai.

90
00:06:25,080 --> 00:06:28,840
So thank you.

91
00:06:28,840 --> 00:06:31,120
Oh, you're welcome.

92
00:06:31,120 --> 00:06:33,720
Thank you so much, Keegan, for what you're doing.

93
00:06:33,720 --> 00:06:37,720
And everybody, everybody that chips in, Jerry, Joe, and everybody.

94
00:06:37,720 --> 00:06:41,320
It's everyone, I mean, awesome.

95
00:06:41,320 --> 00:06:42,320
Thank you.

96
00:06:42,320 --> 00:06:52,040
Candice, can I ask you a question about the Washington data set you're working with?

97
00:06:52,040 --> 00:06:54,480
Sure, John, if that's okay, Candice.

98
00:06:54,480 --> 00:06:56,720
Oh, yeah, sorry.

99
00:06:56,720 --> 00:06:59,440
Playing with, I guess, toggling the switches.

100
00:06:59,440 --> 00:07:00,440
Yeah.

101
00:07:00,440 --> 00:07:07,760
No, quick question is that our own organization has an interest in ingested products at this

102
00:07:07,760 --> 00:07:14,880
point and kind of typing the classes.

103
00:07:14,880 --> 00:07:22,600
We've done an initial pass with just survey data that we've collected from users in California

104
00:07:22,600 --> 00:07:28,160
that are patients of my clinical colleague, Jean Talleyrand.

105
00:07:28,160 --> 00:07:39,200
Does the Washington data set, does it have fields that speak to ingested data and ingested

106
00:07:39,200 --> 00:07:45,880
products and potentially we could look at that to try and figure out the type and frequency?

107
00:07:45,880 --> 00:07:53,280
Well, they're not doing like per customer dosing per se.

108
00:07:53,280 --> 00:07:59,800
No, no, but do they talk about gummies versus chocolates versus mints versus beverages?

109
00:07:59,800 --> 00:08:07,680
Yes, and Keegan has used that data set in past YouTube videos showing that too, I

110
00:08:07,680 --> 00:08:13,640
think, the way with Washington State, right, we've done Washington State, Oklahoma, all

111
00:08:13,640 --> 00:08:15,440
kinds of data sets.

112
00:08:15,440 --> 00:08:18,000
But yes, the answer is yes.

113
00:08:18,000 --> 00:08:20,440
There's all different types of flower.

114
00:08:20,440 --> 00:08:21,440
There's edibles.

115
00:08:21,440 --> 00:08:23,800
There's, I don't think they actually break it down.

116
00:08:23,800 --> 00:08:25,160
Keegan, correct me if I'm wrong.

117
00:08:25,160 --> 00:08:32,040
I don't think they break it down into like whether they're chocolates, gummies, but they

118
00:08:32,040 --> 00:08:37,640
do kind of, yeah, they do indicate that.

119
00:08:37,640 --> 00:08:44,680
Well, many thoughts because you two both raised many good ideas here and points.

120
00:08:44,680 --> 00:08:46,240
So where to begin?

121
00:08:46,240 --> 00:08:51,600
I would begin by saying once you find one gold nugget in a data set, then there could

122
00:08:51,600 --> 00:08:53,100
easily be more.

123
00:08:53,100 --> 00:08:58,800
So as you pointed out, there's probably more gold in the Washington

124
00:08:58,800 --> 00:09:00,640
State data.

125
00:09:00,640 --> 00:09:08,080
Sure enough, John, I think we can approach your question because we can look at solid

126
00:09:08,080 --> 00:09:10,140
and liquid edibles.

127
00:09:10,140 --> 00:09:16,500
If we wanted to get fancy, we could use natural language processing and try to figure out

128
00:09:16,500 --> 00:09:20,520
which ones are chocolates, which ones are gummies, so on and so forth.

129
00:09:20,520 --> 00:09:23,080
But that would be a cherry on top.

130
00:09:23,080 --> 00:09:26,360
And then as you said, so basically, John's after the types.

131
00:09:26,360 --> 00:09:30,200
So this would be THC to CBD ratio.

132
00:09:30,200 --> 00:09:32,760
And sure enough, we've got that data.

133
00:09:32,760 --> 00:09:38,500
So with the 2022 data, we still have to make that link.

134
00:09:38,500 --> 00:09:42,200
So we're still missing some IDs that we still need to request.

135
00:09:42,200 --> 00:09:51,280
But up to the end of 2021, we can match lab results to edibles that were sold.

136
00:09:51,280 --> 00:10:00,640
So we can see the distribution of THC to CBD in edibles that were sold in Washington State

137
00:10:00,640 --> 00:10:03,440
of the full population.

138
00:10:03,440 --> 00:10:06,220
Excellent.

139
00:10:06,220 --> 00:10:10,560
So easier said than done, wrangling this data.

140
00:10:10,560 --> 00:10:15,200
So Candice, now you actually have a demanded statistic.

141
00:10:15,200 --> 00:10:24,080
So we now want to know the THC to CBD ratio, which is a statistic for every single edible

142
00:10:24,080 --> 00:10:25,840
that was sold.

143
00:10:25,840 --> 00:10:29,280
And then we can subsequently create more statistics.

144
00:10:29,280 --> 00:10:41,920
So you can get fancy from there, but maybe there's a pattern in rural versus urban or

145
00:10:41,920 --> 00:10:42,920
who knows.

146
00:10:42,920 --> 00:10:48,440
So that's where you can start getting fun thinking of research questions.

147
00:10:48,440 --> 00:10:51,440
But first things first, we'll get the first statistic.

148
00:10:51,440 --> 00:10:53,880
So Candice, you want to start on that?

149
00:10:53,880 --> 00:10:54,880
You're welcome to.

150
00:10:54,880 --> 00:10:58,440
What do you mean on the first statistic that we have?

151
00:10:58,440 --> 00:11:03,600
You mean the THC-to-CBD ratio for all Washington State data 2021 edibles?

152
00:11:03,600 --> 00:11:06,400
Is that what you said?

153
00:11:06,400 --> 00:11:08,920
Essentially, yeah, we'll have to jump through a couple hoops.

154
00:11:08,920 --> 00:11:10,720
We'll have to get all the edibles.

155
00:11:10,720 --> 00:11:11,720
Yeah.

156
00:11:11,720 --> 00:11:14,800
Oh yeah, yeah, we'll get it done.

157
00:11:14,800 --> 00:11:23,280
And then essentially just take the maybe total THC divided by total CBD.

158
00:11:23,280 --> 00:11:26,400
Sounds good.
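The statistic described here (match each edible sale to its lab result, then divide total THC by total CBD) could be sketched roughly as below. This is only an illustration under stated assumptions: the records are made up, and the field names (`lab_result_id`, `total_thc`, `total_cbd`, `product_type`) and the edible type labels are hypothetical stand-ins, not the actual Washington State traceability schema.

```python
# Made-up example records; the real traceability tables use their own field names.
lab_results = [
    {"lab_result_id": "LR1", "total_thc": 10.0, "total_cbd": 10.0},
    {"lab_result_id": "LR2", "total_thc": 100.0, "total_cbd": 5.0},
    {"lab_result_id": "LR3", "total_thc": 50.0, "total_cbd": 0.0},
]
sales = [
    {"sale_id": "S1", "product_type": "solid_edible", "lab_result_id": "LR1"},
    {"sale_id": "S2", "product_type": "liquid_edible", "lab_result_id": "LR2"},
    {"sale_id": "S3", "product_type": "solid_edible", "lab_result_id": "LR3"},
    {"sale_id": "S4", "product_type": "flower", "lab_result_id": "LR1"},
    {"sale_id": "S5", "product_type": "solid_edible", "lab_result_id": "LR9"},
]

# Assumed labels for the solid and liquid edible categories.
EDIBLE_TYPES = {"solid_edible", "liquid_edible"}


def thc_cbd_ratios(sales, lab_results):
    """Return {sale_id: THC-to-CBD ratio} for edible sales with a matching lab result."""
    results_by_id = {r["lab_result_id"]: r for r in lab_results}
    ratios = {}
    for sale in sales:
        if sale["product_type"] not in EDIBLE_TYPES:
            continue  # only edibles are of interest here
        result = results_by_id.get(sale["lab_result_id"])
        if result is None:
            continue  # no lab result could be linked to this sale
        thc, cbd = result["total_thc"], result["total_cbd"]
        # The ratio is undefined when no CBD was measured, so record None.
        ratios[sale["sale_id"]] = thc / cbd if cbd > 0 else None
    return ratios


ratios = thc_cbd_ratios(sales, lab_results)
print(ratios)
```

Sales with no linked lab result are skipped rather than guessed at, which mirrors the missing-ID problem mentioned for the 2022 data.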

159
00:11:26,400 --> 00:11:37,120
If I might just make a point here, if the data contains non-THC, non-CBD cannabinoid

160
00:11:37,120 --> 00:11:44,720
content fields, that would be exactly the goal that we're trying to approach.

161
00:11:44,720 --> 00:11:51,120
So maybe when this moves to a point where you're willing to present this, maybe that

162
00:11:51,120 --> 00:11:57,400
would be a good opportunity to see what can be mined from it.

163
00:11:57,400 --> 00:12:01,560
So what you're talking about are like other ones like CBN, CBG.

164
00:12:01,560 --> 00:12:05,040
Yeah, CBG, CBN, THCV.

165
00:12:05,040 --> 00:12:09,520
If those in fact are fields in the Washington State data.

166
00:12:09,520 --> 00:12:15,120
Right, or like the fields that are in SkunkFX.

167
00:12:15,120 --> 00:12:17,640
I'm sorry, say that last one again, Candice.

168
00:12:17,640 --> 00:12:19,440
Like the fields in SkunkFX.

169
00:12:19,440 --> 00:12:21,240
Can you hear me okay?

170
00:12:21,240 --> 00:12:22,240
Yeah, yeah, yeah.

171
00:12:22,240 --> 00:12:23,240
Am I too loud?

172
00:12:23,240 --> 00:12:24,240
No, no, you're perfect.

173
00:12:24,240 --> 00:12:30,960
Okay, I did test.

174
00:12:30,960 --> 00:12:35,640
Just not to rain on your parade, but I think you may have the best luck with the THC and

175
00:12:35,640 --> 00:12:45,800
CBD because I mean it would be fun to see if there are any CBG, etc. edibles, but just

176
00:12:45,800 --> 00:12:51,800
from my experience seeing a bunch of edible lab results, I just don't...

177
00:12:51,800 --> 00:13:02,640
I mean the predominant ones you see of course are CBD, CBDA, THC, THCA.

178
00:13:02,640 --> 00:13:04,840
Wait a minute, repeat that real quick.

179
00:13:04,840 --> 00:13:09,560
So okay, CBDA, CBN, CBG.

180
00:13:09,560 --> 00:13:16,000
So basically you should expect to see CBD, CBDA.

181
00:13:16,000 --> 00:13:20,360
CBDA, okay, got that.

182
00:13:20,360 --> 00:13:24,000
THC, and then THCA.

183
00:13:24,000 --> 00:13:26,880
THCA, okay.

184
00:13:26,880 --> 00:13:29,720
So you'd expect to see...

185
00:13:29,720 --> 00:13:31,220
And then THCV.

186
00:13:31,220 --> 00:13:36,600
You would expect to see mostly THC and CBD.

187
00:13:36,600 --> 00:13:41,280
It would be curious to see if there's any acidic left.

188
00:13:41,280 --> 00:13:52,080
So CBDA and THCA, I would expect to be zero because the idea is when you process it into

189
00:13:52,080 --> 00:14:00,280
an edible, you typically have it go through decarboxylation.

190
00:14:00,280 --> 00:14:04,440
So you turn the acidic to their...

191
00:14:04,440 --> 00:14:10,040
John, you probably know the correct word for this.

192
00:14:10,040 --> 00:14:12,360
Decarboxylated or neutral form.

193
00:14:12,360 --> 00:14:14,980
So they're psychoactive at that point.

194
00:14:14,980 --> 00:14:16,720
So you would expect...

195
00:14:16,720 --> 00:14:21,160
But I mean they still may have trace amounts of CBDA and THCA.

196
00:14:21,160 --> 00:14:29,560
But long story short is these are typically ingredients that would be added as isolates

197
00:14:29,560 --> 00:14:35,680
or distillates into a soda or into an edible.

198
00:14:35,680 --> 00:14:49,200
So it would be unlikely for somebody to be adding in CBG, THCV, CBN, et cetera, but they

199
00:14:49,200 --> 00:14:50,200
may.

200
00:14:50,200 --> 00:14:51,200
So for example...

201
00:14:51,200 --> 00:14:53,240
And Keegan, they are.

202
00:14:53,240 --> 00:14:58,600
They're rarer products and that's why it becomes of interest because we're trying to tease

203
00:14:58,600 --> 00:15:00,600
out the story on those.

204
00:15:00,600 --> 00:15:02,480
No, they definitely exist.

205
00:15:02,480 --> 00:15:06,920
We have them and others have them in their product databases that we've been looking

206
00:15:06,920 --> 00:15:07,920
at.

207
00:15:07,920 --> 00:15:12,600
But as you point out, of course, it's just rarer because the market hasn't moved in that

208
00:15:12,600 --> 00:15:14,680
direction yet so much.

209
00:15:14,680 --> 00:15:20,480
But it is a very fruitful, interesting area and there are big players looking at this.

210
00:15:20,480 --> 00:15:23,920
So let's just keep an eye on...

211
00:15:23,920 --> 00:15:26,600
They're called rare cannabinoids for a reason.

212
00:15:26,600 --> 00:15:28,600
And you know what else is rare?

213
00:15:28,600 --> 00:15:29,600
Gold.

214
00:15:29,600 --> 00:15:31,920
And so that's basically what we're doing.

215
00:15:31,920 --> 00:15:34,080
We're panning for gold here.

216
00:15:34,080 --> 00:15:36,440
So we've got the tools.

217
00:15:36,440 --> 00:15:41,160
So basically we can look through this data set and see if there are any edibles that

218
00:15:41,160 --> 00:15:44,240
were sold with CBN.

219
00:15:44,240 --> 00:15:52,760
So I think I've even seen some at the store and basically I guess it's not really a claim,

220
00:15:52,760 --> 00:16:01,920
but the link that's trying to be advertised is that CBN may be correlated with a sleepy

221
00:16:01,920 --> 00:16:03,920
effect.

222
00:16:03,920 --> 00:16:10,480
So I don't know how people are marketing the compounds, but this is something that we can

223
00:16:10,480 --> 00:16:11,480
investigate.

224
00:16:11,480 --> 00:16:13,880
So we've got some cool tools in our belt.

225
00:16:13,880 --> 00:16:19,580
So what I would do is look at natural language processing plus these rare compounds.

226
00:16:19,580 --> 00:16:27,040
So try to find any edibles that have CBN, etc. in them.

227
00:16:27,040 --> 00:16:33,640
Use natural language processing on the sample name and try to figure out are they capsules?

228
00:16:33,640 --> 00:16:38,120
Are they chocolates?

229
00:16:38,120 --> 00:16:39,120
So on and so forth.
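As a rough starting point for the idea above, here is a minimal sketch that keeps only samples with measured CBN and guesses the product form from the sample name. It uses plain keyword matching as a stand-in for fuller natural language processing; the keyword lists, sample names, and the `cbn` field are all hypothetical.

```python
import re

# Hypothetical keyword lists; a real pass would grow these from the actual sample names.
FORM_KEYWORDS = {
    "chocolate": {"chocolate", "chocolates", "brownie", "truffle"},
    "gummy": {"gummy", "gummies", "chew", "chews"},
    "capsule": {"capsule", "capsules", "cap", "caps", "softgel"},
    "beverage": {"soda", "drink", "tea", "lemonade"},
}


def classify_form(sample_name):
    """Guess the product form from whole-word keyword matches in the sample name."""
    tokens = set(re.findall(r"[a-z]+", sample_name.lower()))
    for form, keywords in FORM_KEYWORDS.items():
        if tokens & keywords:
            return form
    return "unknown"


# Made-up samples; `cbn` stands for a measured CBN amount from the lab result.
samples = [
    {"name": "CBN Sleep Gummies 10pk", "cbn": 5.0},
    {"name": "Dark Chocolate Bar 1:1", "cbn": 0.0},
    {"name": "Night Caps 5mg CBN", "cbn": 5.0},
    {"name": "Lemonade 100mg", "cbn": 0.0},
]

# Keep only samples that contain the rare cannabinoid, then label their form.
cbn_forms = [(s["name"], classify_form(s["name"])) for s in samples if s["cbn"] > 0]
print(cbn_forms)
```

Anything that lands in "unknown" is a candidate for a new keyword or for a proper text classifier later on.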

230
00:16:39,120 --> 00:16:47,280
So that's where you can get fun with it to see if there's any insights that you could

231
00:16:47,280 --> 00:16:48,280
pull out.

232
00:16:48,280 --> 00:16:52,600
And then finally I think the business question is what's selling?

233
00:16:52,600 --> 00:17:00,480
So if you can find some CBN edibles that are selling off the shelf in Seattle, then that

234
00:17:00,480 --> 00:17:03,120
would be a valuable find.

235
00:17:03,120 --> 00:17:08,400
And the data is just sitting there and I don't know if anybody's looked at that, at least

236
00:17:08,400 --> 00:17:10,200
not publicly.

237
00:17:10,200 --> 00:17:15,040
And so now we've got a bunch of awesome data scientists here.

238
00:17:15,040 --> 00:17:17,600
The question's been posed.

239
00:17:17,600 --> 00:17:20,680
We've got all the open source tools in our tool belt.

240
00:17:20,680 --> 00:17:24,400
So I think we can answer this question for you.

241
00:17:24,400 --> 00:17:36,160
So a coming question or topic I think for next week is so how do we get compensated

242
00:17:36,160 --> 00:17:37,480
for our work?

243
00:17:37,480 --> 00:17:49,720
And a nifty tool that I'll be sharing with you is essentially...

244
00:17:49,720 --> 00:17:51,760
Okay I'll go ahead and let the cat out of the bag.

245
00:17:51,760 --> 00:18:02,840
So basically what we're looking at here is basically the word has gotten a little overused,

246
00:18:02,840 --> 00:18:04,240
a little hyped.

247
00:18:04,240 --> 00:18:14,880
But the idea is we can basically create non-fungible tokens, NFTs, for data as well as algorithms.

248
00:18:14,880 --> 00:18:25,600
So the idea is we could all write little bitty algorithms that help collect data.

249
00:18:25,600 --> 00:18:31,880
You would have ownership of the algorithms you wrote and you could potentially buy or

250
00:18:31,880 --> 00:18:32,880
sell them.

251
00:18:32,880 --> 00:18:37,400
And then people could then pay you to use your algorithms.

252
00:18:37,400 --> 00:18:43,120
And so the idea is Candice may write a nifty algorithm.

253
00:18:43,120 --> 00:18:47,240
I may use another algorithm.

254
00:18:47,240 --> 00:18:53,640
Maybe that incorporates Candice's and maybe I pay her a fee to use hers.

255
00:18:53,640 --> 00:18:55,200
Or maybe I write my own.

256
00:18:55,200 --> 00:19:00,160
And then maybe John wants to come along and then use those two algorithms to get some

257
00:19:00,160 --> 00:19:01,160
data.

258
00:19:01,160 --> 00:19:07,680
And so the idea is John could pay a nominal fee to Candice to use her algorithm or to

259
00:19:07,680 --> 00:19:08,680
me.

260
00:19:08,680 --> 00:19:12,560
Hopefully the fee's low enough that it's well worth John's while.

261
00:19:12,560 --> 00:19:15,840
So pennies on the dollar is the idea.

262
00:19:15,840 --> 00:19:20,480
So then Candice gets compensated for writing the algorithm.

263
00:19:20,480 --> 00:19:22,800
John gets the data he needs.

264
00:19:22,800 --> 00:19:25,600
It's all perfectly traceable.

265
00:19:25,600 --> 00:19:33,720
So there's no question about where did this data come from, how is the data cleaned.

266
00:19:33,720 --> 00:19:40,320
It's all essentially on the blockchain.

267
00:19:40,320 --> 00:19:42,720
We can decide which blockchain.
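To make the traceability idea concrete, here is a toy sketch of a hash-chained usage ledger: each record of someone paying a fee to use an algorithm stores the hash of the previous record, so later tampering can be detected. It is not a real blockchain, not an NFT standard, and not the project referred to here; the function names, fields, and fee amounts are all hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first record


def record_hash(body):
    """Hash a record body deterministically (sorted keys, UTF-8 JSON)."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def add_entry(ledger, algorithm, author, user, fee_cents):
    """Append a usage record that is chained to the previous record's hash."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS_HASH
    body = {
        "algorithm": algorithm,
        "author": author,
        "user": user,
        "fee_cents": fee_cents,
        "prev_hash": prev_hash,
    }
    record = dict(body, hash=record_hash(body))
    ledger.append(record)
    return record


def verify(ledger):
    """Return True if every record's hash and link to its predecessor check out."""
    prev_hash = GENESIS_HASH
    for record in ledger:
        body = {k: v for k, v in record.items() if k != "hash"}
        if record["prev_hash"] != prev_hash or record_hash(body) != record["hash"]:
            return False
        prev_hash = record["hash"]
    return True


ledger = []
add_entry(ledger, "clean_lab_results", "Candice", "John", 5)
add_entry(ledger, "match_sales_to_results", "Keegan", "John", 3)
print(verify(ledger))  # chain is intact

ledger[0]["fee_cents"] = 0  # simulate someone editing history
print(verify(ledger))  # the edit no longer matches the stored hash
```

The point of the chain is that where the data came from and who was paid for which step can be checked by anyone holding the ledger.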

268
00:19:42,720 --> 00:19:50,960
So I'll talk specifics about this more next week because this is something I've...

269
00:19:50,960 --> 00:19:53,040
I'm standing on the shoulders of giants.

270
00:19:53,040 --> 00:19:57,720
So this is a project that somebody's already done here.

271
00:19:57,720 --> 00:20:00,880
So long story short, this is a project that somebody's already done.

272
00:20:00,880 --> 00:20:04,680
We're just wrapping it up into Cannlytics.

273
00:20:04,680 --> 00:20:09,600
So if you're particularly interested in this, let me know because I'll share more about

274
00:20:09,600 --> 00:20:11,240
the project details with you.

275
00:20:11,240 --> 00:20:15,120
I'm trying to keep it a little hush hush.

276
00:20:15,120 --> 00:20:20,480
But at the same time, I think because we're this open group, I don't think closed groups

277
00:20:20,480 --> 00:20:21,640
can compete with us.

278
00:20:21,640 --> 00:20:25,080
So that's why I'm fairly open about our ideas.

279
00:20:25,080 --> 00:20:31,720
So for example, this SkunkFX model that we just built, we built that in a matter of two

280
00:20:31,720 --> 00:20:33,280
weeks or so.

281
00:20:33,280 --> 00:20:35,760
And yes, it's got its flaws.

282
00:20:35,760 --> 00:20:38,040
Yes, it can be improved.

283
00:20:38,040 --> 00:20:42,360
But this is a model that would have taken a corporation.

284
00:20:42,360 --> 00:20:43,360
What?

285
00:20:43,360 --> 00:20:47,500
I mean, they would have waited to file the trademark.

286
00:20:47,500 --> 00:20:50,320
They would have waited to file a patent.

287
00:20:50,320 --> 00:20:56,880
This would have been six months to two years out the door easily and millions of dollars

288
00:20:56,880 --> 00:20:59,240
spent on what?

289
00:20:59,240 --> 00:21:07,480
And so here we just effortlessly made a prediction model in a matter of weeks that companies

290
00:21:07,480 --> 00:21:09,880
just can't touch.

291
00:21:09,880 --> 00:21:16,520
So that's why it's been sort of slow demonstrating the power of Cannlytics.

292
00:21:16,520 --> 00:21:22,360
And I mean, we have all the greatest data scientists in the world on this.

293
00:21:22,360 --> 00:21:26,400
And the idea is now we just need to get them compensated.

294
00:21:26,400 --> 00:21:33,840
And that's exactly what the data NFTs, the algorithm NFTs bring to the picture.

295
00:21:33,840 --> 00:21:40,980
It finally adds a mechanism for all of you great contributors to finally get paid.

296
00:21:40,980 --> 00:21:44,320
So now you can get paid for your algorithms.

297
00:21:44,320 --> 00:21:46,560
You can get paid for finding data.

298
00:21:46,560 --> 00:21:49,440
You can get paid for sharing data.

299
00:21:49,440 --> 00:21:55,920
And I think this has been a critical missing piece of the puzzle.

300
00:21:55,920 --> 00:21:59,080
So this will be getting built in the coming weeks.

301
00:21:59,080 --> 00:22:03,160
So it's sort of a call to action right now.

302
00:22:03,160 --> 00:22:11,860
So if any of you want to help contribute, then at the moment I can add you on to authorship.

303
00:22:11,860 --> 00:22:13,880
You can share in the copyright.

304
00:22:13,880 --> 00:22:19,620
So at the moment, we can share copyrights and get this built.

305
00:22:19,620 --> 00:22:27,760
And then at that point, you can actually get paid for writing algorithms or cleaning data.

306
00:22:27,760 --> 00:22:31,200
And I think this is going to be revolutionary.

307
00:22:31,200 --> 00:22:35,120
So as I said, this is a project that somebody is just doing in general.

308
00:22:35,120 --> 00:22:42,180
It's just general data NFTs, general algorithm NFTs.

309
00:22:42,180 --> 00:22:47,460
And I think we can get it to catch on here in the cannabis space.

310
00:22:47,460 --> 00:22:52,120
And I foresee things like this catching on in most fields.

311
00:22:52,120 --> 00:22:57,160
And then that way, data scientists like yourself could actually get compensated.

312
00:22:57,160 --> 00:22:59,400
So I think it's a cool project.

313
00:22:59,400 --> 00:23:02,560
We'll just be an example of it.

314
00:23:02,560 --> 00:23:06,560
Keegan, can I just follow on on that?

315
00:23:06,560 --> 00:23:13,400
Because as this gets discussed within this group, I would just point out that there's

316
00:23:13,400 --> 00:23:21,040
an organization called the Ethical Data Alliance, which is absolutely doing this, has some grants

317
00:23:21,040 --> 00:23:28,000
from some IT organizations to help move this along.

318
00:23:28,000 --> 00:23:33,240
It's in the cannabis space because that's their focus.

319
00:23:33,240 --> 00:23:37,720
And they've been talking about or they are creating something called the Ethical Eden,

320
00:23:37,720 --> 00:23:41,940
whatever that is, the network that this would be on.

321
00:23:41,940 --> 00:23:51,320
So I would just say it's probably worth looking at or talking to folks in EDA just to see

322
00:23:51,320 --> 00:23:57,160
how it overlaps or complements or whatever.

323
00:23:57,160 --> 00:23:58,280
This is going on.

324
00:23:58,280 --> 00:24:01,760
And so it's worth being aware of it.

325
00:24:01,760 --> 00:24:03,240
Exactly.

326
00:24:03,240 --> 00:24:09,720
And I kind of want to tie this into something where I often say here, oh, you know, we're

327
00:24:09,720 --> 00:24:14,480
advancing cannabis science and we're advancing statistics and this and that.

328
00:24:14,480 --> 00:24:20,120
Does that mean that I'm the greatest mind out there, advancing the greatest theories, doing

329
00:24:20,120 --> 00:24:21,640
the greatest work?

330
00:24:21,640 --> 00:24:25,160
No, no, no, no, no, no, far from it.

331
00:24:25,160 --> 00:24:31,240
That was sort of the point at the very beginning is we may be some of the least skilled.

332
00:24:31,240 --> 00:24:35,160
And we're just doing the best we can.

333
00:24:35,160 --> 00:24:42,680
But the idea is if we get a bunch of small fish together, we can form a giant school

334
00:24:42,680 --> 00:24:46,040
of fish.

335
00:24:46,040 --> 00:24:49,920
And the whole idea is stand on the shoulders of giants.

336
00:24:49,920 --> 00:24:55,480
So the idea is these are awesome tools that some of the brightest minds in the world are

337
00:24:55,480 --> 00:24:56,480
making.

338
00:24:56,480 --> 00:24:59,440
We don't necessarily need to make them.

339
00:24:59,440 --> 00:25:04,720
If we use these tools, that helps advance the field.

340
00:25:04,720 --> 00:25:11,600
So in my opinion, so if we use statistics, I think that helps advance statistics.

341
00:25:11,600 --> 00:25:18,080
If we apply statistical models to the cannabis industry in a new way that nobody else has

342
00:25:18,080 --> 00:25:21,440
done before, I think that helps advance the fields.

343
00:25:21,440 --> 00:25:23,600
Are our statistics perfect?

344
00:25:23,600 --> 00:25:24,600
No.

345
00:25:24,600 --> 00:25:26,680
Does the analysis need to be done again?

346
00:25:26,680 --> 00:25:27,680
Yes.

347
00:25:27,680 --> 00:25:29,880
Does the analysis need to be done more rigorously?

348
00:25:29,880 --> 00:25:30,880
Yes.

349
00:25:30,880 --> 00:25:32,880
Do we need to do more homework?

350
00:25:32,880 --> 00:25:33,880
Yes.

351
00:25:33,880 --> 00:25:35,960
Are we asking novel questions?

352
00:25:35,960 --> 00:25:37,720
I hope so.

353
00:25:37,720 --> 00:25:40,680
Are we applying methods in novel ways?

354
00:25:40,680 --> 00:25:41,920
I hope so.

355
00:25:41,920 --> 00:25:48,280
So I think that's sort of the idea behind the contribution we make is somebody needs

356
00:25:48,280 --> 00:25:51,800
to come back and do it much, much better.

357
00:25:51,800 --> 00:25:54,840
But it's sort of like the law of large numbers, right?

358
00:25:54,840 --> 00:26:01,880
You need to do a scientific test, right, thousands and thousands of times.

359
00:26:01,880 --> 00:26:05,800
But the first one provides a lot of value.

360
00:26:05,800 --> 00:26:08,400
And it could be wildly, wildly off.

361
00:26:08,400 --> 00:26:10,440
But it gets people thinking.

362
00:26:10,440 --> 00:26:13,480
And then somebody can come along and test it again.

363
00:26:13,480 --> 00:26:16,040
And then test it again, test it again, test it again.

364
00:26:16,040 --> 00:26:20,400
And so what is the actual result?

365
00:26:20,400 --> 00:26:23,280
It easily could change over time.
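The law-of-large-numbers point can be illustrated with a small simulation: a single noisy test may land far from the truth, while the running average over many repeated tests settles near it. The true value, the noise level, and the seed below are arbitrary choices for illustration only, not anything measured.

```python
import random


def running_means(true_value, noise_sd, n_trials, seed=0):
    """Simulate repeated noisy tests and return the running average after each one."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    means = []
    for i in range(1, n_trials + 1):
        total += rng.gauss(true_value, noise_sd)  # one noisy test result
        means.append(total / i)
    return means


means = running_means(true_value=10.0, noise_sd=5.0, n_trials=10_000)
print(f"estimate after 1 test: {means[0]:.2f}")
print(f"estimate after 10,000 tests: {means[-1]:.2f}")
```

The first estimate is still useful as a starting point; it is the later repetitions that show how far off it was.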

366
00:26:23,280 --> 00:26:27,000
So as I said, take all of our analysis with a grain of salt.

367
00:26:27,000 --> 00:26:35,040
But I think the value we add is we may introduce an idea into the space, at which point better

368
00:26:35,040 --> 00:26:39,000
and greater minds can iterate upon this.

369
00:26:39,000 --> 00:26:46,920
So that's what I mean when I say I think we're advancing cannabis science is I think we're

370
00:26:46,920 --> 00:26:49,800
just treading on ground where no one's gone before.

371
00:26:49,800 --> 00:26:52,400
And I think there's value to that.

372
00:26:52,400 --> 00:26:56,600
But we do need to do so in an ethical manner, right?

373
00:26:56,600 --> 00:27:02,000
When you're treading on new ground, you can't just be leaving a bunch of garbage all over

374
00:27:02,000 --> 00:27:09,360
the place or making a mess for people who may wish to follow in your footsteps.

375
00:27:09,360 --> 00:27:15,400
So we do need to tread carefully.

376
00:27:15,400 --> 00:27:17,560
So that's getting a little abstract.

377
00:27:17,560 --> 00:27:25,960
So if you're interested, I've got a quick 15 minute presentation to share with you.

378
00:27:25,960 --> 00:27:29,840
And then John, do you have any work to share or?

379
00:27:29,840 --> 00:27:33,720
I have a couple points I wouldn't mind bringing up to the group at this point.

380
00:27:33,720 --> 00:27:38,760
So sure, but if there's time, would you mind?

381
00:27:38,760 --> 00:27:46,120
Well, would you mind if I ran through just a... Please.

382
00:27:46,120 --> 00:27:53,560
Well, actually, actually, if you're OK, I'll just run through just a teaser of the work

383
00:27:53,560 --> 00:27:55,520
I'll be working on.

384
00:27:55,520 --> 00:28:01,880
And if any of you want to lend a hand... and then, John, I'll give the floor to you

385
00:28:01,880 --> 00:28:04,800
for the last 15 minutes or so for you to.

386
00:28:04,800 --> 00:28:06,520
I don't even need that much.

387
00:28:06,520 --> 00:28:14,520
OK, well, without further ado, I'll just share with you real, real quick what I had been

388
00:28:14,520 --> 00:28:16,320
thinking about.

389
00:28:16,320 --> 00:28:22,240
So you can think of this what you will, but this is sort of the point about Cannlytics

390
00:28:22,240 --> 00:28:24,800
is what do I value?

391
00:28:24,800 --> 00:28:31,600
So this is just a fictitious quote that I heard one time that I really liked a lot.

392
00:28:31,600 --> 00:28:37,800
And so to me, I personally think code is maybe one of the most valuable things out there.

393
00:28:37,800 --> 00:28:40,760
So that's what I value.

394
00:28:40,760 --> 00:28:43,720
And that's what Cannlytics values.

395
00:28:43,720 --> 00:28:49,560
So if any of you want to share in the copyright of the code, I think that's a valuable thing.

396
00:28:49,560 --> 00:28:53,000
And so I'm more than happy to share that with you.

397
00:28:53,000 --> 00:28:59,460
So please, please email me if you have any contribution and a contribution can mean anything.

398
00:28:59,460 --> 00:29:05,240
So really, you know, if you help share ideas or share data sets with me, I would

399
00:29:05,240 --> 00:29:07,280
love to share authorship with you.

400
00:29:07,280 --> 00:29:10,040
And I haven't been very good about that in the past.

401
00:29:10,040 --> 00:29:14,640
So please be in touch with me and I'll be happy to share authorship with people.

402
00:29:14,640 --> 00:29:15,640
Abundance.

403
00:29:15,640 --> 00:29:23,700
Next, going to be rigorous here about following the scientific process.

404
00:29:23,700 --> 00:29:29,300
So I'm just going to outline real quick the steps of a quick research question.

405
00:29:29,300 --> 00:29:36,120
So basically, the research question that I posed, but I kind of want to formally look

406
00:29:36,120 --> 00:29:43,800
through it in a scientific manner is I've got this hypothesis that a cannabis consumer's

407
00:29:43,800 --> 00:29:50,560
personality affects the type of cannabis that they choose to purchase.

408
00:29:50,560 --> 00:30:00,680
So just to kind of just, you know, stab in the dark, I was thinking that, oh, you know,

409
00:30:00,680 --> 00:30:08,120
maybe just, you know, a hypothesis is maybe a little more than a guess, right?

410
00:30:08,120 --> 00:30:17,400
It's a guess or, you know, my belief from just what I've observed in the world.

411
00:30:17,400 --> 00:30:21,680
And basically that, right, there's introverts and extroverts.

412
00:30:21,680 --> 00:30:23,960
It's a bit of a scale.

413
00:30:23,960 --> 00:30:26,560
It's not a zero or one thing.

414
00:30:26,560 --> 00:30:31,680
So we're expecting a scale and there's other personality traits, but that's maybe one of

415
00:30:31,680 --> 00:30:33,160
them.

416
00:30:33,160 --> 00:30:37,300
And maybe they like cannabis differently.

417
00:30:37,300 --> 00:30:45,640
So I've got this hunch that maybe extroverts may like more of the indica-type strains.

418
00:30:45,640 --> 00:30:52,560
And then maybe the introverts may like more of the sativa-like strains.

419
00:30:52,560 --> 00:30:54,320
Is there anything to that?

420
00:30:54,320 --> 00:30:55,540
Who knows?

421
00:30:55,540 --> 00:30:59,640
We can kind of test it statistically.

422
00:30:59,640 --> 00:31:01,800
So how do we go about doing that?

423
00:31:01,800 --> 00:31:09,960
Well, first, we need to look at the literature to see who's talked about this before and

424
00:31:09,960 --> 00:31:16,440
what concepts in the scientific literature can we use.

425
00:31:16,440 --> 00:31:26,280
And the big one I wanted to use is one, revealed preference, in that an economist would say

426
00:31:26,280 --> 00:31:34,280
you can't just ask somebody about themselves, about their preferences.

427
00:31:34,280 --> 00:31:37,640
You have to determine that from looking at their actions.

428
00:31:37,640 --> 00:31:44,720
So the idea here is we can't just ask somebody what their personality is.

429
00:31:44,720 --> 00:31:51,800
We have to determine what their personality is from their actions.

430
00:31:51,800 --> 00:31:53,440
Same with purchasing patterns.

431
00:31:53,440 --> 00:31:56,720
We can't just ask people what they want to purchase.

432
00:31:56,720 --> 00:32:00,160
We actually have to observe what they purchase.

433
00:32:00,160 --> 00:32:07,040
Next, what are we even talking about when we're talking about personality?

434
00:32:07,040 --> 00:32:13,640
Well, this is cool because now is where we kind of get to sample from different fields.

435
00:32:13,640 --> 00:32:22,320
So we get to sample from psychology and say, cool, the psychologists have studied personality

436
00:32:22,320 --> 00:32:31,320
and they've widely agreed upon five personality traits that kind of are on a scale.

437
00:32:31,320 --> 00:32:34,360
So I'm treating it as zero to one.

438
00:32:34,360 --> 00:32:46,160
So five traits on a scale, widely criticized and also widely referenced by psychologists.

439
00:32:46,160 --> 00:32:47,160
Cool.

440
00:32:47,160 --> 00:32:54,840
So we've got some shoulders of giants to stand upon.

441
00:32:54,840 --> 00:32:59,600
Next, we need our actual methodology.

442
00:32:59,600 --> 00:33:06,440
So what's our actual statistical framework that we'll be using?

443
00:33:06,440 --> 00:33:15,280
So I picked the simplest statistical model in the book, the ordinary least squares regression.

444
00:33:15,280 --> 00:33:22,700
I suppose difference of means or something like that may technically be a simpler method

445
00:33:22,700 --> 00:33:26,800
of analysis, but this is pretty simple.

446
00:33:26,800 --> 00:33:33,160
The idea here is how are we even measuring indica versus sativa?

447
00:33:33,160 --> 00:33:42,520
Well, thanks to awesome work being done by John, it's appearing that the ratio between

448
00:33:42,520 --> 00:33:51,880
beta-pinene and d-limonene may better characterize the sativa-indica dichotomy.

449
00:33:51,880 --> 00:34:02,120
So instead of using a dichotomy, we'll use a continuous variable R, which is going to

450
00:34:02,120 --> 00:34:07,840
be the ratio between beta-pinene and d-limonene.

451
00:34:07,840 --> 00:34:10,680
We'll take the average by user.

452
00:34:10,680 --> 00:34:17,920
So maybe a user may pick a bunch of different strains, but maybe on average they'll pick

453
00:34:17,920 --> 00:34:21,600
out hopefully strains that they'd like.

454
00:34:21,600 --> 00:34:27,960
Or they may accidentally pick out a strain that's high in d-limonene and not like it.

455
00:34:27,960 --> 00:34:33,620
Or they may accidentally pick out a strain that's high in beta-pinene, love it, not like

456
00:34:33,620 --> 00:34:35,000
it, who knows?

457
00:34:35,000 --> 00:34:42,280
The idea is after they try many, many, many samples, they'll on average hopefully be picking

458
00:34:42,280 --> 00:34:45,780
out the ones that they like.
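The per-user average described here can be sketched in a few lines. This is only an illustration under stated assumptions: the record layout, the user IDs, and the terpene percentages are all made up, and skipping purchases with no detected d-limonene is one possible choice rather than the method used in the talk.

```python
from collections import defaultdict

# Hypothetical purchase records: (user_id, beta_pinene_percent, d_limonene_percent).
purchases = [
    ("user_a", 0.10, 0.40),
    ("user_a", 0.20, 0.40),
    ("user_b", 0.30, 0.15),
    ("user_b", 0.45, 0.15),
    ("user_b", 0.00, 0.00),  # no usable terpene data; skipped below
]


def average_ratio_by_user(records):
    """Return {user_id: mean beta-pinene / d-limonene ratio over that user's purchases}."""
    ratios = defaultdict(list)
    for user_id, beta_pinene, d_limonene in records:
        if d_limonene > 0:  # avoid dividing by zero when d-limonene was not detected
            ratios[user_id].append(beta_pinene / d_limonene)
    return {user: sum(values) / len(values) for user, values in ratios.items()}


r_by_user = average_ratio_by_user(purchases)
```

Each user ends up with a single continuous value R, which is what the regression would then try to explain.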

459
00:34:45,780 --> 00:34:48,040
Will this hold up to the data?

460
00:34:48,040 --> 00:34:49,840
We'll see.

461
00:34:49,840 --> 00:35:00,680
And then the idea is we can now have the five personality traits, openness, conscientiousness,

462
00:35:00,680 --> 00:35:06,280
extraversion, agreeableness, and neuroticism, which is basically sensitivity.

463
00:35:06,280 --> 00:35:14,160
I think they called it emotional sensitivity or something like that.
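As a rough sketch of the regression being described, the average ratio R can be regressed on the five trait scores with ordinary least squares. Everything below is assumed for illustration: the trait scores are synthetic, the "true" coefficients are invented so that the fit can be checked, and the solver is a plain normal-equations implementation rather than whatever tooling the group actually uses.

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    X is a list of rows without an intercept column.
    Returns [intercept, b1, ..., bk].
    """
    rows = [[1.0] + list(row) for row in X]  # prepend the intercept column
    k = len(rows[0])
    # Build the augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(A[idx][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("regressors are collinear; coefficients not identified")
        A[col], A[pivot] = A[pivot], A[col]
        for row_idx in range(k):
            if row_idx != col:
                factor = A[row_idx][col] / A[col][col]
                A[row_idx] = [a - factor * b for a, b in zip(A[row_idx], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


TRAITS = ["openness", "conscientiousness", "extraversion", "agreeableness", "neuroticism"]

# Synthetic trait scores on a 0-to-1 scale, one row per user (not real data).
X = [
    [0.5, 0.5, 0.5, 0.5, 0.5],
    [0.9, 0.5, 0.5, 0.5, 0.5],
    [0.5, 0.9, 0.5, 0.5, 0.5],
    [0.5, 0.5, 0.9, 0.5, 0.5],
    [0.5, 0.5, 0.5, 0.9, 0.5],
    [0.5, 0.5, 0.5, 0.5, 0.9],
    [0.1, 0.7, 0.2, 0.8, 0.6],
    [0.3, 0.2, 0.8, 0.4, 0.1],
]

# Made-up coefficients, used only to generate noise-free ratios so the fit can
# be checked; they say nothing about real consumers.
true_beta = [1.5, 0.1, 0.0, -0.8, 0.2, 0.3]
r = [true_beta[0] + sum(b * x for b, x in zip(true_beta[1:], row)) for row in X]

beta_hat = ols(X, r)
for name, coefficient in zip(["intercept"] + TRAITS, beta_hat):
    print(f"{name}: {coefficient:.3f}")
```

With real data the ratios would be noisy, so the estimates would come with standard errors that this sketch does not compute.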

464
00:35:14,160 --> 00:35:16,360
Closed to new ideas is neuroticism.

465
00:35:16,360 --> 00:35:18,920
Say that one more time.

466
00:35:18,920 --> 00:35:23,000
Closed to new ideas.

467
00:35:23,000 --> 00:35:26,600
The emotional one, the neuroticism?

468
00:35:26,600 --> 00:35:34,360
I think the comparison is openness or closedness to new ideas.

469
00:35:34,360 --> 00:35:36,520
And I think that's what falls into neurotic.

470
00:35:36,520 --> 00:35:40,880
It's considered closure, a lack of openness.

471
00:35:40,880 --> 00:35:44,080
So I've got to do more research myself.

472
00:35:44,080 --> 00:35:48,640
I think closedness would be the opposite of openness.

473
00:35:48,640 --> 00:35:52,680
I think this is basically emotional stability.

474
00:35:52,680 --> 00:36:00,560
So are you highly emotional or are you stoic?

475
00:36:00,560 --> 00:36:07,720
Do you let your emotions show or not is the neuroticism one, is the way I understand that

476
00:36:07,720 --> 00:36:10,240
one.

477
00:36:10,240 --> 00:36:18,360
And then agreeableness, agreeable, disagreeable, extraversion, introversion.

478
00:36:18,360 --> 00:36:26,480
I don't really know what the... I think it's conscientious versus intuitive.

479
00:36:26,480 --> 00:36:31,680
So is this, like I said, I don't know too much about this.

480
00:36:31,680 --> 00:36:35,560
This one may be the logic versus feeling one.

481
00:36:35,560 --> 00:36:42,040
But anywho, please read up on these because this is a deep, deep subject, right?

482
00:36:42,040 --> 00:36:49,480
I mean, this is the field of psychology, so there's a lot of meat here.

483
00:36:49,480 --> 00:36:55,720
But the idea is if we scale these zero to one, right?

484
00:36:55,720 --> 00:36:56,720
Probably no one's...

485
00:36:56,720 --> 00:36:57,720
I have a question about that.

486
00:36:57,720 --> 00:37:03,440
If you go in and take all of those personality types and scale it zero to one, you should

487
00:37:03,440 --> 00:37:06,960
have two words there and not just one term.

488
00:37:06,960 --> 00:37:16,520
It should be openness, closedness; conscientiousness, whatever the opposite of that is.

489
00:37:16,520 --> 00:37:17,840
You're 100% right.

490
00:37:17,840 --> 00:37:24,120
And so I just grabbed these from Wikipedia, but please double check me on this because

491
00:37:24,120 --> 00:37:33,160
the idea is these words are supposed to convey a scale.

492
00:37:33,160 --> 00:37:39,960
It's basically like zero to one extraversion, zero to one agreeableness.

493
00:37:39,960 --> 00:37:46,440
So would zero be introversion or extraversion and one, which would one be?

494
00:37:46,440 --> 00:37:47,440
Good point.

495
00:37:47,440 --> 00:37:55,200
I'm basically treating one as extroverted and zero as introverted.

496
00:37:55,200 --> 00:38:05,440
Can I just make a comment here because this is precisely what we're facing as we talk

497
00:38:05,440 --> 00:38:10,320
about these kinds of scales in our dosing project work.

498
00:38:10,320 --> 00:38:17,120
We have a really great collaborator who's a clinical psychologist in the form of Jeffrey

499
00:38:17,120 --> 00:38:20,040
Tarrant out of Eugene, Oregon.

500
00:38:20,040 --> 00:38:22,800
And so we do spend time talking about this.

501
00:38:22,800 --> 00:38:27,760
And point of fact, in some of our most recent discussions, he says, no, don't go binary

502
00:38:27,760 --> 00:38:29,600
on this.

503
00:38:29,600 --> 00:38:33,560
It's better to do it as a bit of an ordinal scale.

504
00:38:33,560 --> 00:38:37,320
It's got a term, I think it's called a Likert scale or something.

505
00:38:37,320 --> 00:38:44,640
And he's convinced us that as we do survey work, we need to not be doing it binary, but

506
00:38:44,640 --> 00:38:51,560
in fact asking folks for if they're responding in the moment or whatever, but it needs to

507
00:38:51,560 --> 00:38:56,760
be kind of a scale, Jerry, kind of along the way you're saying.
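One way to reconcile an ordinal survey scale with the zero-to-one trait scores mentioned earlier is a simple min-max rescaling. The sketch below assumes a 1-to-5 response format and assumes that higher answers lean extroverted; both are illustrative choices, and the answers themselves are made up.

```python
def scale_likert(response, low=1, high=5):
    """Map an ordinal survey response in [low, high] onto the 0-to-1 range."""
    if not low <= response <= high:
        raise ValueError(f"response {response} is outside {low}..{high}")
    return (response - low) / (high - low)


# Hypothetical extraversion items answered on a 1-5 scale, where 1 leans
# introverted and 5 leans extroverted (the anchoring is an assumption).
answers = [4, 5, 3, 4]
extraversion = sum(scale_likert(a) for a in answers) / len(answers)
```

Naming both ends of the scale in the function's documentation is a cheap way to keep the direction of zero and one unambiguous.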

508
00:38:56,760 --> 00:38:57,760
I went into this.

509
00:38:57,760 --> 00:38:59,280
I'm so impressed with the binary.

510
00:38:59,280 --> 00:39:02,800
That just poses something close to the link that I looked at.

511
00:39:02,800 --> 00:39:07,240
And I think that explains it very well.

512
00:39:07,240 --> 00:39:08,240
Okay.

513
00:39:08,240 --> 00:39:11,280
But you're 100% right.

514
00:39:11,280 --> 00:39:15,320
I put it in the chat.

515
00:39:15,320 --> 00:39:19,480
So there's different words for these.

516
00:39:19,480 --> 00:39:24,760
And so that's actually one of the challenges you'll run into here in the psychology literature

517
00:39:24,760 --> 00:39:30,760
is they're kind of always using different terminology for these.

518
00:39:30,760 --> 00:39:37,400
But I just tried to standardize it the best I could.

519
00:39:37,400 --> 00:39:39,960
But you're 100% right, Jerry.

520
00:39:39,960 --> 00:39:40,960
And that...

521
00:39:40,960 --> 00:39:48,240
Well, if you look at the link Candice put in the chat, it gives you both sides

522
00:39:48,240 --> 00:39:49,240
of the scale.

523
00:39:49,240 --> 00:39:50,240
Spot on.

524
00:39:50,240 --> 00:39:53,960
So we clearly raised...

525
00:39:53,960 --> 00:40:00,360
So this is what's awesome about presenting this research to you is we get to find out

526
00:40:00,360 --> 00:40:04,880
really quickly what points need more clarification.

527
00:40:04,880 --> 00:40:08,880
And that's what's awesome about working with you all.

528
00:40:08,880 --> 00:40:11,600
So this is not just...

529
00:40:11,600 --> 00:40:14,160
This is a team effort.

530
00:40:14,160 --> 00:40:22,960
In my book, this is team research because your feedback is, in my opinion, part of the

531
00:40:22,960 --> 00:40:24,640
most critical step.

532
00:40:24,640 --> 00:40:32,320
So here, I presented an idea to you and then you bounced back that, hey, we need to much

533
00:40:32,320 --> 00:40:37,080
more clearly define these personality traits.

534
00:40:37,080 --> 00:40:43,520
The scale seems like a good idea, but we need to clearly define both ends of the scale.

535
00:40:43,520 --> 00:40:44,520
Awesome.

536
00:40:44,520 --> 00:40:53,200
So essentially all the points you raised have already improved the research.

537
00:40:53,200 --> 00:41:01,720
So just by coming and sharing your thoughts, you, in my opinion, helped advance this research

538
00:41:01,720 --> 00:41:05,920
question, which I feel is advancing cannabis science.

539
00:41:05,920 --> 00:41:07,320
So that's what I mean by...

540
00:41:07,320 --> 00:41:13,840
When I say, by you coming, I think you're helping advance cannabis science.

541
00:41:13,840 --> 00:41:21,160
As I said, it may be a snowflake, but snowflakes add up.

542
00:41:21,160 --> 00:41:30,600
So Keegan, if this really is of interest, there are established scales for this.

543
00:41:30,600 --> 00:41:36,300
In particular, we're very familiar with one called Brunel Mood that we have used with

544
00:41:36,300 --> 00:41:38,640
cannabis data.

545
00:41:38,640 --> 00:41:46,720
It allows respondents to respond to a series of words that roll up into key moods.

546
00:41:46,720 --> 00:41:55,040
We've looked, or we can continue to look, at how the acute cannabis response changes

547
00:41:55,040 --> 00:41:56,220
your mood.

548
00:41:56,220 --> 00:41:58,880
But one needs to select a scale.

549
00:41:58,880 --> 00:42:00,760
Scale means a survey.

550
00:42:00,760 --> 00:42:10,080
So for example, if folks want to go down this pathway, I would advocate looking very seriously

551
00:42:10,080 --> 00:42:14,000
at established scales like Brunel Mood.

552
00:42:14,000 --> 00:42:17,240
It's quite robust.

553
00:42:17,240 --> 00:42:18,240
I love it, John.

554
00:42:18,240 --> 00:42:23,080
And I love that you've already researched this topic.

555
00:42:23,080 --> 00:42:28,800
I could present on this at a future meeting because, yeah, we have a good number of years

556
00:42:28,800 --> 00:42:32,440
working together with Jeff Tarrant in Eugene on this.

557
00:42:32,440 --> 00:42:38,280
So if this is relevant, let's, in a future meeting, I'll walk you through some of our

558
00:42:38,280 --> 00:42:40,920
BRUMS data or BRUMS interpretations.

559
00:42:40,920 --> 00:42:44,320
Well, we may have an awesome platform for you.

560
00:42:44,320 --> 00:42:53,360
So hold tight because we may actually have a good time for you to present your...

561
00:42:53,360 --> 00:42:58,240
John, could you put that scale in the chat so we get the correct spelling?

562
00:42:58,240 --> 00:43:03,480
You know, Jerry, I don't do that kind of stuff very well.

563
00:43:03,480 --> 00:43:06,800
Can I just suggest that you look up Brunel Mood?

564
00:43:06,800 --> 00:43:07,800
I can spell it.

565
00:43:07,800 --> 00:43:09,800
Spell it for me, yeah.

566
00:43:09,800 --> 00:43:11,800
Yeah, B-R-U-N-E-L.

567
00:43:11,800 --> 00:43:12,800
Yep.

568
00:43:12,800 --> 00:43:15,440
You'll have no problem finding it.

569
00:43:15,440 --> 00:43:17,120
Okay, Brunel Mood.

570
00:43:17,120 --> 00:43:23,360
Brunel Mood or BRUMS; it's sometimes called the Brunel Mood Scale, B-R-U-M-S.

571
00:43:23,360 --> 00:43:24,640
We'll find it relevant for it.

572
00:43:24,640 --> 00:43:25,640
Thank you.

573
00:43:25,640 --> 00:43:26,640
Very widely available.

574
00:43:26,640 --> 00:43:36,680
And John, you've now helped our literature review because what my literature review failed

575
00:43:36,680 --> 00:43:40,720
to unearth, you effortlessly provided.

576
00:43:40,720 --> 00:43:46,840
So now here I'm shooting in the dark and all of a sudden you just now pointed us to, oh,

577
00:43:46,840 --> 00:43:49,480
you know, this is the metric you need to use.

578
00:43:49,480 --> 00:43:55,040
And so this is sort of my point is, you know, instead of me now going off on a wild goose

579
00:43:55,040 --> 00:44:05,200
chase for months on end, which I see happen all the time, like at universities and whatnot,

580
00:44:05,200 --> 00:44:06,200
right?

581
00:44:06,200 --> 00:44:09,720
I could have gone on a six month goose chase.

582
00:44:09,720 --> 00:44:13,560
And then I would have then presented my work to John and he said, oh, have you tried the

583
00:44:13,560 --> 00:44:16,320
Brunel Mood test scale?

584
00:44:16,320 --> 00:44:19,720
And then I would have been like, oh, well, no, back to the drawing board.

585
00:44:19,720 --> 00:44:26,680
And so basically now instead of wasting all that time, I thought of this and then one

586
00:44:26,680 --> 00:44:32,320
day later I get to bounce it off of John, and it gets bounced back to my court.

587
00:44:32,320 --> 00:44:43,480
So the idea is we can just do this rapid, rapid research that when you spend a bit of

588
00:44:43,480 --> 00:44:54,920
time around, you know, university professors, one time a university professor specifically told me like, you know, things move slowly here.

589
00:44:54,920 --> 00:44:56,840
You just have to get used to that.

590
00:44:56,840 --> 00:45:01,720
So I'm sort of a go, go, go type of person.

591
00:45:01,720 --> 00:45:07,480
And so whether research should be done that way or not, that's an opinion.

592
00:45:07,480 --> 00:45:12,560
But I think in this open manner, we can do research quickly.

593
00:45:12,560 --> 00:45:21,360
And at the Brunel Mood, and I just put the link in the chat of an assessment, and it

594
00:45:21,360 --> 00:45:31,440
may be too, it may be so detailed as to make any result that we would come up with be insignificant.

595
00:45:31,440 --> 00:45:40,560
I mean, it has 24 different mood characteristics, everything from panicky to uncertain.

596
00:45:40,560 --> 00:45:43,080
We want to add rigor to our analysis, Jerry.

597
00:45:43,080 --> 00:45:44,400
So this could be a good way to do it.

598
00:45:44,400 --> 00:45:49,360
So Jerry, we use it in the acute setting.

599
00:45:49,360 --> 00:45:54,240
So we ask folks before they get high to fill this out.

600
00:45:54,240 --> 00:45:57,680
You fill it out really fast because it's simple.

601
00:45:57,680 --> 00:46:04,680
It's like a button click for does the word apply or how much the word applies at that moment.

602
00:46:04,680 --> 00:46:09,280
I think what Keegan's looking for is personality type and not mood.

603
00:46:09,280 --> 00:46:15,040
Yeah, and so BRUMS is an example of mood.

604
00:46:15,040 --> 00:46:21,280
If we're doing personality, we would probably find another scale or need

605
00:46:21,280 --> 00:46:22,280
to.

606
00:46:22,280 --> 00:46:27,880
I'm just pointing out that we have experience with the mood, the acute mood change using

607
00:46:27,880 --> 00:46:28,880
Brunel.

608
00:46:28,880 --> 00:46:30,880
OK, it's actually interesting.

609
00:46:30,880 --> 00:46:37,040
The real point I was trying to make is not binary, but this Likert scale or whatever

610
00:46:37,040 --> 00:46:41,360
is being important because I think I hope that's clear.

611
00:46:41,360 --> 00:46:45,880
You know, the binary assessment is such a cool way of doing things.

612
00:46:45,880 --> 00:46:54,480
But I think we lose, unfortunately, detail that way, or the ability to find

613
00:46:54,480 --> 00:46:57,200
correlates or factor analysis or whatever.

614
00:46:57,200 --> 00:47:03,720
If we're in binary mode, is what Tarrant was discussing with us this past week.

615
00:47:03,720 --> 00:47:11,160
I like your mood analysis, and that actually will be relevant to something

616
00:47:11,160 --> 00:47:13,200
quick I want to show you today.

617
00:47:13,200 --> 00:47:18,580
But the mood analysis could also be done to see if that's relevant.

618
00:47:18,580 --> 00:47:24,920
But the other point I wanted to raise is the economist would come in and say, well, well,

619
00:47:24,920 --> 00:47:30,480
well, well, well, well, you know, you can't just be asking people about, you know, their

620
00:47:30,480 --> 00:47:33,560
personality traits here.

621
00:47:33,560 --> 00:47:39,960
You know, what if somebody wanted to mislead you for some reason or another?

622
00:47:39,960 --> 00:47:47,680
Or what if somebody just doesn't have time to take a personality test or this or that?

623
00:47:47,680 --> 00:47:54,960
Well, this was a novel idea that other people have had, but we can now use it here in the

624
00:47:54,960 --> 00:47:56,600
cannabis industry.

625
00:47:56,600 --> 00:48:05,960
So I found a handful of examples of people doing this, as I said, on Twitter posts and

626
00:48:05,960 --> 00:48:09,080
I saw eBay things.

627
00:48:09,080 --> 00:48:14,800
So there's many uses for this, but no one's done it in the cannabis industry yet.

628
00:48:14,800 --> 00:48:18,200
So I think we could maybe take a stab at it.

629
00:48:18,200 --> 00:48:26,980
And OK, the idea is we've collected the consumer purchases and the lab results.

630
00:48:26,980 --> 00:48:32,440
We also have the reviews for these purchases.

631
00:48:32,440 --> 00:48:40,880
So now what we could do is we could use natural language processing.

632
00:48:40,880 --> 00:48:50,200
Look at somebody's review and try to scale it on zero to one for each of the five personality

633
00:48:50,200 --> 00:48:51,200
traits.

634
00:48:51,200 --> 00:48:54,720
And why do I think that we can do this?

635
00:48:54,720 --> 00:49:02,720
Well, there's been psychologists that have hypothesized that personality traits would

636
00:49:02,720 --> 00:49:14,520
probably show up in language and specifically that perhaps, you know, really important personality

637
00:49:14,520 --> 00:49:20,320
traits may often get boiled down to single words.

638
00:49:20,320 --> 00:49:23,440
So it may not be the case.

639
00:49:23,440 --> 00:49:25,200
It's not tested.

640
00:49:25,200 --> 00:49:27,800
This is just a hypothesis.

641
00:49:27,800 --> 00:49:29,880
Or it's not a theory yet.

642
00:49:29,880 --> 00:49:31,160
It's just a hypothesis.

643
00:49:31,160 --> 00:49:33,080
We need to rigorously test it.

644
00:49:33,080 --> 00:49:40,880
But, I mean, what if it's the case that certain words are only used by really extroverted

645
00:49:40,880 --> 00:49:49,880
people or maybe people who are really high in the degree of openness may use the word

646
00:49:49,880 --> 00:49:56,360
open a lot or, you know, who knows what this is?

647
00:49:56,360 --> 00:50:02,800
The idea is, you know, we could just read through a bunch of people's reviews and do

648
00:50:02,800 --> 00:50:04,960
the best we could at it.

649
00:50:04,960 --> 00:50:13,320
But we could, the idea is, use natural language processing and spend much, much more time

650
00:50:13,320 --> 00:50:14,320
on it.

651
00:50:14,320 --> 00:50:20,240
So instead of reading through each sentence, you know, really, really carefully, you know,

652
00:50:20,240 --> 00:50:28,480
we can just let the computer read through these sentences at just lightning fast rate,

653
00:50:28,480 --> 00:50:33,440
perform logical operations at lightning fast rates.

654
00:50:33,440 --> 00:50:40,640
So all it's doing is what a human would do, which would be read the review.

655
00:50:40,640 --> 00:50:41,880
Okay.

656
00:50:41,880 --> 00:50:48,760
You know, we could train this on some data so we can actually say, okay, here's a data

657
00:50:48,760 --> 00:50:52,360
set where someone's taken a psychology test.

658
00:50:52,360 --> 00:50:55,600
Here are their actual traits.

659
00:50:55,600 --> 00:51:01,600
And then we can, you know, see how well we can predict those with the language they use.

660
00:51:01,600 --> 00:51:05,280
So that's the idea.
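A deliberately naive sketch of this train-then-predict idea follows. It is not the model proposed in the talk: the training texts and their extraversion scores are invented, and scoring each word by the mean trait score of the texts it appears in is just one simple stand-in for a proper natural language processing model.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase a review and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def fit_word_scores(texts, scores):
    """Give each word the mean trait score of the training texts it appears in."""
    totals, counts = defaultdict(float), defaultdict(int)
    for text, score in zip(texts, scores):
        for word in set(tokenize(text)):
            totals[word] += score
            counts[word] += 1
    return {word: totals[word] / counts[word] for word in totals}


def predict(text, word_scores, default=0.5):
    """Predict a 0-to-1 trait score as the mean score of the known words."""
    known = [word_scores[w] for w in tokenize(text) if w in word_scores]
    return sum(known) / len(known) if known else default


# Made-up training pairs: (text, extraversion score from a personality test).
train = [
    ("shared this with friends at a party", 0.9),
    ("great for a loud night out with friends", 0.8),
    ("nice for reading alone at home", 0.2),
    ("quiet evening alone with a book", 0.1),
]
word_scores = fit_word_scores([t for t, _ in train], [s for _, s in train])

# Held-out reviews, to check predictions on text the model has not seen.
social = predict("perfect for a party with friends", word_scores)
solitary = predict("relaxing alone at home", word_scores)
```

Comparing predictions against known test scores on held-out text is the step that would show whether language carries any signal about personality at all.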

661
00:51:05,280 --> 00:51:13,200
And the platform for this is, so I'll just show you an example.

662
00:51:13,200 --> 00:51:19,520
And I was thinking about, I was going to float it by the group here before, you know, formally

663
00:51:19,520 --> 00:51:25,720
booking it, but I was thinking, okay, Saturday morning is maybe not the best time for people.

664
00:51:25,720 --> 00:51:33,320
So I was thinking about moving Saturday morning statistics to Thursday.

665
00:51:33,320 --> 00:51:39,200
And I was just going to see if anyone has thoughts about this and if this time may work for you.

666
00:51:39,200 --> 00:51:48,360
But maybe, you know, Thursday, I should have said from 4:20 to about 5:30 Eastern Standard

667
00:51:48,360 --> 00:51:49,360
Time.

668
00:51:49,360 --> 00:51:53,640
So the first 10 minutes or so will be pretty casual.

669
00:51:53,640 --> 00:51:59,200
And then we'll do about an hour of, you know, rigorous analytics.

670
00:51:59,200 --> 00:52:04,360
So that way, you know, the Wednesday group, we're just kind of having this light discussion.

671
00:52:04,360 --> 00:52:12,960
And then, you know, next Thursday, we'll actually get into the nitty gritty of this and actually

672
00:52:12,960 --> 00:52:19,720
go through all of the, right, because, you know, a criticism that's been levied is, okay,

673
00:52:19,720 --> 00:52:21,720
we don't, we're not rigorous enough.

674
00:52:21,720 --> 00:52:25,240
We don't do enough training and testing of our models.

675
00:52:25,240 --> 00:52:34,200
And we may be providing misleading information because we're giving the illusion that we're

676
00:52:34,200 --> 00:52:42,920
doing statistics, but the statistics aren't robust and people may be drawing incorrect

677
00:52:42,920 --> 00:52:45,720
inferences from those statistics.

678
00:52:45,720 --> 00:52:51,240
So the idea is we'll do a really, really robust session here.

679
00:52:51,240 --> 00:53:02,600
So that way, you know, the Wednesday group, pretty casual and then hardcore on Thursday

680
00:53:02,600 --> 00:53:11,360
where rigorous statistics, you get the results, we actually will have some faith in our results,

681
00:53:11,360 --> 00:53:12,360
right?

682
00:53:12,360 --> 00:53:21,520
So instead of just saying, oh, this is a mock or this is a proof of concept, no, like where

683
00:53:21,520 --> 00:53:31,160
this is a rigorous analysis, we've tested, we've trained, we're confident, as confident

684
00:53:31,160 --> 00:53:33,120
as we can be.

685
00:53:33,120 --> 00:53:36,880
And you know, this is advanced cannabis analytics, right?

686
00:53:36,880 --> 00:53:44,080
We're taking the most rigorous statistics out there, applying it rigorously to the cannabis

687
00:53:44,080 --> 00:53:53,000
field in an open manner that anyone can participate in, anyone can contribute.

688
00:53:53,000 --> 00:53:55,400
So that's the idea here.

689
00:53:55,400 --> 00:54:07,040
My question about language is different people from different social groups use language

690
00:54:07,040 --> 00:54:17,200
differently, especially non-English language, you know, first like either non-English speakers

691
00:54:17,200 --> 00:54:21,040
or people for whom English is not their first language.

692
00:54:21,040 --> 00:54:27,840
So you raise this point, Jerry, and I don't know if it's a good or bad thing, but the

693
00:54:27,840 --> 00:54:35,520
first thing that comes to my mind is, and this is from a statistical point of view,

694
00:54:35,520 --> 00:54:44,840
a statistics professor once told me he looks for variability in his data, because if there's

695
00:54:44,840 --> 00:54:49,640
variability, you'll be able to do good statistics with it, right?

696
00:54:49,640 --> 00:54:53,200
Statistics is the science of uncertainty.

697
00:54:53,200 --> 00:54:59,240
So the more uncertainty you have, the better statistics is a tool for that job.

698
00:54:59,240 --> 00:55:03,900
If there's not much uncertainty, you can just go to math.

699
00:55:03,900 --> 00:55:11,440
So I think we could actually use that essentially to our advantage in that, okay, that may very

700
00:55:11,440 --> 00:55:17,680
well be the case, but introverts and extroverts may still use language differently.

701
00:55:17,680 --> 00:55:24,880
So maybe English is somebody's second language, but maybe that may even make them even more

702
00:55:24,880 --> 00:55:29,060
predisposed to using certain words.

703
00:55:29,060 --> 00:55:36,720
So it could even be the case that we could predict personality traits even better if

704
00:55:36,720 --> 00:55:39,080
English is somebody's second language.

705
00:55:39,080 --> 00:55:48,720
May not be the case, but I think it does add more variability, but I think potentially

706
00:55:48,720 --> 00:55:52,920
could use it to our advantage, but it's definitely a good consideration.

707
00:55:52,920 --> 00:55:57,400
Can I make a couple of points?

708
00:55:57,400 --> 00:55:59,680
And I think there's an elephant in this room.

709
00:55:59,680 --> 00:56:02,920
I mean, it's an interesting idea.

710
00:56:02,920 --> 00:56:13,640
What I might propose is, so you brought up earlier the chemotyping dichotomy or chemotyping

711
00:56:13,640 --> 00:56:19,420
classification that I'm certainly advocating that folks adopt and ride.

712
00:56:19,420 --> 00:56:27,620
So let's just say, for example, that's one of our principal discriminators.

713
00:56:27,620 --> 00:56:33,800
So the chemotype of the botanical that's reported on, let's say that's point one.

714
00:56:33,800 --> 00:56:39,720
So you want to see if you can use natural language processing or some sort of computer

715
00:56:39,720 --> 00:56:42,920
algorithm to parse reviews.

716
00:56:42,920 --> 00:56:51,400
Maybe the parameter there is how active or passively written the text is.

717
00:56:51,400 --> 00:56:58,000
And what I mean by active is I used this bud to get high as active.

718
00:56:58,000 --> 00:57:04,880
I would say this product made me sleepy as passive.

719
00:57:04,880 --> 00:57:09,800
These are pretty important semantic distinctions.

720
00:57:09,800 --> 00:57:16,480
And it may be quite easy to distinguish between an active review and a passive review.

721
00:57:16,480 --> 00:57:18,120
So you score it that way.
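
A minimal sketch of how such an active/passive score could be computed, assuming a crude keyword heuristic. The cue patterns and the function names `voice_score` and `label` are hypothetical and only illustrate the idea; they are not a validated linguistic model and not anything the speakers describe using.

```python
import re

# Hypothetical cue patterns (assumptions for illustration only).
# "Active": the reviewer is the agent, e.g. "I used this bud to get high".
ACTIVE = re.compile(
    r"\bI\s+(?:\w+\s+)?(?:use[ds]?|smoked?|took|take|tried|try|vaped?|ate|eat)\b",
    re.IGNORECASE,
)
# "Passive": the product acts on the reviewer, e.g. "This product made me sleepy".
PASSIVE = re.compile(
    r"\b(?:made|makes|make|got|gets|get|left|leaves|put|puts)\s+me\b",
    re.IGNORECASE,
)


def voice_score(review: str) -> int:
    """Count active cues minus passive cues found in the review text."""
    return len(ACTIVE.findall(review)) - len(PASSIVE.findall(review))


def label(review: str) -> str:
    """Map the score to a coarse label: active, passive, or neutral."""
    score = voice_score(review)
    if score > 0:
        return "active"
    if score < 0:
        return "passive"
    return "neutral"


print(label("I used this bud to get high"))
print(label("This product made me sleepy"))
```

The two example sentences are the ones given in the discussion; a real classifier would need far more than keyword matching.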

722
00:57:18,120 --> 00:57:23,240
That's all fine and dandy, but it hangs when you're done.

723
00:57:23,240 --> 00:57:26,880
You've got to tie it to something for it to be relevant.

724
00:57:26,880 --> 00:57:28,400
And that's the dilemma I have.

725
00:57:28,400 --> 00:57:29,400
So what?

726
00:57:29,400 --> 00:57:35,400
You come up with at the end two thirds of people like this chemotype who write active

727
00:57:35,400 --> 00:57:38,880
reviews and one third like the other ones.

728
00:57:38,880 --> 00:57:39,880
Then what?

729
00:57:39,880 --> 00:57:42,920
I mean, where do you go from there?

730
00:57:42,920 --> 00:57:46,440
That's the dilemma that I see.

731
00:57:46,440 --> 00:57:49,080
It's just a pretty thing to hang out there.

732
00:57:49,080 --> 00:57:52,000
It doesn't tie to anything.

733
00:57:52,000 --> 00:57:56,600
And so I would encourage us to be thinking about, so what do you tie this kind of analysis

734
00:57:56,600 --> 00:57:58,800
or this kind of report to?

735
00:57:58,800 --> 00:58:02,300
Well, you raised many good points.

736
00:58:02,300 --> 00:58:10,120
So to address them in reverse order, what I would tie it to is predicting r-hat.

737
00:58:10,120 --> 00:58:24,960
So when you build this model, often, coming from learning statistics at a university,

738
00:58:24,960 --> 00:58:30,920
people are often interested in estimating beta because they just want to know the relationship

739
00:58:30,920 --> 00:58:39,160
between say openness and somebody's average ratio.

740
00:58:39,160 --> 00:58:44,800
But in the business world, you're more interested in prediction.

741
00:58:44,800 --> 00:58:49,660
So in the business world, you're more interested in predicting r-hat.

742
00:58:49,660 --> 00:58:56,100
So given O, C, E, A, and N, what would be r-hat?
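
A minimal sketch of the prediction step, assuming a plain linear model in which the predicted ratio (written r-hat here) is an intercept plus one beta per trait. The coefficients and trait scores below are made-up numbers for illustration, not estimates from any data, and `predict_r_hat` is a hypothetical name.

```python
TRAITS = ("O", "C", "E", "A", "N")  # openness, conscientiousness, extraversion, agreeableness, neuroticism


def predict_r_hat(traits, betas, intercept):
    """Linear prediction: r_hat = intercept + sum over traits of beta_k * trait_k."""
    return intercept + sum(betas[k] * traits[k] for k in TRAITS)


# Made-up coefficients and trait scores (assumptions, not results).
betas = {"O": 0.5, "C": 0.0, "E": -0.25, "A": 0.0, "N": 0.0}
person = {"O": 2.0, "C": 1.0, "E": 2.0, "A": 3.0, "N": 1.0}

r_hat = predict_r_hat(person, betas, intercept=1.0)
print(r_hat)  # 1.0 + 0.5*2.0 - 0.25*2.0 = 1.5

# If every beta were zero, personality would not move the prediction at all.
null_betas = {k: 0.0 for k in TRAITS}
print(predict_r_hat(person, null_betas, intercept=1.0))  # 1.0
```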

743
00:58:56,100 --> 00:59:00,160
And so the idea is I think this could benefit a consumer.

744
00:59:00,160 --> 00:59:07,880
So most people generally know their, well actually, well I don't off the top of my head,

745
00:59:07,880 --> 00:59:10,040
so that may not be a correct statement.

746
00:59:10,040 --> 00:59:17,760
But the idea is as a consumer, you could basically take a personality test and one, you'd get

747
00:59:17,760 --> 00:59:19,040
your personality.

748
00:59:19,040 --> 00:59:24,980
And then two, you could say, oh, on average, people with your personality prefer this type

749
00:59:24,980 --> 00:59:26,620
of strain.

750
00:59:26,620 --> 00:59:34,460
So that way, maybe before you've even tried cannabis for the first time,

751
00:59:34,460 --> 00:59:42,120
you now know what people with your personality, what type of strains they may gravitate towards.

752
00:59:42,120 --> 00:59:45,120
It may be, there may be absolutely no difference.

753
00:59:45,120 --> 00:59:49,100
Maybe everybody, maybe personality has no effect.

754
00:59:49,100 --> 00:59:52,920
So remember all these betas, these could all be zero.

755
00:59:52,920 --> 00:59:55,120
So that's the idea is we want to test to see.
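
One way to test whether a beta could be zero, sketched under simplifying assumptions: a single trait, a simple slope, and a permutation test in place of whatever test the group ends up using. The function names and defaults are hypothetical.

```python
import random


def slope(x, y):
    """Ordinary least squares slope of y on a single predictor x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the null hypothesis that the slope is zero.

    Shuffling y breaks any link between trait and outcome, so the shuffled
    slopes show what "no effect" looks like for this sample.
    """
    rng = random.Random(seed)
    observed = abs(slope(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(slope(x, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

A small p-value suggests the trait's beta is unlikely to be zero; a large one is consistent with personality having no effect on the outcome.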

756
00:59:55,120 --> 01:00:01,600
I would say that people who market and produce and sell cannabis would want this information.

757
01:00:01,600 --> 01:00:09,320
And that's the idea is, oh, these psychology traits, we're saying, oh, they get incorporated

758
01:00:09,320 --> 01:00:11,200
into language.

759
01:00:11,200 --> 01:00:15,960
Well marketers are all about their marketing type.

760
01:00:15,960 --> 01:00:24,320
So it's like, oh, if you've got a strain that you think extroverts will love, maybe

761
01:00:24,320 --> 01:00:27,600
you can somehow market that to extroverts.

762
01:00:27,600 --> 01:00:32,480
Say, oh, you'll love this as a party or something like that.

763
01:00:32,480 --> 01:00:37,360
As I said, I'm just kind of stabbing there.

764
01:00:37,360 --> 01:00:42,440
Psychology is something I need to do a lot of reading up on.

765
01:00:42,440 --> 01:00:48,520
But the idea is maybe like Jerry said, maybe marketers could use this in tailoring products

766
01:00:48,520 --> 01:00:51,960
to certain personality types.

767
01:00:51,960 --> 01:01:01,600
And then as I said, maybe consumers could use it to maybe make an educated guess about

768
01:01:01,600 --> 01:01:09,480
what product they may like.

769
01:01:09,480 --> 01:01:15,120
So that's the main idea in a nutshell.

770
01:01:15,120 --> 01:01:18,520
As I said, a lot of the code is already written.

771
01:01:18,520 --> 01:01:23,040
But your first point was how do we actually assign the traits?

772
01:01:23,040 --> 01:01:28,480
And that's what I want to end up spending most of the time on Thursday talking about.

773
01:01:28,480 --> 01:01:31,920
Because that's sort of where all the fun comes.

774
01:01:31,920 --> 01:01:36,280
Because, as I said, I'm not the first one who's done this, but basically kind of like what

775
01:01:36,280 --> 01:01:44,520
John's saying is, do people use passive or active sentences, so on and so forth?

776
01:01:44,520 --> 01:01:51,440
And do those choices help predict personality?

777
01:01:51,440 --> 01:01:54,840
And then there's a couple training datasets we can use.

778
01:01:54,840 --> 01:02:01,840
So there's basically a training dataset where someone's written an essay and then they've

779
01:02:01,840 --> 01:02:04,320
also taken a personality test.

780
01:02:04,320 --> 01:02:09,520
So we can basically train our model on the training dataset

781
01:02:09,520 --> 01:02:14,080
and then we'll use it on our actual dataset.
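
The train-then-apply workflow can be sketched as follows, assuming a deliberately simple stand-in model: each word gets the mean trait score of the training essays that contain it, and a review is scored by averaging the scores of its known words. The toy essays, the scores, and the function names are all hypothetical; the actual model is not specified in this session.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train_word_scores(essays):
    """essays: iterable of (text, trait_score) pairs from a labeled training set.

    Returns a dict mapping each word to the mean score of essays containing it.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for text, score in essays:
        for word in set(tokenize(text)):
            totals[word] += score
            counts[word] += 1
    return {word: totals[word] / counts[word] for word in totals}


def predict_trait(text, word_scores, default):
    """Average the learned scores of known words; fall back to default if none."""
    known = [word_scores[w] for w in tokenize(text) if w in word_scores]
    return sum(known) / len(known) if known else default


# Toy training data with made-up extraversion scores (assumption).
essays = [
    ("party friends loud fun", 5.0),
    ("quiet book home alone", 1.0),
    ("fun party music", 4.0),
]
word_scores = train_word_scores(essays)

# Apply the trained scores to unlabeled review text.
print(predict_trait("Great for a party with friends", word_scores, default=3.0))  # 4.75
print(predict_trait("A quiet night at home", word_scores, default=3.0))  # 1.0
```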

782
01:02:14,080 --> 01:02:18,520
So that's how we'll go about using natural language processing.

783
01:02:18,520 --> 01:02:22,880
But that could almost use a whole meetup of its own.

784
01:02:22,880 --> 01:02:27,320
So instead of Saturday, it'll be next Thursday.

785
01:02:27,320 --> 01:02:30,560
And I think this one's going to be cool.

786
01:02:30,560 --> 01:02:38,080
So today presented the methodology and then next Thursday we'll actually estimate it.

787
01:02:38,080 --> 01:02:44,200
So now I think I'm just kind of just going on and I want to make sure everybody can get

788
01:02:44,200 --> 01:02:45,520
on with their day.

789
01:02:45,520 --> 01:02:50,920
So any last thoughts, comments, questions before we call it a day here?

790
01:02:50,920 --> 01:03:02,520
For NLP processing, is it difficult to get personality traits of introverted persons

791
01:03:02,520 --> 01:03:14,040
from language because they mostly don't review products or they don't participate in social

792
01:03:14,040 --> 01:03:20,400
media and I guess it's difficult to get personality of introverted persons.

793
01:03:20,400 --> 01:03:22,520
You raised an awesome point.

794
01:03:22,520 --> 01:03:31,040
And basically what you raised is, is there a systemic difference between people who are

795
01:03:31,040 --> 01:03:33,960
actually going about answering reviews?

796
01:03:33,960 --> 01:03:37,100
And this easily could be the case.

797
01:03:37,100 --> 01:03:45,880
So I talked about this where like the positivity bias, so maybe people who had a positive experience

798
01:03:45,880 --> 01:03:47,680
are more likely to leave reviews.

799
01:03:47,680 --> 01:03:53,680
And as you mentioned, maybe people with certain personality traits are more likely to leave

800
01:03:53,680 --> 01:03:55,640
reviews.

801
01:03:55,640 --> 01:04:01,360
I think that should be okay as long as our training data isn't biased.

802
01:04:01,360 --> 01:04:06,880
So our training data should be about equal.

803
01:04:06,880 --> 01:04:10,760
So actually I don't even know if that necessarily matters.

804
01:04:10,760 --> 01:04:16,840
I don't know if you need an equidistribution of observations.

805
01:04:16,840 --> 01:04:20,400
So I think it's an awesome point.

806
01:04:20,400 --> 01:04:22,600
I'll need to think about this.

807
01:04:22,600 --> 01:04:28,720
I want to think it's, I think we can work with this.

808
01:04:28,720 --> 01:04:31,480
It'll just be our observed data.

809
01:04:31,480 --> 01:04:35,160
There just may be more extroverts that we observe.

810
01:04:35,160 --> 01:04:36,660
That should be okay.

811
01:04:36,660 --> 01:04:39,000
But I think what you've raised is an excellent point.

812
01:04:39,000 --> 01:04:42,500
And it kind of hits on sampling bias.

813
01:04:42,500 --> 01:04:45,380
So this is a critical bias you've got to watch out for.
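
A small worked example of the sampling-bias concern, assuming two groups with different probabilities of leaving a review. The shares and rates are made-up numbers, and `observed_share` is a hypothetical helper.

```python
def observed_share(pop_share, rate_group, rate_other):
    """Share of reviewers who belong to a group.

    pop_share:  the group's share of the whole population (0 to 1)
    rate_group: probability that a group member leaves a review
    rate_other: probability that anyone else leaves a review
    At least one of the two rates must be positive.
    """
    from_group = pop_share * rate_group
    from_other = (1.0 - pop_share) * rate_other
    return from_group / (from_group + from_other)


# Assumed numbers: extroverts are half the population but review three times as often.
print(observed_share(0.5, 0.3, 0.1))  # about 0.75 of reviewers are extroverts

# With equal review rates, the reviewers mirror the population.
print(observed_share(0.5, 0.2, 0.2))  # about 0.5
```

Under these assumed rates, the observed reviews over-represent extroverts even though the population is evenly split.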

814
01:04:45,380 --> 01:04:47,380
So thank you.

815
01:04:47,380 --> 01:04:51,720
So thank you for bringing this point up.

816
01:04:51,720 --> 01:04:59,680
I think if one goes down this pathway and so at the end the results that you get, you

817
01:04:59,680 --> 01:05:08,560
get a predictive model or you get a frequency count or an incidence count or something.

818
01:05:08,560 --> 01:05:14,520
So after you've done all your machinations, you come up with okay two-thirds fall in this

819
01:05:14,520 --> 01:05:16,920
group, one-third fall in this group.

820
01:05:16,920 --> 01:05:17,920
So what?

821
01:05:17,920 --> 01:05:22,400
You have to, I believe, then pair this with something else.

822
01:05:22,400 --> 01:05:29,560
And that might be actually some kind of interventional study or interventional trial where you then

823
01:05:29,560 --> 01:05:35,560
have, it sets the hypothesis for the group and then you have to go out and actually do

824
01:05:35,560 --> 01:05:39,560
it with people and see how well it matches.

825
01:05:39,560 --> 01:05:40,940
Maybe that's the approach.

826
01:05:40,940 --> 01:05:47,760
So what you're building initially is a hypothesis with the motivation to then go fund and do

827
01:05:47,760 --> 01:05:50,820
some kind of interventional study.

828
01:05:50,820 --> 01:05:56,760
Maybe that's its utility, but I'm concerned if you just do it in isolation, it hangs and

829
01:05:56,760 --> 01:06:01,320
it's just another shiny pretty picture on the wall.

830
01:06:01,320 --> 01:06:04,880
And John, you raised two critical points.

831
01:06:04,880 --> 01:06:14,880
The first being: is even this beta-pinene to d-limonene ratio the key metric we should look at, or

832
01:06:14,880 --> 01:06:17,580
maybe we want to look at product type, et cetera.

833
01:06:17,580 --> 01:06:24,560
So our outcome variable we may want to think about, so maybe extroverts are more likely

834
01:06:24,560 --> 01:06:26,160
to try sodas.

835
01:06:26,160 --> 01:06:29,320
So there's many research questions we could ask there.

836
01:06:29,320 --> 01:06:34,880
The second more important point you raised is, is this essentially just a hypothesis?

837
01:06:34,880 --> 01:06:37,360
And my answer is absolutely.

838
01:06:37,360 --> 01:06:45,160
It's essentially the value we're adding here is I think we're bringing forth or formalizing,

839
01:06:45,160 --> 01:06:48,080
thankfully, it's not a novel hypothesis.

840
01:06:48,080 --> 01:06:54,000
At least we're hopefully stating it slightly more formally.

841
01:06:54,000 --> 01:06:58,680
But this is, like I said, a hypothesis that's been on my mind, so I just kind of wanted

842
01:06:58,680 --> 01:07:01,080
to formally state it.

843
01:07:01,080 --> 01:07:06,440
And then exactly, it's sort of that picture of the law of large numbers that we saw.

844
01:07:06,440 --> 01:07:14,360
We're just t equals one, so we're just showing how it could be done with this ad-hoc dataset

845
01:07:14,360 --> 01:07:18,520
that we happen to have in the public domain.

846
01:07:18,520 --> 01:07:26,920
But if you then go and do a clinical study or do a study with some of your own data,

847
01:07:26,920 --> 01:07:31,680
then that'll be t equals two, t equals three, so on and so forth.

848
01:07:31,680 --> 01:07:40,160
And so the more and more you test this, the more likely the tests are to converge on the

849
01:07:40,160 --> 01:07:41,840
actual effect.
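
A rough simulation of that convergence idea, assuming each study t returns the true effect plus independent noise. The true effect, the noise level, and the function names are illustrative assumptions, not estimates.

```python
import random


def simulate_estimates(true_effect, noise_sd, n_studies, seed=0):
    """Draw one noisy effect estimate per study, t = 1 .. n_studies."""
    rng = random.Random(seed)
    return [rng.gauss(true_effect, noise_sd) for _ in range(n_studies)]


def running_mean(values):
    """Pooled estimate after each additional study."""
    total = 0.0
    means = []
    for t, value in enumerate(values, start=1):
        total += value
        means.append(total / t)
    return means


estimates = simulate_estimates(true_effect=0.3, noise_sd=1.0, n_studies=5000)
pooled = running_mean(estimates)

# A single study (t = 1) can land far from the truth; the pooled value settles near it.
for t in (1, 10, 100, 5000):
    print(t, round(pooled[t - 1], 3))
```

The pooled estimate only converges on the true effect if the studies are unbiased, which is exactly why the sampling-bias question raised earlier matters.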

850
01:07:41,840 --> 01:07:47,440
So do I think we're going to actually measure the effect, the true effect, our first time

851
01:07:47,440 --> 01:07:50,680
we study this, the first model we estimate?

852
01:07:50,680 --> 01:07:51,760
Unlikely.

853
01:07:51,760 --> 01:07:56,760
But hopefully the idea is we can get a lot of great minds thinking about this, and then

854
01:07:56,760 --> 01:08:02,120
many, many people can estimate this model, test out this relationship, and then in the

855
01:08:02,120 --> 01:08:09,820
long run, hopefully we can figure out if there's any rhyme or reason to personality and cannabis.

856
01:08:09,820 --> 01:08:11,880
Is there anything to it?

857
01:08:11,880 --> 01:08:13,880
I think it's interesting to think about.

858
01:08:13,880 --> 01:08:17,720
So I'll let you all chew on this until next week.

859
01:08:17,720 --> 01:08:23,040
Well, we've gone over, so I'll let you get out of here and enjoy your day.

860
01:08:23,040 --> 01:08:24,040
Just want to thank you.

861
01:08:24,040 --> 01:08:25,040
Thank you all for coming.

862
01:08:25,040 --> 01:08:26,040
Very interesting again, Keegan.

863
01:08:26,040 --> 01:08:27,040
See you later.

864
01:08:27,040 --> 01:08:28,040
Good discussion.

865
01:08:28,040 --> 01:08:29,040
Thanks.

866
01:08:29,040 --> 01:08:30,040
Thank you, Keegan.

867
01:08:30,040 --> 01:08:31,040
Thank you, everyone.

868
01:08:31,040 --> 01:08:32,040
Bye now.

869
01:08:32,040 --> 01:08:37,800
Bye now.

