1
00:00:00,000 --> 00:00:12,840
We've got a bunch of brand new faces today, so I'll introduce myself and then we can do

2
00:00:12,840 --> 00:00:15,520
a round of introductions.

3
00:00:15,520 --> 00:00:17,680
So my name is Keegan.

4
00:00:17,680 --> 00:00:22,760
I've been in the cannabis space since the beginning of 2018.

5
00:00:22,760 --> 00:00:26,960
Started as a laboratory analyst at a lab that tests for cannabis.

6
00:00:26,960 --> 00:00:32,760
Then I started developing software and then I decided, you know what, it's time to just

7
00:00:32,760 --> 00:00:34,600
help everybody in the cannabis industry.

8
00:00:34,600 --> 00:00:40,640
So I started Cannlytics to principally help laboratories, but really anyone who needs

9
00:00:40,640 --> 00:00:43,000
help with data.

10
00:00:43,000 --> 00:00:49,920
Now I would love to hear about what your interests are and what brings you to the group and what

11
00:00:49,920 --> 00:00:51,600
you may hope to learn.

12
00:00:51,600 --> 00:00:54,040
So I'll just start in my top left corner.

13
00:00:54,040 --> 00:00:58,080
Jerry, would you be interested in introducing yourself to the group?

14
00:00:58,080 --> 00:00:59,080
Sure.

15
00:00:59,080 --> 00:01:04,920
I'm a marketing communications and data analytics consultant.

16
00:01:04,920 --> 00:01:09,320
Just finished a very big project and don't have anything to do.

17
00:01:09,320 --> 00:01:17,640
So I want to do something, you know, get involved in doing an analytics project just to keep

18
00:01:17,640 --> 00:01:20,020
my hand in the game.

19
00:01:20,020 --> 00:01:23,760
And I saw the meetup and I said, wow, yeah, maybe I should do something with cannabis.

20
00:01:23,760 --> 00:01:27,400
A new and growing industry is probably something very interesting.

21
00:01:27,400 --> 00:01:30,720
Awesome to have you, Jerry.

22
00:01:30,720 --> 00:01:32,400
You're in the right place for certain.

23
00:01:32,400 --> 00:01:34,680
Yeah, I hope so.

24
00:01:34,680 --> 00:01:35,680
You are.

25
00:01:35,680 --> 00:01:38,320
Youssef, would you be interested in introducing yourself to the group?

26
00:01:38,320 --> 00:01:39,320
Hey, everyone.

27
00:01:39,320 --> 00:01:40,320
My name is Youssef.

28
00:01:40,320 --> 00:01:46,160
Actually, I work right now as a cloud solution consultant.

29
00:01:46,160 --> 00:01:51,600
And I discovered the meetup yesterday and told myself, why not try to meet people eager

30
00:01:51,600 --> 00:01:56,120
to learn like me, because I'm trying to learn data science on the side,

31
00:01:56,120 --> 00:01:57,120
on my own.

32
00:01:57,120 --> 00:02:05,120
So yeah, it's a great opportunity to meet like-minded people and ask for help and advice.

33
00:02:05,120 --> 00:02:06,120
Awesome.

34
00:02:06,120 --> 00:02:10,400
Great place to learn data science.

35
00:02:10,400 --> 00:02:15,040
Patrick, would you be interested in introducing yourself, please?

36
00:02:15,040 --> 00:02:16,040
Yeah, sure.

37
00:02:16,040 --> 00:02:17,040
Thanks.

38
00:02:17,040 --> 00:02:20,280
My name is Patrick Callahan in Wilmington, Delaware.

39
00:02:20,280 --> 00:02:22,760
It's 20 minutes outside of Philadelphia.

40
00:02:22,760 --> 00:02:27,960
And we have some clients in the cannabis space and we're a data science and analytics company.

41
00:02:27,960 --> 00:02:30,000
Compass Red.

42
00:02:30,000 --> 00:02:31,000
Sorry.

43
00:02:31,000 --> 00:02:32,000
Awesome.

44
00:02:32,000 --> 00:02:33,000
Awesome.

45
00:02:33,000 --> 00:02:35,000
Delaware is on the agenda for today.

46
00:02:35,000 --> 00:02:37,480
So in fact, we're going to be talking about each and every state.

47
00:02:37,480 --> 00:02:38,480
So stay tuned.

48
00:02:38,480 --> 00:02:40,880
It's going to be an exciting day.

49
00:02:40,880 --> 00:02:45,200
Michelle, would you be interested in introducing yourself?

50
00:02:45,200 --> 00:02:47,200
Hello.

51
00:02:47,200 --> 00:02:52,680
Like some people said, I saw the meetup and I was like, this is a yes for me.

52
00:02:52,680 --> 00:02:58,520
There's a couple of companies that I follow to get information on a couple of doctors

53
00:02:58,520 --> 00:03:03,600
who have companies in other states who have been practicing medicinal use of marijuana

54
00:03:03,600 --> 00:03:05,440
for like a decade.

55
00:03:05,440 --> 00:03:07,760
So I'm basically just trying to get as much exposure as I can.

56
00:03:07,760 --> 00:03:13,680
I started my own cannabis coaching company last year to help people and teach people

57
00:03:13,680 --> 00:03:20,600
how to microdose cannabis for ADHD, anxiety, depression.

58
00:03:20,600 --> 00:03:23,560
I'm autistic, so it helps with that.

59
00:03:23,560 --> 00:03:27,040
Also, trauma, yada, yada, yada.

60
00:03:27,040 --> 00:03:33,680
And I created my own dosing system and it's an oral delivery and very easy to dose.

61
00:03:33,680 --> 00:03:35,480
It's not by milligrams, it's by ounces.

62
00:03:35,480 --> 00:03:37,280
So it's by volume.

63
00:03:37,280 --> 00:03:44,400
But anyway, it's helping people be able to microdose it without having to deal with being

64
00:03:44,400 --> 00:03:45,680
like affected and stuff.

65
00:03:45,680 --> 00:03:50,320
So I'm filling in gaps with science in this way also.

66
00:03:50,320 --> 00:03:51,760
Awesome.

67
00:03:51,760 --> 00:03:55,160
Your expertise is going to come in real handy in some future meetups.

68
00:03:55,160 --> 00:04:00,720
So we're going to be diving a lot into the production and consumption of cannabis here

69
00:04:00,720 --> 00:04:01,720
in 2022.

70
00:04:01,720 --> 00:04:05,500
So today we'll be wrapping up a sales forecast.

71
00:04:05,500 --> 00:04:10,600
So in future meetups, as well as today, your expertise will be valuable.

72
00:04:10,600 --> 00:04:12,600
Happy to help.

73
00:04:12,600 --> 00:04:13,600
Awesome.

74
00:04:13,600 --> 00:04:18,880
My board changed a little bit here, but Devon, would you be interested in introducing yourself?

75
00:04:18,880 --> 00:04:19,880
Hi.

76
00:04:19,880 --> 00:04:24,000
Yes, just working on my camera now.

77
00:04:24,000 --> 00:04:25,000
Devon Moftano.

78
00:04:25,000 --> 00:04:29,120
I'm working in Wilmington, Delaware at Compass Red.

79
00:04:29,120 --> 00:04:30,120
Awesome.

80
00:04:30,120 --> 00:04:36,920
Raul, would you be interested in introducing yourself, please?

81
00:04:36,920 --> 00:04:37,920
Yes.

82
00:04:37,920 --> 00:04:38,920
Hi.

83
00:04:38,920 --> 00:04:39,920
Good morning.

84
00:04:39,920 --> 00:04:40,920
My name is Raul Infante.

85
00:04:40,920 --> 00:04:50,320
I am a medical marijuana patient in Connecticut, and I am also a grower, as in Connecticut,

86
00:04:50,320 --> 00:04:53,540
ever since October, we've been able to grow our own cannabis.

87
00:04:53,540 --> 00:05:00,400
I have volunteered and I have been an employee at multiple hemp farms ever since the year

88
00:05:00,400 --> 00:05:05,280
2019, helping them with the harvest and helping them with the seeding and with the taking

89
00:05:05,280 --> 00:05:07,980
care of the plant through the vegetative stage.

90
00:05:07,980 --> 00:05:13,480
I got a few grows under my belt, even though the patient program just started in October.

91
00:05:13,480 --> 00:05:16,040
I've got a few grows.

92
00:05:16,040 --> 00:05:21,520
I've been growing since about 2017 in Massachusetts with my cousin for himself.

93
00:05:21,520 --> 00:05:26,400
I've shown and helped him how to grow.

94
00:05:26,400 --> 00:05:36,120
Right now, I am currently the head intern at a farm in Danbury called Seamarron Farmstead,

95
00:05:36,120 --> 00:05:40,100
S-E-A-M-A-R-R-O-N Farmstead.

96
00:05:40,100 --> 00:05:45,040
We will be growing hemp this upcoming spring season.

97
00:05:45,040 --> 00:05:47,440
That's just a little bit of background of where I'm coming from.

98
00:05:47,440 --> 00:05:52,920
I'll be starting horticulture classes in June in one of our local community colleges here

99
00:05:52,920 --> 00:05:55,600
in Southern Connecticut.

100
00:05:55,600 --> 00:05:56,600
That's just a little bit of background.

101
00:05:56,600 --> 00:05:57,600
I'm a father.

102
00:05:57,600 --> 00:05:58,600
I have a four-year-old.

103
00:05:58,600 --> 00:06:05,320
Yeah, things are pretty high energy at my crib, at the house.

104
00:06:05,320 --> 00:06:08,760
Thank you for having me and I appreciate you very much.

105
00:06:08,760 --> 00:06:09,760
Thank you for joining us.

106
00:06:09,760 --> 00:06:12,000
Love to have your expertise as well.

107
00:06:12,000 --> 00:06:14,000
We have a lot of data scientists.

108
00:06:14,000 --> 00:06:17,680
Sorry, I didn't mean to interrupt, sir.

109
00:06:17,680 --> 00:06:21,040
I'm probably just going to have you in my ear because I'm at work right now and I just

110
00:06:21,040 --> 00:06:23,080
wanted to be a part of it.

111
00:06:23,080 --> 00:06:24,080
Definitely.

112
00:06:24,080 --> 00:06:25,080
Definitely.

113
00:06:25,080 --> 00:06:27,840
You can listen in and then watch the recording later.

114
00:06:27,840 --> 00:06:33,720
Always happy to have your expertise too because it's so important to hear from the producers

115
00:06:33,720 --> 00:06:38,920
and processors and retailers and everybody in the industry.

116
00:06:38,920 --> 00:06:42,720
Really interested about the Northeast because I don't know as much about the industry.

117
00:06:42,720 --> 00:06:46,040
What part of the country are you in?

118
00:06:46,040 --> 00:06:47,040
California, sir?

119
00:06:47,040 --> 00:06:49,080
What's your name?

120
00:06:49,080 --> 00:06:53,040
Well, Cannlytics is based out of Olympia, Washington.

121
00:06:53,040 --> 00:06:55,880
Right now I'm visiting family in North Carolina.

122
00:06:55,880 --> 00:07:02,960
So worked with people all over the country, worked with a couple of people in Massachusetts,

123
00:07:02,960 --> 00:07:05,640
but I only know a cursory amount.

124
00:07:05,640 --> 00:07:07,480
So always happy to learn myself.

125
00:07:07,480 --> 00:07:08,480
Yeah, me too.

126
00:07:08,480 --> 00:07:09,480
Always.

127
00:07:09,480 --> 00:07:10,480
Me too.

128
00:07:10,480 --> 00:07:11,480
What's your name, sir?

129
00:07:11,480 --> 00:07:12,480
What can I call you?

130
00:07:12,480 --> 00:07:13,480
Cannlytics.

131
00:07:13,480 --> 00:07:14,480
What was that?

132
00:07:14,480 --> 00:07:15,480
Exactly.

133
00:07:15,480 --> 00:07:20,000
And I got into this space helping out laboratories and my background's in data science.

134
00:07:20,000 --> 00:07:22,680
So just trying to combine the two.

135
00:07:22,680 --> 00:07:23,680
What's your name?

136
00:07:23,680 --> 00:07:24,680
Keegan.

137
00:07:24,680 --> 00:07:25,680
Keegan?

138
00:07:25,680 --> 00:07:26,680
Yes.

139
00:07:26,680 --> 00:07:27,680
Keegan?

140
00:07:27,680 --> 00:07:28,680
All right, Keegan.

141
00:07:28,680 --> 00:07:29,680
Nice to meet you, sir.

142
00:07:29,680 --> 00:07:30,680
Good to meet you, Raul.

143
00:07:30,680 --> 00:07:31,680
I'm gonna be on mute.

144
00:07:31,680 --> 00:07:32,680
All right.

145
00:07:32,680 --> 00:07:33,680
Okay.

146
00:07:33,680 --> 00:07:34,680
Thank you.

147
00:07:34,680 --> 00:07:36,680
David, would you be introduced?

148
00:07:36,680 --> 00:07:39,600
I need to think of something else to say.

149
00:07:39,600 --> 00:07:41,600
David, would you mind introducing yourself?

150
00:07:41,600 --> 00:07:42,600
Sure.

151
00:07:42,600 --> 00:07:44,760
So hello, everybody from Denver.

152
00:07:44,760 --> 00:07:51,760
So I work for the state of Colorado in the Marijuana Enforcement Division, and I am a data scientist.

153
00:07:51,760 --> 00:07:54,680
So I'm happy to be here.

154
00:07:54,680 --> 00:07:58,200
Awesome to have you here, especially from the state of Colorado.

155
00:07:58,200 --> 00:08:01,560
So love that you're joining.

156
00:08:01,560 --> 00:08:07,600
So hopefully you can feel free to share and then maybe you could learn a thing or two.

157
00:08:07,600 --> 00:08:09,600
So it's awesome to have you.

158
00:08:09,600 --> 00:08:12,280
So we're just here to learn from each other.

159
00:08:12,280 --> 00:08:13,280
Thanks.

160
00:08:13,280 --> 00:08:14,280
Awesome.

161
00:08:14,280 --> 00:08:16,680
So got four other people here.

162
00:08:16,680 --> 00:08:19,280
There's Kelvin, it looks like.

163
00:08:19,280 --> 00:08:22,280
Would you mind introducing yourself?

164
00:08:22,280 --> 00:08:28,680
Okay, let's see.

165
00:08:28,680 --> 00:08:36,600
Okay, we may come back to you, Kelvin.

166
00:08:36,600 --> 00:08:40,520
Matthew, would you mind introducing yourself?

167
00:08:40,520 --> 00:08:41,520
Yes, absolutely.

168
00:08:41,520 --> 00:08:44,520
So I'm a software developer.

169
00:08:44,520 --> 00:08:48,240
I'm actually one of the people also.

170
00:08:48,240 --> 00:08:54,520
Looks like we're about half the group here, which is kind of cool.

171
00:08:54,520 --> 00:08:57,240
But yeah, just this is kind of a unique meetup.

172
00:08:57,240 --> 00:09:01,280
And so we wanted to be here since we heard about it.

173
00:09:01,280 --> 00:09:03,280
So thanks for putting this on.

174
00:09:03,280 --> 00:09:04,520
I appreciate it.

175
00:09:04,520 --> 00:09:05,520
Thanks for joining.

176
00:09:05,520 --> 00:09:10,960
And I always say to everybody, there's a shortage of data scientists and software developers.

177
00:09:10,960 --> 00:09:15,120
So your value you can bring will go a long way.

178
00:09:15,120 --> 00:09:16,120
Thank you.

179
00:09:16,120 --> 00:09:19,120
So we've got Michelle.

180
00:09:19,120 --> 00:09:21,120
Let's see.

181
00:09:21,120 --> 00:09:24,040
Ben, would you be introduced?

182
00:09:24,040 --> 00:09:25,920
Would you like to introduce yourself?

183
00:09:25,920 --> 00:09:26,920
Yeah.

184
00:09:26,920 --> 00:09:27,920
Hi.

185
00:09:27,920 --> 00:09:35,160
Also from the Compass Red team here in Philadelphia, data analyst and product developer.

186
00:09:35,160 --> 00:09:41,560
On the team, like was mentioned, we have some clients in the cannabis space.

187
00:09:41,560 --> 00:09:47,280
Our team primarily works in R and R Shiny, as well as all of the dashboarding applications

188
00:09:47,280 --> 00:09:49,920
under the sun.

189
00:09:49,920 --> 00:09:57,720
The focus of my work has primarily been in data visualization in the R world as well

190
00:09:57,720 --> 00:09:58,720
as D3.

191
00:09:58,720 --> 00:09:59,720
So happy to be here.

192
00:09:59,720 --> 00:10:00,720
Awesome to have you.

193
00:10:00,720 --> 00:10:05,480
And we've had a lot of people who've worked with R in the past, and I always think it's

194
00:10:05,480 --> 00:10:10,280
just that programming languages are a tool, a means to an end.

195
00:10:10,280 --> 00:10:13,040
So I do a lot of work in Python.

196
00:10:13,040 --> 00:10:15,680
So I've got a bit of code to show you today in Python.

197
00:10:15,680 --> 00:10:21,040
However, it'd be awesome if you were ambitious enough to rewrite it in R. So that would go

198
00:10:21,040 --> 00:10:24,040
a long way too.

199
00:10:24,040 --> 00:10:26,880
All right.

200
00:10:26,880 --> 00:10:32,440
I may have butchered the pronunciation of that.

201
00:10:32,440 --> 00:10:36,680
Would you mind introducing yourself if you would like?

202
00:10:36,680 --> 00:10:40,080
OK.

203
00:10:40,080 --> 00:10:47,600
So like D or Kelvin, you're welcome to speak up at any point if you want to introduce yourself.

204
00:10:47,600 --> 00:10:48,600
You don't have to.

205
00:10:48,600 --> 00:10:51,320
So options always there on the team.

206
00:10:51,320 --> 00:10:52,520
All right.

207
00:10:52,520 --> 00:10:57,360
Well, for those of you who are new to the group, I don't really like to steal the show.

208
00:10:57,360 --> 00:10:59,360
It just ends up happening.

209
00:10:59,360 --> 00:11:02,800
It is a meetup here after all.

210
00:11:02,800 --> 00:11:08,640
But long story short, I usually prepare a script and a short presentation just to sort

211
00:11:08,640 --> 00:11:11,120
of give the group a direction.

212
00:11:11,120 --> 00:11:14,120
And then usually that takes around 30 minutes or so.

213
00:11:14,120 --> 00:11:17,720
And then we have a 15-minute or so discussion at the end.

214
00:11:17,720 --> 00:11:20,660
However, we've got a huge group today.

215
00:11:20,660 --> 00:11:26,280
So if you want to chime in at any point, just start talking up.

216
00:11:26,280 --> 00:11:31,040
And we can start a dialogue at any point.

217
00:11:31,040 --> 00:11:39,040
But without further ado, I'll just go ahead and share my screen just so you can follow

218
00:11:39,040 --> 00:11:40,660
along.

219
00:11:40,660 --> 00:11:46,760
So for those of you who are new, everything's available on GitHub.

220
00:11:46,760 --> 00:11:51,240
So you can check out all the work we've done in 2021.

221
00:11:51,240 --> 00:11:55,560
And then I try to give more than 20 minutes.

222
00:11:55,560 --> 00:12:04,100
But I try to get the script for the day uploaded beforehand so that way you can follow along.

223
00:12:04,100 --> 00:12:13,280
And then just put together just a short presentation just to start talking about the subject for

224
00:12:13,280 --> 00:12:14,280
the day.

225
00:12:14,280 --> 00:12:20,200
And so last week, we started talking about, OK, what is data science and where did it

226
00:12:20,200 --> 00:12:22,280
come from?

227
00:12:22,280 --> 00:12:27,240
And I said, OK, it's a mixture of statistics and computer science.

228
00:12:27,240 --> 00:12:29,400
And my background is in economics.

229
00:12:29,400 --> 00:12:32,720
So I like to splash in a little bit of economics.

230
00:12:32,720 --> 00:12:35,800
But really, it just picks from a lot of different fields.

231
00:12:35,800 --> 00:12:37,800
So I did a little research.

232
00:12:37,800 --> 00:12:46,400
And it looks like, OK, data science has been around 40 plus years now.

233
00:12:46,400 --> 00:12:54,160
So we've got Peter Naur, who didn't like the term computer science.

234
00:12:54,160 --> 00:12:58,300
And he thought it would be better called data science.

235
00:12:58,300 --> 00:12:59,420
So that's interesting.

236
00:12:59,420 --> 00:13:07,760
And then you've got C. F. Jeff Wu, who is an engineering statistician at the Georgia Institute

237
00:13:07,760 --> 00:13:08,760
of Technology.

238
00:13:08,760 --> 00:13:16,760
And he was the first one to use data science as an alternative name for statistics.

239
00:13:16,760 --> 00:13:20,360
So just a little bit of history.

240
00:13:20,360 --> 00:13:27,160
It's always kind of interesting to know where these things came about from.

241
00:13:27,160 --> 00:13:36,760
Latanya Sweeney, she was the chief technologist at the Federal Trade Commission, I believe.

242
00:13:36,760 --> 00:13:37,760
For a period.

243
00:13:37,760 --> 00:13:41,760
And I believe she's currently a Harvard professor.

244
00:13:41,760 --> 00:13:48,760
So she's probably forgotten more than I know.

245
00:13:48,760 --> 00:13:50,760
And she's just an expert here.

246
00:13:50,760 --> 00:13:55,760
And so she said, OK, there's three trends to watch for in data science.

247
00:13:55,760 --> 00:13:58,760
One, we're just going to be collecting more data.

248
00:13:58,760 --> 00:14:06,480
Two, the data is going to be getting more granular, to the point where it's sort of

249
00:14:06,480 --> 00:14:09,560
like, collect it if you can.

250
00:14:09,560 --> 00:14:12,160
The data is getting so granular.

251
00:14:12,160 --> 00:14:20,560
And there's so much of it that we've now reached what some are calling the data deluge, which

252
00:14:20,560 --> 00:14:25,920
is there's just, I like to call it a fire hose of data.

253
00:14:25,920 --> 00:14:31,560
So think of a fire hydrant that's bust a cap, and it's just spraying out data.

254
00:14:31,560 --> 00:14:35,240
And so first things first, you have to wrangle it.

255
00:14:35,240 --> 00:14:37,880
And then collect it.

256
00:14:37,880 --> 00:14:40,960
And then think about what you're going to do with it.

257
00:14:40,960 --> 00:14:44,520
So essentially, you need to design your data.

258
00:14:44,520 --> 00:14:46,240
And you need to collect it.

259
00:14:46,240 --> 00:14:49,960
Then you can't forget about analyzing your data.

260
00:14:49,960 --> 00:14:54,560
So there's a little bit of data science for today.

261
00:14:54,560 --> 00:15:01,120
And now what we're all here for is about the cannabis industry.

262
00:15:01,120 --> 00:15:08,360
So as a good Bayesian, I thought I would ask some of you what your priors may be.

263
00:15:08,360 --> 00:15:15,040
So here are three topics that we'll be talking about today.

264
00:15:15,040 --> 00:15:19,380
And we can make forecasts and check our forecasts.

265
00:15:19,380 --> 00:15:27,500
But before we begin, does anybody have any expectations of what states may permit medicinal

266
00:15:27,500 --> 00:15:30,720
or adult use in 2022?

267
00:15:30,720 --> 00:15:38,680
And what you may expect sales to be in, say, New York or just in the US in general?

268
00:15:38,680 --> 00:15:39,680
Anyone?

269
00:15:39,680 --> 00:15:41,680
Feel free to...

270
00:15:41,680 --> 00:15:45,560
I'll speak on New York.

271
00:15:45,560 --> 00:15:46,560
I'm in Brooklyn.

272
00:15:46,560 --> 00:15:47,560
Yes.

273
00:15:47,560 --> 00:15:49,240
I'm from Western New York.

274
00:15:49,240 --> 00:15:51,840
So I'm just in this area in general.

275
00:15:51,840 --> 00:15:55,440
And it obviously recently became legal here.

276
00:15:55,440 --> 00:16:04,320
I've seen some head shops go up and come down really quickly, like block by block.

277
00:16:04,320 --> 00:16:05,320
There was so many.

278
00:16:05,320 --> 00:16:06,760
But you needed a card.

279
00:16:06,760 --> 00:16:10,800
So now that you don't need a card, it's a little different.

280
00:16:10,800 --> 00:16:16,920
But every single deli here sells in some form.

281
00:16:16,920 --> 00:16:21,500
Like liquid, like there's beverages.

282
00:16:21,500 --> 00:16:23,920
You can literally get it on every single corner.

283
00:16:23,920 --> 00:16:25,760
And it's not like weed for sale here.

284
00:16:25,760 --> 00:16:30,120
It's like we have this, but we call it this.

285
00:16:30,120 --> 00:16:34,960
But you can literally buy weed from any single deli that sells it.

286
00:16:34,960 --> 00:16:36,520
And then there's the other things.

287
00:16:36,520 --> 00:16:42,920
I don't know if people have heard of like kratom and all the other plants that are alternatives

288
00:16:42,920 --> 00:16:43,920
to marijuana.

289
00:16:43,920 --> 00:16:45,360
So there's a lot of that going on.

290
00:16:45,360 --> 00:16:48,760
So I feel like for me, it's really chaotic.

291
00:16:48,760 --> 00:16:52,680
And it's not something I pay too much attention to because my focus is more...

292
00:16:52,680 --> 00:16:54,160
My target is...

293
00:16:54,160 --> 00:16:57,680
I also have a marketing background and I'm a hairstylist of 22 years.

294
00:16:57,680 --> 00:17:00,280
So I have a chemistry brain in that field.

295
00:17:00,280 --> 00:17:12,440
But my target is mostly middle-aged women, two or three kids deep, or stoners.

296
00:17:12,440 --> 00:17:15,520
There's like no in between.

297
00:17:15,520 --> 00:17:23,520
But for me here, it's very chaotic because there's a lot of dealers here that are half

298
00:17:23,520 --> 00:17:24,840
legal and half not legal.

299
00:17:24,840 --> 00:17:28,120
So they'll do events out in the streets and they won't need any permits because they can

300
00:17:28,120 --> 00:17:30,400
just go before they get in trouble.

301
00:17:30,400 --> 00:17:33,400
But everyone loves it and nobody cares.

302
00:17:33,400 --> 00:17:34,400
Cops don't care about it.

303
00:17:34,400 --> 00:17:35,760
It's just a very bizarre vibe.

304
00:17:35,760 --> 00:17:41,080
So I can't even answer any of these questions because it still feels like chaos.

305
00:17:41,080 --> 00:17:45,160
And I feel like it's going to be like that here, at least in New York City for a while,

306
00:17:45,160 --> 00:17:46,160
like years.

307
00:17:46,160 --> 00:17:47,160
Yes.

308
00:17:47,160 --> 00:17:54,800
I'm also wondering, New York State just passed a deadline for individual municipalities

309
00:17:54,800 --> 00:18:05,560
to opt in or opt out for allowing retail sales and consumption venues for recreational marijuana.

310
00:18:05,560 --> 00:18:07,120
And I don't know the results on that yet.

311
00:18:07,120 --> 00:18:08,120
That would be interesting.

312
00:18:08,120 --> 00:18:09,840
I know my...

313
00:18:09,840 --> 00:18:12,840
I live in a village within a town.

314
00:18:12,840 --> 00:18:14,080
The village voted yes.

315
00:18:14,080 --> 00:18:15,320
The town voted no.

316
00:18:15,320 --> 00:18:17,760
They voted to opt out.

317
00:18:17,760 --> 00:18:19,800
So talking about chaos.

318
00:18:19,800 --> 00:18:26,680
Then there's also the question, I had a client a while ago who makes non-consumable hemp

319
00:18:26,680 --> 00:18:27,680
products.

320
00:18:27,680 --> 00:18:31,040
He has a cannabis farm up here.

321
00:18:31,040 --> 00:18:35,920
So I'm just wondering how that relates because there are so many products that you can get

322
00:18:35,920 --> 00:18:41,600
from hemp that are not psychotropic.

323
00:18:41,600 --> 00:18:42,600
Is that the right word?

324
00:18:42,600 --> 00:18:45,000
I think so.

325
00:18:45,000 --> 00:18:46,000
Yes.

326
00:18:46,000 --> 00:18:53,520
Well, after listening to you two, I may have to update my priors.

327
00:18:53,520 --> 00:18:55,080
That's real informative.

328
00:18:55,080 --> 00:18:58,400
So it's going to be interesting to see...

329
00:18:58,400 --> 00:19:01,480
Let's just make sure this...

330
00:19:01,480 --> 00:19:12,520
It's going to be interesting to see, one, if the gray market in New York becomes legalized,

331
00:19:12,520 --> 00:19:15,840
by the end of 2022.

332
00:19:15,840 --> 00:19:25,480
And then as far as the towns or locales opting out, from the research I've done, the demand

333
00:19:25,480 --> 00:19:29,760
for cannabis appears to be fairly inelastic.

334
00:19:29,760 --> 00:19:35,120
So that would suggest that people will just drive to the next town over and just bear

335
00:19:35,120 --> 00:19:36,120
the cost.

336
00:19:36,120 --> 00:19:39,640
But only so many people will do that.

337
00:19:39,640 --> 00:19:44,120
And it still may curtail consumption a little bit.

338
00:19:44,120 --> 00:19:50,000
So that may curtail sales to a certain extent.

339
00:19:50,000 --> 00:19:53,640
Hard to say.

340
00:19:53,640 --> 00:19:55,140
Awesome predictions.

341
00:19:55,140 --> 00:20:01,280
So for me, my Bayesian prior was, I'm not certain...

342
00:20:01,280 --> 00:20:05,920
Just to be honest, I wasn't certain New York was going to get online in 2022.

343
00:20:05,920 --> 00:20:11,960
But after hearing from you two, it sounds like maybe the ball is moving a bit quicker

344
00:20:11,960 --> 00:20:14,840
than I anticipated there.

345
00:20:14,840 --> 00:20:28,920
And then for the US sales, I just had really no idea what a good prior would be.

346
00:20:28,920 --> 00:20:34,640
So not the best prior, but basically if I had to give one, I would say between like

347
00:20:34,640 --> 00:20:38,680
15 and 60 billion.

348
00:20:38,680 --> 00:20:43,440
But we'll get to that in just one second.

349
00:20:43,440 --> 00:20:48,520
But some states to look out for.

350
00:20:48,520 --> 00:20:54,280
So just want to let you know that I got this from the Cannabis Business Times.

351
00:20:54,280 --> 00:21:00,840
They published an article recently, 15 states that could legalize in 2022.

352
00:21:00,840 --> 00:21:02,560
And here they are.

353
00:21:02,560 --> 00:21:07,800
There could be some wild cards that just come that you don't expect.

354
00:21:07,800 --> 00:21:09,480
That's happened before.

355
00:21:09,480 --> 00:21:13,360
But these are the ones that they said to watch out for.

356
00:21:13,360 --> 00:21:18,520
So of course, states like New York, they've already permitted adult use.

357
00:21:18,520 --> 00:21:24,720
We're just waiting for the rollout, similarly in New Jersey, and so on and so forth.

358
00:21:24,720 --> 00:21:30,480
But just starting from the top, Delaware, Oklahoma, Mississippi, Maryland.

359
00:21:30,480 --> 00:21:35,880
I think maybe one of the more likely ones.

360
00:21:35,880 --> 00:21:36,880
Same for Oklahoma.

361
00:21:36,880 --> 00:21:44,720
Oklahoma has such a permissive market that it's essentially already adult use.

362
00:21:44,720 --> 00:21:47,280
You still technically have to be a patient.

363
00:21:47,280 --> 00:21:53,120
So I wouldn't be surprised if they formally allow adult use in Oklahoma.

364
00:21:53,120 --> 00:21:58,360
Ohio, things are trying to move along there with signatures.

365
00:21:58,360 --> 00:22:00,320
Same with Wyoming.

366
00:22:00,320 --> 00:22:03,280
There's efforts in Pennsylvania.

367
00:22:03,280 --> 00:22:08,520
Rhode Island looks like another one that I would say is a bit more likely.

368
00:22:08,520 --> 00:22:11,840
So I think they're waiting on a special session there.

369
00:22:11,840 --> 00:22:18,200
But basically, I'm trying to think about Rhode Island's geography.

370
00:22:18,200 --> 00:22:24,040
But you've got states nearby that are all allowing adult use.

371
00:22:24,040 --> 00:22:28,360
So right in that northeast, you've got Massachusetts.

372
00:22:28,360 --> 00:22:32,760
Connecticut is moving the ball forward, if it hasn't already.

373
00:22:32,760 --> 00:22:35,000
Then of course, New York and New Jersey.

374
00:22:35,000 --> 00:22:37,960
So I think that'll put pressure on Rhode Island.

375
00:22:37,960 --> 00:22:41,440
And so I wouldn't be surprised if it moves quickly.

376
00:22:41,440 --> 00:22:45,160
Nebraska may allow medicinal.

377
00:22:45,160 --> 00:22:51,440
Arkansas is working on adult use, or advocates are working on it.

378
00:22:51,440 --> 00:22:59,520
Florida, people are trying to get adult use onto the 2022 ballot.

379
00:22:59,520 --> 00:23:03,120
Same for Missouri.

380
00:23:03,120 --> 00:23:05,320
Just going to allow some people in.

381
00:23:05,320 --> 00:23:08,840
And all of the newcomers, I'll try to give you a chance to introduce yourselves

382
00:23:08,840 --> 00:23:12,480
and talk here at the end.

383
00:23:12,480 --> 00:23:15,080
North Carolina, I was surprised about this.

384
00:23:15,080 --> 00:23:18,880
But there's some efforts there for medicinal cannabis.

385
00:23:18,880 --> 00:23:22,800
But moving quite slowly, that would be expected.

386
00:23:22,800 --> 00:23:25,800
And then South Dakota, an interesting one.

387
00:23:25,800 --> 00:23:29,200
They're trying to get adult use on the 2022 ballot.

388
00:23:29,200 --> 00:23:32,440
And Idaho.

389
00:23:32,440 --> 00:23:35,840
Idaho is an interesting one because you've

390
00:23:35,840 --> 00:23:39,840
got some neighbors like Washington and Oregon

391
00:23:39,840 --> 00:23:44,280
who have had adult use cannabis for a long time now.

392
00:23:44,280 --> 00:23:49,360
And so Idaho is thinking about getting medicinal.

393
00:23:49,360 --> 00:23:52,320
So these are some states that are coming along.

394
00:23:52,320 --> 00:24:01,640
And the reason why these matter is I saw CNBC put out a forecast for,

395
00:24:01,640 --> 00:24:02,760
they beat me to it.

396
00:24:02,760 --> 00:24:07,400
So they put out a forecast for 2022.

397
00:24:07,400 --> 00:24:12,280
And so as a Bayesian, I have to incorporate this into my prior.

398
00:24:12,280 --> 00:24:15,920
So I would have liked to have made my forecast before seeing this.

399
00:24:15,920 --> 00:24:19,040
But oh, well, I saw these numbers.

400
00:24:19,040 --> 00:24:23,600
So CNBC is predicting total sales in the US

401
00:24:23,600 --> 00:24:27,800
is going to be $31 billion, up 28%.

402
00:24:27,800 --> 00:24:30,640
And that large growth is probably going

403
00:24:30,640 --> 00:24:33,680
to be from potentially new states coming online or things

404
00:24:33,680 --> 00:24:36,160
quickening in certain states.

405
00:24:36,160 --> 00:24:40,960
And they're estimating, it was hard to tell from the audio clip,

406
00:24:40,960 --> 00:24:43,680
but it sounded like they were estimating New York

407
00:24:43,680 --> 00:24:47,320
to hit $4 billion in 2022.

408
00:24:47,320 --> 00:24:50,640
Personally, I don't know how realistic that would be,

409
00:24:50,640 --> 00:24:53,280
but never rule anything out.

410
00:24:53,280 --> 00:24:57,160
So New York is a big populous state.

411
00:24:57,160 --> 00:25:01,280
So it may be possible.

412
00:25:01,280 --> 00:25:04,120
Question, comment?

413
00:25:04,120 --> 00:25:07,120
New York is eager to make money off of it for sure.

414
00:25:07,120 --> 00:25:09,160
So it doesn't surprise me the speed at which they

415
00:25:09,160 --> 00:25:11,160
think it's going to grow, because especially here

416
00:25:11,160 --> 00:25:14,200
in New York City, one of the other things that I was going to say

417
00:25:14,200 --> 00:25:15,840
was there's a lot of fraud here.

418
00:25:15,840 --> 00:25:17,640
So when CBD became a thing, and you

419
00:25:17,640 --> 00:25:20,760
could get that at every deli, then there

420
00:25:20,760 --> 00:25:23,440
was a realization about how much of that product

421
00:25:23,440 --> 00:25:27,800
was just being sold willy nilly, and it's not right.

422
00:25:27,800 --> 00:25:29,360
So I feel like the same thing is going

423
00:25:29,360 --> 00:25:31,920
to happen with cannabis, especially with carts

424
00:25:31,920 --> 00:25:34,480
and stuff like that.

425
00:25:34,480 --> 00:25:37,320
So I think that New York doesn't care.

426
00:25:37,320 --> 00:25:38,960
They're just interested in the revenue.

427
00:25:38,960 --> 00:25:40,720
So the faster everything can roll out.

428
00:25:40,720 --> 00:25:44,440
So maybe that's why the prediction is so large.

429
00:25:44,440 --> 00:25:45,360
You think?

430
00:25:45,360 --> 00:25:48,520
Yes, and like you said, it sounds like people are already

431
00:25:48,520 --> 00:25:49,320
set up.

432
00:25:49,320 --> 00:25:51,920
So the infrastructure may be in place.

433
00:25:51,920 --> 00:25:55,680
So once they permit it, it could go quickly.

434
00:25:55,680 --> 00:25:58,520
And I've said this in prior meetups,

435
00:25:58,520 --> 00:26:03,280
but I'd like to say states, there's always

436
00:26:03,280 --> 00:26:04,360
an opportunity choice.

437
00:26:04,360 --> 00:26:07,280
When you choose to do something or not to do something,

438
00:26:07,280 --> 00:26:08,880
there's an opportunity cost.

439
00:26:08,880 --> 00:26:12,480
And so what I basically am saying is like, hey, states,

440
00:26:12,480 --> 00:26:15,200
if you're not permitting this, then like you said,

441
00:26:15,200 --> 00:26:18,000
there is this, it's not even gray,

442
00:26:18,000 --> 00:26:23,320
it's a black market going on that you're not taxing.

443
00:26:23,320 --> 00:26:26,040
And so the tax revenue could potentially

444
00:26:26,040 --> 00:26:29,640
go to socially beneficial things.

445
00:26:29,640 --> 00:26:32,640
I always use the example of building schools

446
00:26:32,640 --> 00:26:36,280
because you can put a price tag on that,

447
00:26:36,280 --> 00:26:39,240
and you can say, OK, how many schools can we build?

448
00:26:39,240 --> 00:26:41,920
And I say, OK, that's your opportunity cost.

449
00:26:41,920 --> 00:26:45,040
You can either let a lot of money.

450
00:26:45,040 --> 00:26:47,240
So we've been doing the math, and it's

451
00:26:47,240 --> 00:26:50,760
hundreds of millions, if not billions of dollars flow

452
00:26:50,760 --> 00:26:53,680
into the hands of who knows who.

453
00:26:53,680 --> 00:26:57,200
Or you could build dozens of new schools.

454
00:26:57,200 --> 00:27:00,040
So the choice is yours.

455
00:27:00,040 --> 00:27:02,120
But now you know my perspective.

456
00:27:02,120 --> 00:27:11,160
But anywho, so those are the sales predictions for 2022.

457
00:27:11,160 --> 00:27:14,840
And what I always tell people here in the Meetup group

458
00:27:14,840 --> 00:27:21,600
is always take numbers on these data points at face value.

459
00:27:21,600 --> 00:27:26,160
So CNBC has put forward their prediction, 31 billion,

460
00:27:26,160 --> 00:27:27,560
4 billion.

461
00:27:27,560 --> 00:27:30,200
But it would be nice to see, OK, what

462
00:27:30,200 --> 00:27:32,400
was their underlying data?

463
00:27:32,400 --> 00:27:35,160
What was their forecasting model?

464
00:27:35,160 --> 00:27:37,120
How did they come to these conclusions?

465
00:27:37,120 --> 00:27:40,080
What assumptions did they make along the way?

466
00:27:40,080 --> 00:27:42,920
And so that's what we do here at the Cannabis Data Science

467
00:27:42,920 --> 00:27:44,400
Meetup group.

468
00:27:44,400 --> 00:27:47,120
We're going to do an open source forecast.

469
00:27:47,120 --> 00:27:49,520
So open source, open data.

470
00:27:49,520 --> 00:27:51,480
So that way, all of you can check it.

471
00:27:51,480 --> 00:27:52,800
We can all improve it.

472
00:27:52,800 --> 00:27:57,880
And then we can apply the whole open source mentality,

473
00:27:57,880 --> 00:28:00,160
where we can improve upon each other.

474
00:28:00,160 --> 00:28:04,200
So what I'm putting forth is not even a good.

475
00:28:04,200 --> 00:28:09,280
It's just a crude initial draft of a forecast.

476
00:28:09,280 --> 00:28:14,760
But I always like to say something, a forecast,

477
00:28:14,760 --> 00:28:19,240
any forecast is better than no forecast.

478
00:28:19,240 --> 00:28:22,560
You can debate that, but that's my opinion.

479
00:28:22,560 --> 00:28:29,960
So without further ado, let's build a forecasting model

480
00:28:29,960 --> 00:28:31,040
here.

481
00:28:31,040 --> 00:28:36,440
And so I always like to give a shout out to John Silvia

482
00:28:36,440 --> 00:28:38,760
and Azhar Iqbal.

483
00:28:38,760 --> 00:28:43,000
They were two professors of mine at UNC Charlotte.

484
00:28:43,000 --> 00:28:46,840
And they wrote a book, Economic and Business Forecasting.

485
00:28:46,840 --> 00:28:49,360
I just keep this with me at all times.

486
00:28:49,360 --> 00:28:53,320
This is a really simple, easy to read book

487
00:28:53,320 --> 00:28:57,360
that is a great foundation for forecasting.

488
00:28:57,360 --> 00:29:00,920
So just would highly recommend this book.

489
00:29:00,920 --> 00:29:05,200
And so long story short, for long term forecasting,

490
00:29:05,200 --> 00:29:08,720
which is 2022, that's a whole year ahead.

491
00:29:08,720 --> 00:29:11,240
So we've done atheoretical models

492
00:29:11,240 --> 00:29:14,640
for short term forecasts, which are useful for,

493
00:29:14,640 --> 00:29:18,120
which are incredibly good for month ahead,

494
00:29:18,120 --> 00:29:20,480
three month ahead forecasts.

495
00:29:20,480 --> 00:29:24,960
But once you get beyond three months or so,

496
00:29:24,960 --> 00:29:27,000
we're looking at a year here.

497
00:29:27,000 --> 00:29:31,240
And you want a bit more of a structural model

498
00:29:31,240 --> 00:29:35,560
that can capture some economic relationships.

499
00:29:35,560 --> 00:29:41,560
At the same time, they recommend keep it sophisticatedly simple.

500
00:29:41,560 --> 00:29:48,160
So it shouldn't be too technical or complex.

501
00:29:48,160 --> 00:29:52,800
And it shouldn't ignore economic theory.

502
00:29:52,800 --> 00:29:56,240
So you basically want something simple,

503
00:29:56,240 --> 00:29:59,320
but still incorporates economic theory.

504
00:29:59,320 --> 00:30:03,240
So not too complicated, but not too simple

505
00:30:03,240 --> 00:30:05,720
in ignoring economics.

506
00:30:05,720 --> 00:30:09,280
And then reading a good book here,

507
00:30:09,280 --> 00:30:13,760
and John Maynard Keynes is known for a different quote

508
00:30:13,760 --> 00:30:15,520
about the long run.

509
00:30:15,520 --> 00:30:19,800
But I thought this quote was a bit more optimistic.

510
00:30:19,800 --> 00:30:24,400
So in the long run, almost anything is possible.

511
00:30:24,400 --> 00:30:29,880
So I thought this ties in today because we're talking

512
00:30:29,880 --> 00:30:32,040
about states coming online.

513
00:30:32,040 --> 00:30:35,640
So if we're just using an atheoretical approach,

514
00:30:35,640 --> 00:30:37,920
it's going to be hard to forecast

515
00:30:37,920 --> 00:30:41,120
these states coming online.

516
00:30:41,120 --> 00:30:44,200
So that's just a bit of background.

517
00:30:44,200 --> 00:30:46,800
Here's the actual model.

518
00:30:46,800 --> 00:30:52,320
So essentially, this is super simple, probably too simple,

519
00:30:52,320 --> 00:30:56,960
probably more economic variables can be included.

520
00:30:56,960 --> 00:31:00,440
In their book, Silvia and Iqbal say

521
00:31:00,440 --> 00:31:04,320
they're using forecasting models with hundreds of variables.

522
00:31:04,320 --> 00:31:08,520
So this has five.

523
00:31:08,520 --> 00:31:12,000
So we could probably add more variables here

524
00:31:12,000 --> 00:31:13,760
to get this more complex.

525
00:31:13,760 --> 00:31:16,000
But you got to start somewhere.

526
00:31:16,000 --> 00:31:18,480
So these were variables that we have.

527
00:31:18,480 --> 00:31:23,000
And we can forecast them into the future pretty well.

528
00:31:23,000 --> 00:31:26,080
So what are we saying here?

529
00:31:26,080 --> 00:31:32,560
I'm basically saying, OK, sales is a function of population.

530
00:31:32,560 --> 00:31:36,480
The more people you have in the state, the higher demand is,

531
00:31:36,480 --> 00:31:40,280
the higher sales are going to be.

532
00:31:40,280 --> 00:31:44,280
Whether or not the state has adult use or not,

533
00:31:44,280 --> 00:31:47,480
just empirically, we've seen that states that

534
00:31:47,480 --> 00:31:52,120
have enacted adult use see their sales increase.

535
00:31:52,120 --> 00:31:54,880
So there must be something going on there,

536
00:31:54,880 --> 00:31:56,440
perhaps on the supply side.

537
00:31:56,440 --> 00:31:59,840
Well, actually, I mean, that's both a supply and a demand

538
00:31:59,840 --> 00:32:02,720
side variable.

539
00:32:02,720 --> 00:32:06,680
Then a couple more things that I was going to include

540
00:32:06,680 --> 00:32:11,640
were the months that the state has permitted medicinal.

541
00:32:11,640 --> 00:32:16,120
And I was thinking this would factor in learning

542
00:32:16,120 --> 00:32:17,400
by doing.

543
00:32:17,400 --> 00:32:20,520
So as you were talking about in New York,

544
00:32:20,520 --> 00:32:23,440
New York's built up the infrastructure

545
00:32:23,440 --> 00:32:28,240
to be able to roll out adult use fairly quickly.

546
00:32:28,240 --> 00:32:32,680
So I was thinking the months that it's had medicinal

547
00:32:32,680 --> 00:32:36,200
may factor into the amount of sales.

548
00:32:36,200 --> 00:32:41,320
Similarly, the months that the state has had adult use.

549
00:32:41,320 --> 00:32:49,240
So for example, Oklahoma and Nevada,

550
00:32:49,240 --> 00:32:53,040
and probably more, were states that enacted adult use,

551
00:32:53,040 --> 00:32:57,120
but the market couldn't quite, the supply

552
00:32:57,120 --> 00:33:00,440
couldn't meet up with the demand right out of the gates.

553
00:33:00,440 --> 00:33:03,640
So cannabis takes some time to grow.

554
00:33:03,640 --> 00:33:09,880
So you need some months to go by for the supply to build up.

555
00:33:09,880 --> 00:33:13,880
Plus, we would expect in the long term,

556
00:33:13,880 --> 00:33:17,120
costs are going to be coming down.

557
00:33:17,120 --> 00:33:20,880
People are going to be inventing new technologies.

558
00:33:20,880 --> 00:33:25,680
So this is sort of an abstract way to cover that.

559
00:33:25,680 --> 00:33:30,160
And then I just tossed in a month dummy variable

560
00:33:30,160 --> 00:33:34,600
just to control for a bit of seasonality.

561
00:33:34,600 --> 00:33:38,800
And not every month is the same.

562
00:33:38,800 --> 00:33:45,520
So without further ado, let's go ahead and estimate

563
00:33:45,520 --> 00:33:48,240
this model here.

564
00:33:48,240 --> 00:33:51,840
So I'm going to just be using Python,

565
00:33:51,840 --> 00:33:54,400
but I always think the data is more

566
00:33:54,400 --> 00:33:56,480
interesting than the software.

567
00:33:56,480 --> 00:34:00,440
So this is just a means to an end.

568
00:34:00,440 --> 00:34:03,880
So for example, if you're going to write this in R,

569
00:34:03,880 --> 00:34:07,240
it should be possible.

570
00:34:07,240 --> 00:34:09,920
So this is just Python.

571
00:34:09,920 --> 00:34:15,040
So just going to read in a handful of useful packages,

572
00:34:15,040 --> 00:34:17,120
just to name a few.

573
00:34:17,120 --> 00:34:21,760
Matplotlib for plotting, Pandas for data wrangling,

574
00:34:21,760 --> 00:34:25,240
statsmodels for doing statistics,

575
00:34:25,240 --> 00:34:30,560
grabbing population from the Federal Reserve, Fed FRED.

576
00:34:30,560 --> 00:34:33,000
So those are some useful ones, and the rest

577
00:34:33,000 --> 00:34:37,240
are just sort of software manipulation.

578
00:34:37,240 --> 00:34:41,360
Just going to define a couple of handy functions.

579
00:34:41,360 --> 00:34:46,800
So this function will just get data from the Federal Reserve

580
00:34:46,800 --> 00:34:49,400
whenever we need population data.

581
00:34:49,400 --> 00:34:52,120
And then this is just a little helper function,

582
00:34:52,120 --> 00:34:57,640
just to find the months elapsed.
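
A minimal sketch of what a months-elapsed helper like this might look like. The function name, signature, and the rule of counting only whole months are assumptions for illustration, not the code shown in the meetup:

```python
from datetime import date


def months_elapsed(start: date, end: date) -> int:
    """Count whole calendar months from start to end (never negative)."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    # Do not count the final month if its day has not been reached yet.
    if end.day < start.day:
        months -= 1
    return max(months, 0)


# Example with made-up dates.
print(months_elapsed(date(2020, 1, 15), date(2021, 10, 31)))
```

Flooring at zero means an observation dated before the start date simply gets zero months rather than a negative count.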

583
00:34:57,640 --> 00:35:00,880
And then just going to read in the API key

584
00:35:00,880 --> 00:35:05,360
and format the plots.

585
00:35:05,360 --> 00:35:10,360
OK, as I said, the data is more interesting than the software.

586
00:35:10,360 --> 00:35:14,560
So let's look at some of this data here.

587
00:35:14,560 --> 00:35:17,600
So in prior weeks, we've just primarily

588
00:35:17,600 --> 00:35:19,560
been working with Excel workbooks,

589
00:35:19,560 --> 00:35:23,440
just because it's pretty simple to create an Excel workbook

590
00:35:23,440 --> 00:35:24,920
and look at it.

591
00:35:24,920 --> 00:35:29,920
But all these software developers and data scientists

592
00:35:29,920 --> 00:35:31,320
probably are already familiar.

593
00:35:31,320 --> 00:35:37,520
This is JSON, so JavaScript Object Notation.

594
00:35:37,520 --> 00:35:40,680
And it's just another way of displaying data.

595
00:35:40,680 --> 00:35:47,720
And it's quite handy, because you can do nested data.

596
00:35:47,720 --> 00:35:50,640
So I love JSON.

597
00:35:50,640 --> 00:35:57,400
So I think it really helps to think about these objects.

598
00:35:57,400 --> 00:36:03,480
So I'd like to use the analogy of a duck or ducks.

599
00:36:03,480 --> 00:36:06,600
So these are all states.

600
00:36:06,600 --> 00:36:10,120
So say you've got a group of ducklings.

601
00:36:10,120 --> 00:36:13,280
You've got a whole bunch of ducklings.

602
00:36:13,280 --> 00:36:17,120
But each duck, each duckling, is going

603
00:36:17,120 --> 00:36:19,440
to have different properties.

604
00:36:19,440 --> 00:36:22,680
So these are all our states here.

605
00:36:22,680 --> 00:36:26,440
And we can keep track of different properties

606
00:36:26,440 --> 00:36:29,080
for all the different states.

607
00:36:29,080 --> 00:36:33,680
And what I've done is basically just indicated, OK,

608
00:36:33,680 --> 00:36:39,960
are the states permitting medicinal or adult use?

609
00:36:39,960 --> 00:36:43,760
And then when was medicinal permitted?

610
00:36:43,760 --> 00:36:46,560
And when was adult use permitted?
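
To make the nested structure concrete, here is a small sketch of state properties stored as JSON and read in Python. The state codes, field names, and dates are placeholders invented for illustration; they are not the meetup's actual file:

```python
import json

# Hypothetical excerpt: each key is a state, each value holds its properties.
raw = """
{
  "AA": {"medicinal": true, "adult_use": true,
         "medicinal_permitted": "2010-01-01",
         "adult_use_permitted": "2018-07-01"},
  "BB": {"medicinal": true, "adult_use": false,
         "medicinal_permitted": "2016-06-01",
         "adult_use_permitted": null}
}
"""

states = json.loads(raw)

# Pick out the states whose adult_use property is true.
adult_use_states = [abbr for abbr, props in states.items() if props["adult_use"]]
print(adult_use_states)
```

A JSON `null` becomes `None` in Python, which is one convenient way to mark a state that has not permitted adult use.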

611
00:36:46,560 --> 00:36:50,640
And this is easier said than done.

612
00:36:50,640 --> 00:36:54,080
So this is where I say this can be improved upon.

613
00:36:54,080 --> 00:36:56,920
So I would actually highly recommend you do this,

614
00:36:56,920 --> 00:36:58,600
because I did this really quickly.

615
00:36:58,600 --> 00:37:03,520
And I still need to go back over and double check everything.

616
00:37:03,520 --> 00:37:06,960
But check out the source, because so for example,

617
00:37:06,960 --> 00:37:13,840
this just says, oh, California permitted medicinal in 1996.

618
00:37:13,840 --> 00:37:17,600
Well, we're going to have to hit the history books and newspapers

619
00:37:17,600 --> 00:37:22,080
and find out what month in 1996.

620
00:37:22,080 --> 00:37:25,720
And when the actual stores were opened.

621
00:37:25,720 --> 00:37:32,520
So I'm trying to set these dates when the actual stores were

622
00:37:32,520 --> 00:37:33,320
opened.

623
00:37:33,320 --> 00:37:36,920
So for example, our representative here

624
00:37:36,920 --> 00:37:42,400
from Colorado, I'm not certain I got these dates correct.

625
00:37:42,400 --> 00:37:45,120
I tried to do my best of identifying

626
00:37:45,120 --> 00:37:50,160
when dispensaries actually opened their doors.

627
00:37:50,160 --> 00:37:56,520
So if you notice any mistakes here, please correct.

628
00:37:56,520 --> 00:37:59,360
Because we've pointed out in the past

629
00:37:59,360 --> 00:38:03,880
that measurement error can lead to bias in your results.

630
00:38:03,880 --> 00:38:07,960
So if, say, some of these, I'm using the date

631
00:38:07,960 --> 00:38:10,360
where the legislation was passed,

632
00:38:10,360 --> 00:38:13,400
but not actually when the dispensaries were opened,

633
00:38:13,400 --> 00:38:15,440
that's measurement error and will

634
00:38:15,440 --> 00:38:17,600
lead to bias in our results.

635
00:38:17,600 --> 00:38:24,000
So this is our first assumption along the way, major assumption,

636
00:38:24,000 --> 00:38:29,440
is that sales start when things are permitted.

637
00:38:29,440 --> 00:38:35,920
But you'll see, if you start reading through these sources,

638
00:38:35,920 --> 00:38:43,280
you'll see that various states rolled things out quickly.

639
00:38:43,280 --> 00:38:45,680
And sometimes things were delayed.

640
00:38:45,680 --> 00:38:48,920
So maybe they permitted cannabis at a certain point,

641
00:38:48,920 --> 00:38:54,080
but the dispensaries didn't open until years later.

642
00:38:54,080 --> 00:38:56,720
So long story short, I've done my best

643
00:38:56,720 --> 00:39:00,640
to identify when each state permitted.

644
00:39:04,400 --> 00:39:07,720
And then the other data set we'll be using

645
00:39:07,720 --> 00:39:11,760
is this is basically all the sales data

646
00:39:11,760 --> 00:39:14,720
that we've collected in 2021.

647
00:39:14,720 --> 00:39:20,920
So we were collecting Colorado sales, Illinois sales,

648
00:39:20,920 --> 00:39:24,160
Massachusetts sales.

649
00:39:24,160 --> 00:39:27,880
Notice Massachusetts was closed for a month there.

650
00:39:27,880 --> 00:39:33,120
But anywho, there are still more data points to be collected.

651
00:39:33,120 --> 00:39:37,560
But these were the ones that we diligently collected.

652
00:39:37,560 --> 00:39:44,080
So as a good Bayesian, we can use our prior knowledge

653
00:39:44,080 --> 00:39:47,640
or prior data in our forecasts here.

654
00:39:47,640 --> 00:39:50,360
So basically, we'll be using the combination

655
00:39:50,360 --> 00:39:55,080
of historic sales data to train our model.

656
00:39:55,080 --> 00:39:58,840
And then we'll use these variables

657
00:39:58,840 --> 00:40:01,600
to try to predict for states where

658
00:40:01,600 --> 00:40:06,080
we don't have good historic data.

659
00:40:06,080 --> 00:40:09,400
What's the data source of the sales data?

660
00:40:09,400 --> 00:40:12,800
Yes, I still need to finish adding them.

661
00:40:12,800 --> 00:40:16,600
I've started to add some here on the sources tab.

662
00:40:16,600 --> 00:40:19,160
There's still some missing.

663
00:40:19,160 --> 00:40:21,800
They're scattered throughout our 2021.

664
00:40:21,800 --> 00:40:24,080
What I'll do is, by end of day today,

665
00:40:24,080 --> 00:40:26,200
I'll update all these sources here.

666
00:40:26,200 --> 00:40:31,320
So that way, you can have the sources.

667
00:40:31,320 --> 00:40:35,560
But essentially, they're all public data sources

668
00:40:35,560 --> 00:40:41,000
where those various states are publishing various data points.

669
00:40:41,000 --> 00:40:43,920
Great, thanks.

670
00:40:43,920 --> 00:40:45,240
But good question.

671
00:40:45,240 --> 00:40:47,520
It's always important to show your sources.

672
00:40:47,520 --> 00:40:53,960
And that's what we're doing here, open data, open source.

673
00:40:53,960 --> 00:40:54,880
Awesome.

674
00:40:54,880 --> 00:40:58,880
So we've got our data.

675
00:40:58,880 --> 00:41:00,800
Let's go ahead and read it in.

676
00:41:04,200 --> 00:41:06,120
Awesome.

677
00:41:06,120 --> 00:41:09,040
Just take a quick look at it.

678
00:41:09,040 --> 00:41:09,640
There it is.

679
00:41:09,640 --> 00:41:11,880
There's the exact same data we were just looking at.

680
00:41:14,800 --> 00:41:20,000
Now, time to start doing data science.

681
00:41:20,000 --> 00:41:22,400
So one of my favorite things to do

682
00:41:22,400 --> 00:41:27,040
is create and retrieve supplementary variables,

683
00:41:27,040 --> 00:41:31,200
because you'd be surprised what you can calculate.

684
00:41:31,200 --> 00:41:37,400
So for example, we're going to basically use

685
00:41:37,400 --> 00:41:42,520
when the state permitted medicinal or adult use.

686
00:41:42,520 --> 00:41:46,840
That way, we can calculate how many months the states

687
00:41:46,840 --> 00:41:50,960
had medicinal or adult use.

688
00:41:50,960 --> 00:41:55,680
So don't want to spend too much time on the code.

689
00:41:55,680 --> 00:41:59,920
But basically, just checking, OK,

690
00:41:59,920 --> 00:42:02,680
is the state an adult use state?

691
00:42:02,680 --> 00:42:10,040
And then I'm calculating down here how many months elapsed

692
00:42:10,040 --> 00:42:13,280
since then for each point in time.
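
A rough sketch of how these supplementary variables might be built for a single observation, assuming each state has a medicinal start date and an optional adult-use start date. The function and field names are hypothetical and the dates are made up:

```python
from datetime import date


def months_between(start, end):
    """Whole calendar months from start to end, floored at zero."""
    return max((end.year - start.year) * 12 + (end.month - start.month), 0)


def build_features(obs_date, medicinal_start, adult_use_start):
    """Build the policy variables for one state-month observation."""
    # Dummy: 1 if adult use was already permitted at the observation date.
    adult_use = int(adult_use_start is not None and adult_use_start <= obs_date)
    months_medicinal = (
        months_between(medicinal_start, obs_date) if medicinal_start else 0
    )
    # Interaction: months of adult use, forced to 0 without adult use.
    months_adult_use = (
        months_between(adult_use_start, obs_date) if adult_use else 0
    )
    return {
        "adult_use": adult_use,
        "months_medicinal": months_medicinal,
        "months_adult_use": months_adult_use,
    }


with_adult = build_features(date(2021, 10, 1), date(2015, 1, 1), date(2019, 10, 1))
medicinal_only = build_features(date(2021, 10, 1), date(2015, 1, 1), None)
print(with_adult)
print(medicinal_only)
```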

693
00:42:13,280 --> 00:42:19,320
And then just creating a set of month effects.

694
00:42:19,320 --> 00:42:25,520
So we get to look at an observation.

695
00:42:25,520 --> 00:42:28,960
So for example, here is Washington.

696
00:42:28,960 --> 00:42:36,920
The last data point we have for there is October of 2021.

697
00:42:36,920 --> 00:42:44,160
And we say that, OK, there's been 160 months of adult use

698
00:42:44,160 --> 00:42:47,760
and 285 months of medicinal use.

699
00:42:47,760 --> 00:42:54,560
And so we're hypothesizing that that has been the case

700
00:42:54,560 --> 00:43:00,880
and that that has an effect on total sales.

701
00:43:00,880 --> 00:43:05,640
So those are our main metrics here.

702
00:43:05,640 --> 00:43:12,280
And then month effects are basically just 0 or 1

703
00:43:12,280 --> 00:43:15,320
if it is a certain month.

704
00:43:15,320 --> 00:43:19,040
And you need a baseline comparison month.

705
00:43:19,040 --> 00:43:21,920
So I've excluded January.

706
00:43:21,920 --> 00:43:24,920
So basically, our month effects will tell us

707
00:43:24,920 --> 00:43:30,920
how different a given month is from January.
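
The month effects described here can be sketched as eleven 0/1 indicators with January left out as the baseline. The column names are an assumption made for illustration:

```python
# February through December; January is the excluded baseline month.
MONTH_NAMES = ["feb", "mar", "apr", "may", "jun", "jul",
               "aug", "sep", "oct", "nov", "dec"]


def month_effects(month: int) -> dict:
    """Return 11 dummy variables for a month number from 1 to 12."""
    # enumerate starts at 2 so that "feb" lines up with month number 2.
    return {name: int(month == number)
            for number, name in enumerate(MONTH_NAMES, start=2)}


print(month_effects(7))  # only "jul" is 1
print(month_effects(1))  # January: every dummy is 0
```

Leaving one month out avoids perfect collinearity with the constant, which is why the remaining coefficients read as differences from January.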

708
00:43:30,920 --> 00:43:37,920
So here, the parameter beta 5, it's actually

709
00:43:37,920 --> 00:43:46,640
beta 5 through 15 because we have to include 11 dummy

710
00:43:46,640 --> 00:43:48,400
variables.

711
00:43:48,400 --> 00:43:52,240
So for each month coefficient, we'll say, OK,

712
00:43:52,240 --> 00:43:55,880
this is how much different February is from January,

713
00:43:55,880 --> 00:43:59,240
how much different July is from January.

714
00:43:59,240 --> 00:44:03,520
So the idea is you should be able to exclude

715
00:44:03,520 --> 00:44:08,400
any given month and our estimates

716
00:44:08,400 --> 00:44:10,760
for the other parameters should be the same.

717
00:44:14,680 --> 00:44:15,400
Should be.

718
00:44:15,400 --> 00:44:17,920
I have to double check my statistics here.

719
00:44:17,920 --> 00:44:21,920
Without further ado, let's estimate this regression model.

720
00:44:21,920 --> 00:44:23,600
So here it is.

721
00:44:23,600 --> 00:44:27,200
So it's basically just a constant, a coefficient

722
00:44:27,200 --> 00:44:32,720
on population, coefficient on a dummy variable, 0 or 1

723
00:44:32,720 --> 00:44:36,600
if the state has adult use, the months

724
00:44:36,600 --> 00:44:39,520
that it's been medicinal, the months

725
00:44:39,520 --> 00:44:42,120
that it's been adult use.

726
00:44:42,120 --> 00:44:46,200
And this is called an interaction with adult use.

727
00:44:46,200 --> 00:44:50,480
So basically, if the state doesn't have adult use,

728
00:44:50,480 --> 00:44:54,120
then this will just take a value of 0.

729
00:44:54,120 --> 00:44:57,200
I thought that was reasonable.

730
00:44:57,200 --> 00:45:01,800
And then the dummy variable for the month.
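
Written out, the regression described here could look like the following. This is a sketch reconstructed from the spoken description; the subscripts (state s, month t) and the variable names are assumptions, with the month dummies taking the parameters beta 5 through beta 15:

```latex
\text{sales}_{s,t} = \beta_0
  + \beta_1\,\text{population}_{s,t}
  + \beta_2\,\text{adult\_use}_{s,t}
  + \beta_3\,\text{months\_medicinal}_{s,t}
  + \beta_4\,\bigl(\text{adult\_use}_{s,t} \times \text{months\_adult\_use}_{s,t}\bigr)
  + \sum_{m=2}^{12} \beta_{3+m}\,\mathbb{1}[\text{month}_t = m]
  + \varepsilon_{s,t}
```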

731
00:45:01,800 --> 00:45:09,720
So here is the regression of sales on these variables.

732
00:45:09,720 --> 00:45:14,480
And I just did this cursorily earlier,

733
00:45:14,480 --> 00:45:18,040
so I actually haven't analyzed this regression.

734
00:45:18,040 --> 00:45:21,160
So just going to do it right here for you.

735
00:45:21,160 --> 00:45:25,560
So first things first, I'd like to look at the r squared.

736
00:45:25,560 --> 00:45:29,840
You can roughly interpret this as how much variation

737
00:45:29,840 --> 00:45:33,640
are we explaining with our variables.

738
00:45:33,640 --> 00:45:38,440
And so you would love an r squared close to 1

739
00:45:38,440 --> 00:45:42,840
because you're explaining almost all of the variation

740
00:45:42,840 --> 00:45:44,440
at that point.
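
For reference, a short sketch of how R-squared is computed from actual and fitted values. The numbers are made up purely to show the calculation:

```python
def r_squared(actual, predicted):
    """Share of the variation in actual explained by predicted."""
    mean = sum(actual) / len(actual)
    ss_total = sum((y - mean) ** 2 for y in actual)
    ss_residual = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    return 1 - ss_residual / ss_total


actual = [10, 20, 30, 40]       # made-up observed values
predicted = [12, 18, 33, 37]    # made-up fitted values
print(round(r_squared(actual, predicted), 3))
```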

741
00:45:44,440 --> 00:45:53,200
0.47 is actually not the worst for a structural regression.

742
00:45:53,200 --> 00:45:56,640
I've seen r squareds much lower,

743
00:45:56,640 --> 00:45:59,440
but it would also be nice to have an r squared much higher.

744
00:45:59,440 --> 00:46:05,880
So what you can do is you can start to add variables.

745
00:46:05,880 --> 00:46:10,040
And what you can do is you can actually,

746
00:46:10,040 --> 00:46:13,800
there are some statistical tests you can do to see,

747
00:46:13,800 --> 00:46:18,000
you can see, OK, let's try to add some variables

748
00:46:18,000 --> 00:46:23,800
and basically penalize ourselves for adding more variables

749
00:46:23,800 --> 00:46:26,520
while rewarding ourselves if we're

750
00:46:26,520 --> 00:46:29,320
able to explain more variation.
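
One common measure with this penalize-and-reward behavior is adjusted R-squared; information criteria such as AIC and BIC work in a similar spirit. A minimal sketch, where the sample size of 200 is a hypothetical value and 15 regressors counts the four main variables plus eleven month dummies:

```python
def adjusted_r_squared(r2: float, n_obs: int, n_regressors: int) -> float:
    """Adjusted R-squared: penalizes each regressor added to the model."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_regressors - 1)


base = adjusted_r_squared(0.47, n_obs=200, n_regressors=15)
# Adding a variable that explains nothing new lowers the adjusted value.
with_useless_variable = adjusted_r_squared(0.47, n_obs=200, n_regressors=16)
print(base > with_useless_variable)
```

When comparing two candidate models on the same data, the one with the higher adjusted R-squared is explaining more variation per variable spent.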

751
00:46:29,320 --> 00:46:33,800
So long story short, you can compare models.

752
00:46:33,800 --> 00:46:36,560
And I would highly recommend all of you to,

753
00:46:36,560 --> 00:46:44,600
maybe you can think of creative independent variables here

754
00:46:44,600 --> 00:46:46,400
that you would like to add.

755
00:46:46,400 --> 00:46:49,360
So personally, I would just start

756
00:46:49,360 --> 00:46:53,880
thinking of demand and supply side variables

757
00:46:53,880 --> 00:46:57,000
and see if we can't include those.

758
00:46:57,000 --> 00:47:01,120
Keep in mind, if you add a variable,

759
00:47:01,120 --> 00:47:06,920
you're going to have to forecast it into 2022.

760
00:47:06,920 --> 00:47:12,440
For population, I used the most naive forecast you can have.

761
00:47:12,440 --> 00:47:16,640
And I just said, OK, the population in 2022

762
00:47:16,640 --> 00:47:21,040
is going to be the same as it was in 2021.

763
00:47:21,040 --> 00:47:25,360
Another big assumption, you can improve upon this model

764
00:47:25,360 --> 00:47:28,720
by forecasting population.
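
A small sketch of the naive forecast described here, next to one simple alternative that carries forward the average annual change. The population figures are made up and the function names are hypothetical:

```python
def naive_forecast(history: dict) -> float:
    """Next year's value equals the latest observed value."""
    return history[max(history)]


def drift_forecast(history: dict) -> float:
    """Latest value plus the average annual change over the history."""
    first_year, last_year = min(history), max(history)
    average_change = (history[last_year] - history[first_year]) / (last_year - first_year)
    return history[last_year] + average_change


# Made-up annual population for one hypothetical state.
population = {2019: 1_000_000, 2020: 1_010_000, 2021: 1_020_000}
print(naive_forecast(population))
print(drift_forecast(population))
```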

765
00:47:28,720 --> 00:47:33,200
So if you've, once again, it's going

766
00:47:33,200 --> 00:47:34,520
to add a bit more uncertainty.

767
00:47:34,520 --> 00:47:38,600
But it may result in a better forecast

768
00:47:38,600 --> 00:47:43,320
because it's a stretch to assume that population is going

769
00:47:43,320 --> 00:47:45,240
to remain constant in every single state

770
00:47:45,240 --> 00:47:46,440
across the country.

771
00:47:46,440 --> 00:47:47,680
That's not realistic.

772
00:47:47,680 --> 00:47:53,200
So I like to kind of preface the work I'm doing here

773
00:47:53,200 --> 00:47:54,400
in this meetup.

774
00:47:54,400 --> 00:47:56,920
It's a lot of a demonstration.

775
00:47:56,920 --> 00:48:01,480
So please take these numbers at face value

776
00:48:01,480 --> 00:48:05,240
and know that all of these forecasts can and should

777
00:48:05,240 --> 00:48:06,960
be improved upon.

778
00:48:06,960 --> 00:48:10,040
So for example, if you're actually

779
00:48:10,040 --> 00:48:12,360
doing something for the state of Colorado,

780
00:48:12,360 --> 00:48:17,760
definitely want a bit more rigor than I'm using today.

781
00:48:17,760 --> 00:48:24,280
But it's more just a thought exercise, a proof of concept

782
00:48:24,280 --> 00:48:25,840
that we're doing today.

783
00:48:25,840 --> 00:48:28,520
So just for the sake of time, just

784
00:48:28,520 --> 00:48:32,600
going to use these five variables plus the month

785
00:48:32,600 --> 00:48:40,480
effects and know that there's ways to improve upon this.

786
00:48:40,480 --> 00:48:46,560
So let's see if we can't do any rough interpretations

787
00:48:46,560 --> 00:48:48,560
of the data here.

788
00:48:48,560 --> 00:48:53,480
So for example, our coefficient on population

789
00:48:53,480 --> 00:49:00,320
is 6.6, so roughly speaking, for each additional person

790
00:49:00,320 --> 00:49:09,960
in a state, you would expect around $6.60 additional sales.

791
00:49:09,960 --> 00:49:14,800
I've seen a lot of people use adult population, right?

792
00:49:14,800 --> 00:49:18,480
Because just say somebody's born,

793
00:49:18,480 --> 00:49:22,760
they're not going to be consuming cannabis in 2022.

794
00:49:22,760 --> 00:49:28,360
So long story short, you could improve upon this

795
00:49:28,360 --> 00:49:33,160
by perhaps using a measure for adult population.

796
00:49:33,160 --> 00:49:38,960
Maybe I missed something, but where did you factor in price?

797
00:49:38,960 --> 00:49:41,320
I thought we were talking numbers.

798
00:49:41,320 --> 00:49:43,320
Excellent, excellent question.

799
00:49:43,320 --> 00:49:45,360
I'm not factoring in price.

800
00:49:45,360 --> 00:49:51,800
That could be and is an omitted variable.

801
00:49:51,800 --> 00:49:57,920
So as we've seen in states such as there's

802
00:49:57,920 --> 00:50:05,960
a large price discrepancy between prices in, say,

803
00:50:05,960 --> 00:50:11,240
the two states we have price data are Oregon

804
00:50:11,240 --> 00:50:15,560
and I want to say Massachusetts perhaps, or yes, in Maine.

805
00:50:15,560 --> 00:50:18,320
So we have prices in a handful of states

806
00:50:18,320 --> 00:50:22,680
and we could get prices in Washington if we wanted.

807
00:50:22,680 --> 00:50:24,440
Long story short is we don't really

808
00:50:24,440 --> 00:50:30,000
have price data for most of these states.

809
00:50:30,000 --> 00:50:33,920
So then I'm not clear what the population coefficient means.

810
00:50:33,920 --> 00:50:38,440
Would it be number of sales rather than dollars?

811
00:50:38,440 --> 00:50:39,520
It would just be dollars.

812
00:50:39,520 --> 00:50:46,320
So basically, our dependent variable here is sales.

813
00:50:46,320 --> 00:50:49,400
So this is just sales in dollars.

814
00:50:49,400 --> 00:50:56,720
So the total gross amount grossed by cannabis sales

815
00:50:56,720 --> 00:50:59,640
in a given state.

816
00:50:59,640 --> 00:51:01,000
Does that answer your question?

817
00:51:04,840 --> 00:51:06,640
So basically, what I'm saying here

818
00:51:06,640 --> 00:51:11,200
is for each additional person in the state,

819
00:51:11,200 --> 00:51:14,360
you would expect the total sales in the state

820
00:51:14,360 --> 00:51:18,880
to increase by $6.60.

821
00:51:18,880 --> 00:51:22,480
How much actual cannabis that is, like you said,

822
00:51:22,480 --> 00:51:24,520
depends on the price.

823
00:51:24,520 --> 00:51:31,720
So for example, $6 goes a lot further in Oregon

824
00:51:31,720 --> 00:51:35,840
than it does in Massachusetts or Maine

825
00:51:35,840 --> 00:51:40,120
because the prices in Oregon are like a third of what they

826
00:51:40,120 --> 00:51:42,680
are in Massachusetts and Maine.
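
To make the dollars-versus-quantity point concrete, here is a minimal Python sketch. The $6.60 figure is the population coefficient quoted above. The per-gram prices are hypothetical placeholders, chosen only so that the Oregon price is a third of the Massachusetts price; they are not real market data.

```python
# Marginal sales per additional resident, from the population coefficient above.
SALES_PER_PERSON = 6.60

# Hypothetical average prices per gram (assumptions for illustration only).
hypothetical_price_per_gram = {
    "OR": 4.00,
    "MA": 12.00,
}


def implied_grams(dollars, price_per_gram):
    """Convert a dollar amount of sales into grams at a given price."""
    return dollars / price_per_gram


# The same dollar amount buys three times as much where the price is a third.
grams = {
    state: implied_grams(SALES_PER_PERSON, price)
    for state, price in hypothetical_price_per_gram.items()
}
```

Under these assumed prices the same coefficient implies three times as much product in Oregon, which is why a price variable would change how the coefficient is read.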

827
00:51:42,680 --> 00:51:47,440
So long story short, I would love

828
00:51:47,440 --> 00:51:51,120
to include a measure of price for each state,

829
00:51:51,120 --> 00:51:55,040
like for example, maybe average price in a given state.

830
00:51:55,040 --> 00:51:58,440
I just don't have that data.

831
00:51:58,440 --> 00:52:03,200
And I would have to forecast that data into 2022.

832
00:52:03,200 --> 00:52:08,000
So once again, there are companies

833
00:52:08,000 --> 00:52:10,320
out there that actually do a really good job

834
00:52:10,320 --> 00:52:13,320
at measuring price.

835
00:52:13,320 --> 00:52:15,600
A lot of that's private data.

836
00:52:15,600 --> 00:52:18,880
However, they could improve upon these forecasts

837
00:52:18,880 --> 00:52:22,240
by including a metric for price.

838
00:52:25,560 --> 00:52:26,600
Excellent point.

839
00:52:29,080 --> 00:52:33,880
Surprisingly, we have this negative coefficient

840
00:52:33,880 --> 00:52:36,480
on adult use.

841
00:52:36,480 --> 00:52:39,400
So that would say, oh, having adult use actually

842
00:52:39,400 --> 00:52:41,440
lowers your baseline.

843
00:52:41,440 --> 00:52:44,040
But we've got a positive coefficient

844
00:52:44,040 --> 00:52:46,880
on the months of adult use.

845
00:52:46,880 --> 00:52:49,880
So it seems that the more and more months

846
00:52:49,880 --> 00:52:53,360
that you have adult use, the higher and higher

847
00:52:53,360 --> 00:52:55,600
your sales are going to be.

848
00:52:55,600 --> 00:53:00,360
Similar for medicinal, however, the coefficient

849
00:53:00,360 --> 00:53:05,520
is smaller than adult use, around a third of the size.

850
00:53:05,520 --> 00:53:11,160
So the effect of having adult use is almost three times

851
00:53:11,160 --> 00:53:13,000
as great per month as medicinal.
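
A rough way to read a negative adult-use coefficient next to a positive months-of-adult-use coefficient is to ask how many months it takes for the net effect to turn positive. The sketch below does that in Python. Both coefficient values are hypothetical placeholders, not the estimates shown in the session, and the function names are made up for illustration.

```python
import math


def adult_use_effect(months_of_adult_use, beta_adult_use, beta_adult_use_months):
    """Net contribution of adult use to predicted sales after some months."""
    if months_of_adult_use <= 0:
        return 0.0  # no adult-use program yet, so neither term applies
    return beta_adult_use + beta_adult_use_months * months_of_adult_use


def months_to_break_even(beta_adult_use, beta_adult_use_months):
    """First whole month at which the net adult-use effect is non-negative."""
    if beta_adult_use >= 0:
        return 1
    if beta_adult_use_months <= 0:
        return None  # the effect never recovers
    return max(1, math.ceil(-beta_adult_use / beta_adult_use_months))


# Hypothetical coefficients, NOT the estimates from the session.
BETA_ADULT_USE = -10_000_000.0
BETA_ADULT_USE_MONTHS = 400_000.0

break_even = months_to_break_even(BETA_ADULT_USE, BETA_ADULT_USE_MONTHS)
```

With these assumed numbers the level shift is paid back after 25 months; with the real estimates the break-even month would differ.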

852
00:53:17,560 --> 00:53:25,200
If you're a frequentist, they're looking fairly significant here.

853
00:53:25,200 --> 00:53:30,760
One thing that you'll see is not many of these month effects

854
00:53:30,760 --> 00:53:38,520
are statistically significant, except for perhaps July

855
00:53:38,520 --> 00:53:43,560
and August at the 95% confidence level.

856
00:53:43,560 --> 00:53:46,400
So basically, you really couldn't

857
00:53:46,400 --> 00:53:50,840
conclude that any month is different in sales.

858
00:53:50,840 --> 00:53:54,720
Keep in mind, our training data is limited.

859
00:53:54,720 --> 00:53:56,480
I showed you that earlier.

860
00:53:56,480 --> 00:54:04,600
Our training data doesn't include all of the states.

861
00:54:04,600 --> 00:54:08,640
So I need to expand our training data,

862
00:54:08,640 --> 00:54:14,800
because there is still public data that's not been collected.

863
00:54:14,800 --> 00:54:18,320
So that's another way to expand on our forecasts,

864
00:54:18,320 --> 00:54:24,240
is we can add more training data to just train a better

865
00:54:24,240 --> 00:54:26,680
and better model here.

866
00:54:26,680 --> 00:54:31,160
So long story short, I'm including them anyways.

867
00:54:31,160 --> 00:54:34,880
But you may want to try to re-estimate this model

868
00:54:34,880 --> 00:54:38,400
without month effects and see maybe

869
00:54:38,400 --> 00:54:41,240
you don't need these month effects.

870
00:54:41,240 --> 00:54:43,280
I'm including them because they're

871
00:54:43,280 --> 00:54:47,080
one thing that we can forecast with 100% certainty.

872
00:54:47,080 --> 00:54:50,720
We always know that in October of 2022,

873
00:54:50,720 --> 00:54:52,480
it's going to be October.

874
00:54:52,480 --> 00:54:55,040
So we know that with 100% certainty.

875
00:54:55,040 --> 00:55:03,280
So I think they're a nice go-to when forecasting.

876
00:55:03,280 --> 00:55:03,960
Awesome.

877
00:55:03,960 --> 00:55:06,400
So without further ado, just going

878
00:55:06,400 --> 00:55:09,600
to define the parameters here.

879
00:55:09,600 --> 00:55:13,600
And here's just a whole bunch of software.

880
00:55:13,600 --> 00:55:19,840
But basically, what I'm doing here is, OK,

881
00:55:19,840 --> 00:55:24,680
we've got this list of states here.

882
00:55:24,680 --> 00:55:31,440
So I'm just going to iterate over all of the states,

883
00:55:31,440 --> 00:55:35,200
find out when adult use was permitted in each state,

884
00:55:35,200 --> 00:55:38,120
when medicinal was permitted in each state,

885
00:55:38,120 --> 00:55:42,280
get the population of 2021, which was actually just

886
00:55:42,280 --> 00:55:44,640
published a few weeks ago.

887
00:55:44,640 --> 00:55:49,680
So long story short, we've got those variables for each state.

888
00:55:49,680 --> 00:55:56,280
And then I'm just going to iterate over our forecast

889
00:55:56,280 --> 00:56:00,280
horizon, which is 2022.

890
00:56:00,280 --> 00:56:04,840
So iterating over each state, iterating over each date,

891
00:56:04,840 --> 00:56:08,360
and then just a handful of if statements

892
00:56:08,360 --> 00:56:14,280
to figure out, OK, what are the variables we need to include.

893
00:56:14,280 --> 00:56:19,920
But nothing too fancy, really just

894
00:56:19,920 --> 00:56:23,000
this is where the magic happens, where we actually

895
00:56:23,000 --> 00:56:27,720
estimate the forecast, beta naught plus beta 1 times

896
00:56:27,720 --> 00:56:31,080
population, beta 2 times adult use,

897
00:56:31,080 --> 00:56:35,000
beta 3 medicinal, so on and so forth,

898
00:56:35,000 --> 00:56:39,320
just as we specified in our regression model.

899
00:56:39,320 --> 00:56:42,160
So let's do that.

900
00:56:42,160 --> 00:56:45,400
So basically, I'm just plugging in variables

901
00:56:45,400 --> 00:56:53,560
into the regression model that we just estimated.
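
The loop described above can be sketched roughly as follows. This is a minimal standard-library Python sketch, not the notebook code from the session. The state entries, the program start dates, and every coefficient except the 6.60 population coefficient are hypothetical placeholders, and the month effects are set to zero for simplicity.

```python
from datetime import date

# Hypothetical coefficients; only the 6.60 population coefficient is quoted
# in the session, the rest are placeholders for illustration.
BETAS = {
    "const": 0.0,
    "population": 6.60,
    "adult_use": -10_000_000.0,
    "adult_use_months": 400_000.0,
    "medicinal": -3_000_000.0,
    "medicinal_months": 130_000.0,
}
MONTH_EFFECTS = {m: 0.0 for m in range(1, 13)}  # placeholder month effects

# Hypothetical states: population and the date each program was permitted.
STATES = {
    "AA": {"population": 5_000_000, "adult_use": date(2020, 1, 1),
           "medicinal": date(2015, 1, 1)},
    "BB": {"population": 2_000_000, "adult_use": None,
           "medicinal": date(2021, 7, 1)},
}


def months_since(start, current):
    """Whole months a program has been active as of `current` (0 if not yet)."""
    if start is None or current < start:
        return 0
    return (current.year - start.year) * 12 + (current.month - start.month) + 1


def forecast_month(state, current):
    """Plug one state's variables for one month into the linear model."""
    adult_months = months_since(state["adult_use"], current)
    medicinal_months = months_since(state["medicinal"], current)
    y = BETAS["const"] + BETAS["population"] * state["population"]
    if adult_months > 0:
        y += BETAS["adult_use"] + BETAS["adult_use_months"] * adult_months
    if medicinal_months > 0:
        y += BETAS["medicinal"] + BETAS["medicinal_months"] * medicinal_months
    return y + MONTH_EFFECTS[current.month]


# Forecast horizon: each month of 2022.
horizon = [date(2022, m, 1) for m in range(1, 13)]
forecasts = {
    abbr: {d: forecast_month(s, d) for d in horizon}
    for abbr, s in STATES.items()
}
annual_totals = {abbr: sum(by_month.values()) for abbr, by_month in forecasts.items()}
```

Because each forecast depends only on that state and that month, the nested loop could also be vectorized, which is one of the ways the routine could be made faster.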

902
00:56:53,560 --> 00:56:57,160
Takes it a handful of seconds because I didn't write

903
00:56:57,160 --> 00:57:00,240
the most optimal routine here.

904
00:57:00,240 --> 00:57:03,200
There's a lot of ways you can optimize this.

905
00:57:03,200 --> 00:57:08,160
But as Edward Tufte says, show the data.

906
00:57:08,160 --> 00:57:13,000
So here are, whoops, I didn't quite do that right.

907
00:57:18,120 --> 00:57:19,080
Give me one second here.

908
00:57:19,080 --> 00:57:20,160
I can fix this.

909
00:57:20,160 --> 00:57:31,760
Actually, I think all we need to do is add a time index to this.

910
00:57:43,880 --> 00:57:44,360
Hold on.

911
00:57:44,360 --> 00:57:45,840
Yes, question?

912
00:57:49,040 --> 00:57:50,080
Is there a question at hand?

913
00:57:52,800 --> 00:58:03,440
OK, so we may have to just save this part because we can still

914
00:58:03,440 --> 00:58:05,640
look at, I'm going to try this one other thing

915
00:58:05,640 --> 00:58:08,600
and then we'll just move on.

916
00:58:08,600 --> 00:58:09,120
Awesome.

917
00:58:09,120 --> 00:58:13,400
Just had to add the index just to kind of line everything up.

918
00:58:13,400 --> 00:58:17,880
So this isn't actually the most informative chart ever.

919
00:58:17,880 --> 00:58:20,720
Was there a question?

920
00:58:20,720 --> 00:58:22,240
Some people have to jump.

921
00:58:22,240 --> 00:58:24,080
And I always try to wrap it up.

922
00:58:24,080 --> 00:58:25,840
So I'll be wrapping this up shortly.

923
00:58:25,840 --> 00:58:28,240
And then if you do have to drop, you

924
00:58:28,240 --> 00:58:32,080
can check out the recording on YouTube.

925
00:58:32,080 --> 00:58:35,880
So I'm going to try to add a time index.

926
00:58:35,880 --> 00:58:39,040
And then I'm going to try to do some more

927
00:58:39,040 --> 00:58:42,040
of the recording on YouTube.

928
00:58:42,040 --> 00:58:47,440
So not the best plot in the world because there's

929
00:58:47,440 --> 00:58:49,880
just so many states.

930
00:58:49,880 --> 00:58:55,120
And I think I'm reusing some colors over and over again.

931
00:58:55,120 --> 00:59:00,720
But we've got our forecast for California on top.

932
00:59:00,720 --> 00:59:06,920
Oh, yes, let's just list these because I think the list is

933
00:59:06,920 --> 00:59:10,680
a bit more informative than this big plot.

934
00:59:10,680 --> 00:59:19,400
So here's just the predicted sales by state.

935
00:59:19,400 --> 00:59:25,480
And I'm going to order this by the most sales.

936
00:59:25,480 --> 00:59:29,360
So that way, it's a little easier to read here.

937
00:59:29,360 --> 00:59:32,760
And as you'll notice, our forecasting model

938
00:59:32,760 --> 00:59:35,560
has some wonky predictions.

939
00:59:35,560 --> 00:59:43,160
So basically, Texas does technically have medicinal use.

940
00:59:43,160 --> 00:59:46,280
And they've got a giant population.

941
00:59:46,280 --> 00:59:50,160
So because of the way our forecasting model was

942
00:59:50,160 --> 00:59:56,080
constructed, Texas is taking the number two sales.

943
00:59:56,080 --> 01:00:00,240
I don't think this is realistic per se.

944
01:00:00,240 --> 01:00:03,840
So I think this can be improved upon.

945
01:00:03,840 --> 01:00:06,560
So for example, Florida is number three.

946
01:00:06,560 --> 01:00:10,000
Once again, I'm not certain about this.

947
01:00:10,000 --> 01:00:13,040
But we don't really have good data from Florida.

948
01:00:13,040 --> 01:00:17,800
So if you look at our regression model, all we're really using

949
01:00:17,800 --> 01:00:22,400
is their population, whether they're allowing adult use,

950
01:00:22,400 --> 01:00:25,160
whether they're allowing medicinal use.

951
01:00:25,160 --> 01:00:30,960
So our model can probably be improved upon.

952
01:00:30,960 --> 01:00:33,240
But really, the population is what's

953
01:00:33,240 --> 01:00:38,240
carrying our estimations here.

954
01:00:38,240 --> 01:00:41,560
But like I said, any forecast, I think,

955
01:00:41,560 --> 01:00:43,800
is better than no forecast.

956
01:00:43,800 --> 01:00:50,320
So we're way underestimating Colorado

957
01:00:50,320 --> 01:00:54,520
because we have done atheoretical forecasts.

958
01:00:54,520 --> 01:00:58,040
And I think our atheoretical forecast

959
01:00:58,040 --> 01:01:06,640
was that Colorado would be around 1.6 billion plus in 2022.

960
01:01:06,640 --> 01:01:13,000
So this model has some huge shortcomings.

961
01:01:13,000 --> 01:01:19,240
But we just needed to sort of do something

962
01:01:19,240 --> 01:01:24,440
because we just wanted to get a number for each state here.

963
01:01:24,440 --> 01:01:27,360
Because now we can compare our estimates

964
01:01:27,360 --> 01:01:31,520
to CNBC's estimates.

965
01:01:31,520 --> 01:01:37,360
So given our crude forecasting model,

966
01:01:37,360 --> 01:01:45,800
we're predicting 1.5 billion in New York in 2022.

967
01:01:45,800 --> 01:01:49,640
And if you look at my assumptions here,

968
01:01:49,640 --> 01:01:56,040
that is under the assumption that New York permits

969
01:01:56,040 --> 01:02:01,160
adult use on October 1.

970
01:02:01,160 --> 01:02:04,320
My source was this website, which

971
01:02:04,320 --> 01:02:09,760
suggested that New York would do it near the end of 2022.

972
01:02:09,760 --> 01:02:12,880
So that's an assumption built into the model

973
01:02:12,880 --> 01:02:16,840
that can be improved upon.

974
01:02:16,840 --> 01:02:22,000
But long story short, our total estimate for the US,

975
01:02:22,000 --> 01:02:29,480
given all our imperfections along the way, is 26.9 billion.

976
01:02:29,480 --> 01:02:34,160
So now we actually have a number to put on it.

977
01:02:34,160 --> 01:02:41,760
And I would like to say that now we can actually say, OK,

978
01:02:41,760 --> 01:02:44,240
whose model predicts better?

979
01:02:44,240 --> 01:02:46,040
Because we can actually tell.

980
01:02:46,040 --> 01:02:49,600
So we can wait till the end of 2022.

981
01:02:49,600 --> 01:02:56,760
And we can basically calculate the root mean squared error.

982
01:02:56,760 --> 01:03:01,080
And we can basically see whose model predicts better.

983
01:03:01,080 --> 01:03:04,520
Does CNBC's model predict better?

984
01:03:04,520 --> 01:03:08,680
Does this crude model predict better?

985
01:03:08,680 --> 01:03:12,600
Or perhaps you.

986
01:03:12,600 --> 01:03:14,880
You can create your own forecast.
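
For anyone who wants to score competing forecasts once the 2022 numbers are in, a root mean squared error comparison might look like the sketch below. The state codes and sales figures are hypothetical placeholders in billions of dollars, not real data or anyone's published forecast.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error over the states present in both mappings."""
    shared = sorted(set(actual) & set(predicted))
    if not shared:
        raise ValueError("no overlapping states to score")
    squared_errors = [(actual[s] - predicted[s]) ** 2 for s in shared]
    return math.sqrt(sum(squared_errors) / len(shared))


# Hypothetical annual sales in billions of dollars (placeholders only).
actual_sales = {"AA": 2.0, "BB": 1.0, "CC": 0.5}
forecast_one = {"AA": 1.0, "BB": 1.0, "CC": 0.5}  # one large miss
forecast_two = {"AA": 2.5, "BB": 1.5, "CC": 1.0}  # three moderate misses

scores = {
    "forecast_one": rmse(actual_sales, forecast_one),
    "forecast_two": rmse(actual_sales, forecast_two),
}
best = min(scores, key=scores.get)  # lower RMSE is better
```

Squaring the errors means one large miss on a big state can cost more than several moderate misses, which is worth keeping in mind when states differ this much in size.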

987
01:03:14,880 --> 01:03:16,800
And so I was thinking about maybe we

988
01:03:16,800 --> 01:03:18,640
could do some sort of competition,

989
01:03:18,640 --> 01:03:23,920
where we all could make our own forecasting model.

990
01:03:23,920 --> 01:03:25,920
Maybe we could do a mix.

991
01:03:25,920 --> 01:03:29,920
So I think that may be the best sort of mixed model, where

992
01:03:29,920 --> 01:03:36,000
we use the atheoretical model when we do have sales data,

993
01:03:36,000 --> 01:03:38,800
such as in Colorado.

994
01:03:38,800 --> 01:03:41,720
And then we can use the atheoretical.

995
01:03:41,720 --> 01:03:43,640
I mean, we can use the structural model,

996
01:03:43,640 --> 01:03:48,080
the theoretical model, for states where we don't have good data.
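
One possible way to build that mix is sketched below: take the structural forecast for every state, then overwrite it wherever an atheoretical forecast from real sales data exists. The function name is made up for illustration. The 1.6 for Colorado is the rough atheoretical figure mentioned earlier, in billions of dollars; the other values are hypothetical placeholders.

```python
def mixed_forecast(structural, atheoretical):
    """Prefer the atheoretical forecast where one exists, else the structural one."""
    combined = dict(structural)
    combined.update(
        {state: value for state, value in atheoretical.items() if value is not None}
    )
    return combined


# Annual sales in billions of dollars. Only the Colorado atheoretical figure
# comes from the session; the rest are placeholders for illustration.
structural = {"CO": 0.9, "TX": 3.0, "FL": 2.5}
atheoretical = {"CO": 1.6, "TX": None}

combined = mixed_forecast(structural, atheoretical)
```

Keeping the two sources in separate mappings makes it easy to see which states are driven by data and which by the regression.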

997
01:03:50,720 --> 01:03:55,840
A lot of the states, we don't have fantastic data.

998
01:03:55,840 --> 01:03:57,320
So we still need to look a bit more.

999
01:03:57,320 --> 01:04:01,920
But long story short is a lot of this

1000
01:04:01,920 --> 01:04:05,160
is chalked up to medicinal sales,

1001
01:04:05,160 --> 01:04:09,960
which I'm skeptical will be as high as our estimates are.

1002
01:04:09,960 --> 01:04:14,520
But any estimate's better than no estimate.

1003
01:04:14,520 --> 01:04:18,280
So there we have the numbers.

1004
01:04:18,280 --> 01:04:20,280
We've run a little long.

1005
01:04:20,280 --> 01:04:23,720
So I'm going to go ahead and end the presentation.

1006
01:04:23,720 --> 01:04:26,600
But does anybody have any thoughts or comments?

1007
01:04:26,600 --> 01:04:28,200
Yeah, I'm a little overwhelmed.

1008
01:04:28,200 --> 01:04:31,280
I mean, it's a lot.

1009
01:04:31,280 --> 01:04:38,360
But my first thought is why concatenate medicinal

1010
01:04:38,360 --> 01:04:44,200
and recreational use just to get a total number

1011
01:04:44,200 --> 01:04:50,120
or because they're kind of different.

1012
01:04:50,120 --> 01:04:57,000
Wouldn't a more granular focus?

1013
01:04:57,000 --> 01:04:59,320
I mean, yeah, you want to have an overall number, I guess,

1014
01:04:59,320 --> 01:05:02,360
the farmers, because they're going to be producing it.

1015
01:05:02,360 --> 01:05:07,200
But wouldn't a more granular take on it help?

1016
01:05:07,200 --> 01:05:09,200
Yes, 100%.

1017
01:05:09,200 --> 01:05:11,680
And that sort of brings us back to sort of this data

1018
01:05:11,680 --> 01:05:14,480
deluge, where I think you're 100% right.

1019
01:05:14,480 --> 01:05:15,480
I think you're right.

1020
01:05:15,480 --> 01:05:16,480
I think you're right.

1021
01:05:16,480 --> 01:05:20,440
I think it would be good to have an estimate of adult use

1022
01:05:20,440 --> 01:05:23,800
and medicinal because I think they're different things.

1023
01:05:23,800 --> 01:05:28,600
I was primarily doing this just to compare to CNBC's number

1024
01:05:28,600 --> 01:05:34,920
because they said that was their estimate of total legal sales.

1025
01:05:34,920 --> 01:05:38,920
So medicinal sales, they're still legal.

1026
01:05:38,920 --> 01:05:42,440
So I was including them.

1027
01:05:42,440 --> 01:05:44,920
And it's really important to have

1028
01:05:44,920 --> 01:05:49,600
a set of estimates for states with medicinal use.

1029
01:05:49,600 --> 01:05:53,680
And it's really tough to distinguish these, right?

1030
01:05:53,680 --> 01:05:57,680
Because, for example, Oklahoma is a medicinal state,

1031
01:05:57,680 --> 01:06:03,160
but it's fairly permissive, whereas other states may only

1032
01:06:03,160 --> 01:06:10,280
permit, they may have heavily restricted medicinal programs.

1033
01:06:10,280 --> 01:06:14,880
So that's why I was saying a lot of these estimates

1034
01:06:14,880 --> 01:06:17,520
because I don't know how much faith I would put in those

1035
01:06:17,520 --> 01:06:22,320
forecasts because I think there's

1036
01:06:22,320 --> 01:06:30,400
big discrepancies in how the medicinal programs behave

1037
01:06:30,400 --> 01:06:31,800
from state to state.

1038
01:06:31,800 --> 01:06:36,400
So if there's a way to capture that in a variable,

1039
01:06:36,400 --> 01:06:39,800
then we should all brainstorm and we could probably

1040
01:06:39,800 --> 01:06:40,600
improve upon the model.

1041
01:06:40,600 --> 01:06:42,800
But I'll just leave you with this.

1042
01:06:42,800 --> 01:06:44,480
This is a starting point.

1043
01:06:44,480 --> 01:06:45,960
It's crude.

1044
01:06:45,960 --> 01:06:48,280
It's not set in stone.

1045
01:06:48,280 --> 01:06:50,200
It's imperfect.

1046
01:06:50,200 --> 01:06:51,880
It's just a starting point.

1047
01:06:51,880 --> 01:06:54,200
So I'm just putting that out there.

1048
01:06:54,200 --> 01:06:57,000
Just this is what we have.

1049
01:06:57,000 --> 01:06:59,560
So I think it's better than nothing.

1050
01:06:59,560 --> 01:07:03,160
I think it can definitely get a whole lot better.

1051
01:07:03,160 --> 01:07:06,600
So definitely hedging it.

1052
01:07:06,600 --> 01:07:13,800
So now you saw the process and you can see, OK,

1053
01:07:13,800 --> 01:07:16,200
there's a lot of assumptions built into this.

1054
01:07:16,200 --> 01:07:18,360
The model is imperfect.

1055
01:07:18,360 --> 01:07:21,560
The data is imperfect.

1056
01:07:21,560 --> 01:07:24,400
But it's all transparent.

1057
01:07:24,400 --> 01:07:26,840
I want to say that there's a lot to get your teeth into with

1058
01:07:26,840 --> 01:07:27,840
this.

1059
01:07:27,840 --> 01:07:28,840
Yes.

1060
01:07:28,840 --> 01:07:32,680
And so instead of just seeing, oh, $31 billion

1061
01:07:32,680 --> 01:07:35,440
and just having to take that at the expense of the state,

1062
01:07:35,440 --> 01:07:38,080
just having to take that at face value,

1063
01:07:38,080 --> 01:07:42,800
you can dive into all the different layers that

1064
01:07:42,800 --> 01:07:43,880
are involved.

1065
01:07:43,880 --> 01:07:45,560
You've got to have a baseline somewhere.

1066
01:07:45,560 --> 01:07:48,160
So here we are.

1067
01:07:48,160 --> 01:07:49,120
Exactly.

1068
01:07:49,120 --> 01:07:51,960
And the cool thing about forecasting

1069
01:07:51,960 --> 01:07:56,440
is we can check it, if we can get the data, that is.

1070
01:07:56,440 --> 01:08:03,200
So as 2022 unfolds, we can see how right or wrong we were

1071
01:08:03,200 --> 01:08:05,400
and adjust our model.

1072
01:08:05,400 --> 01:08:07,160
And that's part of the process.

1073
01:08:07,160 --> 01:08:10,520
So we've got to learn from our mistakes.

1074
01:08:10,520 --> 01:08:13,600
And inevitably, we've made some mistakes.

1075
01:08:13,600 --> 01:08:17,560
So the model can be improved upon.

1076
01:08:17,560 --> 01:08:20,400
And we can keep at it.

1077
01:08:20,400 --> 01:08:25,120
So for example, as January comes, some of these states

1078
01:08:25,120 --> 01:08:27,360
are coming online.

1079
01:08:27,360 --> 01:08:39,280
What was the state that just permitted here?

1080
01:08:39,280 --> 01:08:40,680
Delaware.

1081
01:08:40,680 --> 01:08:43,080
Yeah, Delaware and, oh, yes, Montana

1082
01:08:43,080 --> 01:08:45,600
just permitted adult use.

1083
01:08:45,600 --> 01:08:49,440
So as we see these various states unfolding

1084
01:08:49,440 --> 01:08:52,120
and collect more data, then each month

1085
01:08:52,120 --> 01:08:56,000
we can revise our forecasts for each month ahead.

1086
01:08:56,000 --> 01:08:58,560
So we can measure ourselves as we go

1087
01:08:58,560 --> 01:09:02,480
and then make better and better forecasts for the future.

1088
01:09:02,480 --> 01:09:06,720
So it's the best we can do.

1089
01:09:06,720 --> 01:09:09,160
I'd just like to thank you all for coming.

1090
01:09:09,160 --> 01:09:11,800
Thank you all for staying a little extra.

1091
01:09:11,800 --> 01:09:13,240
I know your time's precious.

1092
01:09:13,240 --> 01:09:16,960
So I'll go ahead and wrap it up here.

1093
01:09:16,960 --> 01:09:18,440
Thank you all for coming, though.

1094
01:09:18,440 --> 01:09:18,960
Thank you.

1095
01:09:18,960 --> 01:09:19,760
See you.

1096
01:09:19,760 --> 01:09:20,360
It was great.

1097
01:09:20,360 --> 01:09:21,080
Thank you.

1098
01:09:21,080 --> 01:09:22,040
Awesome.

1099
01:09:22,040 --> 01:09:23,360
Have a productive week.

1100
01:09:23,360 --> 01:09:26,280
Keep your noses to the grindstone, and I'll see you next week.

1101
01:09:26,280 --> 01:09:53,920
Thank you.

