1
00:00:00,000 --> 00:00:14,520
Alright, so let's begin.

2
00:00:14,520 --> 00:00:17,540
So just a little bit of background here, Alice.

3
00:00:17,540 --> 00:00:20,220
So I'll put in the chat.

4
00:00:20,220 --> 00:00:27,340
So we've been working with some of the Washington State data.

5
00:00:27,340 --> 00:00:36,720
Specifically, you can make public records requests and actually get the traceability

6
00:00:36,720 --> 00:00:38,400
data dumps.

7
00:00:38,400 --> 00:00:52,000
So for example, let me find the specific link and I will put it in the chat.

8
00:00:52,000 --> 00:01:05,400
So here you can access a snapshot of what the data looked like in December.

9
00:01:05,400 --> 00:01:10,400
So when was your group created and what's the goal of the group?

10
00:01:10,400 --> 00:01:17,640
So the group is just to sort of share what we're working on.

11
00:01:17,640 --> 00:01:23,920
So all of us are sort of working tangentially with the cannabis industry.

12
00:01:23,920 --> 00:01:29,880
And so we're trying to do some analysis to just get a better understanding of the industry

13
00:01:29,880 --> 00:01:32,160
and of particular markets.

14
00:01:32,160 --> 00:01:40,460
So it's a group to bring together like-minded people that want to talk about cannabis data.

15
00:01:40,460 --> 00:01:44,000
And so it's actually a pretty open-ended group.

16
00:01:44,000 --> 00:01:48,640
So at the moment, I've just been sharing some of the work I've been doing.

17
00:01:48,640 --> 00:01:56,000
Charles has been doing a lot of data cleaning on some of the data we've been able to come

18
00:01:56,000 --> 00:01:57,400
across.

19
00:01:57,400 --> 00:02:03,040
And then some people have come in that were interested in APIs.

20
00:02:03,040 --> 00:02:10,800
And today I was going to share with you some data that I was looking at in Colorado a few

21
00:02:10,800 --> 00:02:16,520
years ago and see if anyone's interested in that data.

22
00:02:16,520 --> 00:02:21,920
And if so, then it may be worthwhile updating it with the most recent data.

23
00:02:21,920 --> 00:02:25,200
And are you both involved in the cannabis industry?

24
00:02:25,200 --> 00:02:26,200
Yes.

25
00:02:26,200 --> 00:02:35,000
Well, I can't speak for Charles, but I've started a company, Cannlytics, that provides

26
00:02:35,000 --> 00:02:36,480
cannabis analytics.

27
00:02:36,480 --> 00:02:40,920
Primarily, we focus on providing solutions to laboratories.

28
00:02:40,920 --> 00:02:47,400
So we're providing an end-to-end solution for laboratories that need a laboratory information

29
00:02:47,400 --> 00:02:48,680
management system.

30
00:02:48,680 --> 00:02:55,720
And we also provide cannabis analytics to producers, retailers, and consumers.

31
00:02:55,720 --> 00:03:03,720
And then just part-time, I do some data scientist work for a company in San Francisco that does

32
00:03:03,720 --> 00:03:06,840
supply chain management in the cannabis industry.

33
00:03:06,840 --> 00:03:07,840
OK.

34
00:03:07,840 --> 00:03:11,120
And what about you, Charles?

35
00:03:11,120 --> 00:03:13,000
I'm just interested in data science.

36
00:03:13,000 --> 00:03:17,280
And this is a unique data set that not a lot of people have studied.

37
00:03:17,280 --> 00:03:23,360
And so I'm interested in looking at it and seeing what I can find.

38
00:03:23,360 --> 00:03:24,360
Awesome.

39
00:03:24,360 --> 00:03:29,840
Well, speaking of which, let me go ahead and...

40
00:03:29,840 --> 00:03:50,640
Let me share with you some more resources.

41
00:03:50,640 --> 00:03:58,600
So I guess I'll just go ahead and start sharing my screen, and we can kick it off for today.

42
00:03:58,600 --> 00:04:01,680
All right.

43
00:04:01,680 --> 00:04:07,080
All right.

44
00:04:07,080 --> 00:04:20,120
So some of the data sources that I've been looking at were...

45
00:04:20,120 --> 00:04:31,640
A while back, I had started writing just this paper, and I was looking at wages.

46
00:04:31,640 --> 00:04:48,560
And I recently saw this post by Leafly.

47
00:04:48,560 --> 00:05:14,320
And I'll also put this link in the chat.

48
00:05:14,320 --> 00:05:18,480
So long story short, it's a post by Leafly.

49
00:05:18,480 --> 00:05:27,160
And they're talking about how there's a large number of jobs in the cannabis industry.

50
00:05:27,160 --> 00:05:29,320
It's a growing field.

51
00:05:29,320 --> 00:05:35,120
And that was an area that I'd looked at previously.

52
00:05:35,120 --> 00:05:41,560
So I thought it would be worthwhile to do some analysis.

53
00:05:41,560 --> 00:05:49,720
So the first thing we need is just a nice source of data.

54
00:05:49,720 --> 00:05:58,400
And for no particular reason, I just chose Colorado.

55
00:05:58,400 --> 00:06:04,600
Their cannabis industry is fairly well established, and they have a fairly good source of public

56
00:06:04,600 --> 00:06:06,160
data.

57
00:06:06,160 --> 00:06:19,280
So I'll put these links in the chat here, but there are some...

58
00:06:19,280 --> 00:06:34,920
There's some good data about the actual number of licensees in Colorado.

59
00:06:34,920 --> 00:06:37,840
It's been legal there for quite a while.

60
00:06:37,840 --> 00:06:38,840
Right?

61
00:06:38,840 --> 00:06:39,840
Exactly.

62
00:06:39,840 --> 00:06:42,480
So they've had medicinal...

63
00:06:42,480 --> 00:06:46,600
I don't want to say how long, but probably more than 10 years.

64
00:06:46,600 --> 00:06:56,080
And then the recreational dates back to really the start of 2014.

65
00:06:56,080 --> 00:06:58,160
So there's more data there.

66
00:06:58,160 --> 00:06:59,160
Exactly.

67
00:06:59,160 --> 00:07:14,120
So just taking a quick look at some of this data, you can actually go to the API docs.

68
00:07:14,120 --> 00:07:24,520
And so this is useful because you can interact with this data in your favorite programming

69
00:07:24,520 --> 00:07:25,520
language.

70
00:07:25,520 --> 00:07:31,440
So they even have all these code snippets down here at the bottom.

71
00:07:31,440 --> 00:07:33,720
I just tend to use Python.

72
00:07:33,720 --> 00:07:41,160
So I'm just going to show you a quick example of how to hit this endpoint

73
00:07:41,160 --> 00:07:46,160
in Spyder just to show you how to get some licensee data.

74
00:07:46,160 --> 00:07:49,760
So it's fairly simple.

75
00:07:49,760 --> 00:07:59,960
I'll be using Socrata's development tools.

76
00:07:59,960 --> 00:08:04,560
And then I'm just going to use an unauthenticated client.

77
00:08:04,560 --> 00:08:11,340
However, you can get an API key for authentication.

78
00:08:11,340 --> 00:08:22,760
That way, your requests don't get throttled.

79
00:08:22,760 --> 00:08:33,720
And then the way you really work with the endpoints is there's a unique ID for a given

80
00:08:33,720 --> 00:08:35,400
data set.

81
00:08:35,400 --> 00:08:43,360
So for example, when we're looking here at this data set, you'll see this ID right there

82
00:08:43,360 --> 00:08:47,520
at the end of the URL.

83
00:08:47,520 --> 00:08:56,680
And then likewise, the same ID.

84
00:08:56,680 --> 00:09:01,760
And then you can just use standard requests.

85
00:09:01,760 --> 00:09:09,440
But in this case, I'll just use the client, and I'm just going to get maybe just 10 observations

86
00:09:09,440 --> 00:09:10,440
here.

87
00:09:10,440 --> 00:09:15,080
So you can make a request.

88
00:09:15,080 --> 00:09:25,520
I'm not exactly sure what type of response object this is.

89
00:09:25,520 --> 00:09:41,160
But we can convert it to a Pandas data frame.

90
00:09:41,160 --> 00:09:58,080
And so then you can look at observations.
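A minimal sketch of the request described above, using only the Python standard library rather than the Socrata client shown on screen. The domain `data.colorado.gov` and the dataset ID `xxxx-xxxx` are placeholders assumed for illustration; substitute the ID found at the end of the dataset's URL. The helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_soda_url(domain, dataset_id, limit=10):
    """Build a Socrata (SODA) resource URL that returns `limit` rows as JSON."""
    query = urllib.parse.urlencode({"$limit": limit})
    return f"https://{domain}/resource/{dataset_id}.json?{query}"


def fetch_rows(url, app_token=None):
    """Fetch rows as a list of dicts; an app token is optional but helps avoid throttling."""
    headers = {"X-App-Token": app_token} if app_token else {}
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Placeholder values: replace with the real portal domain and dataset ID.
url = build_soda_url("data.colorado.gov", "xxxx-xxxx", limit=10)
print(url)
```

Calling `fetch_rows(url)` with a real dataset ID should return a list of dictionaries, one per observation, which can then be passed to `pandas.DataFrame(...)` if pandas is installed.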

91
00:09:58,080 --> 00:10:04,680
So I didn't actually save this data to the GitHub repository, because it does have people's

92
00:10:04,680 --> 00:10:05,880
actual names.

93
00:10:05,880 --> 00:10:18,760
So I guess that's public information, but it's of no interest to me.

94
00:10:18,760 --> 00:10:31,240
I essentially just went and did a count of the number of licensees by county.

95
00:10:31,240 --> 00:10:33,600
I think you get people's city.

96
00:10:33,600 --> 00:10:37,560
So from the city, you can deduce the county.

97
00:10:37,560 --> 00:10:45,360
And then you can just get a nice county count.
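A rough sketch of the county count described above. The city-to-county lookup and the sample rows are made up for illustration; a real lookup would need to cover every city in the data, and the `city` field name is an assumption.

```python
from collections import Counter

# Hypothetical city-to-county lookup; a real one would cover every Colorado city.
CITY_TO_COUNTY = {
    "DENVER": "Denver",
    "BOULDER": "Boulder",
    "LONGMONT": "Boulder",
    "PUEBLO": "Pueblo",
}


def count_by_county(records, city_key="city"):
    """Count licensees per county, grouping unmapped cities under 'Unknown'."""
    counts = Counter()
    for record in records:
        city = str(record.get(city_key, "")).strip().upper()
        counts[CITY_TO_COUNTY.get(city, "Unknown")] += 1
    return counts


# Made-up sample rows shaped like an API response (a list of dicts).
sample = [
    {"city": "Denver"},
    {"city": "Boulder"},
    {"city": "longmont "},
    {"city": "Aspen"},
]
county_counts = count_by_county(sample)
print(county_counts.most_common())
```

Normalizing the city string before the lookup matters because free-text city fields tend to vary in case and spacing.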

98
00:10:45,360 --> 00:10:48,200
There's other fields here that may be of interest.

99
00:10:48,200 --> 00:10:52,840
Not many people fill them out, but there are degrees.

100
00:10:52,840 --> 00:11:12,840
So some people will list their degrees and their

101
00:11:12,840 --> 00:11:14,500
specialties.

102
00:11:14,500 --> 00:11:34,960
So that could be an interesting analysis to look at what specialties and degrees people

103
00:11:34,960 --> 00:11:39,920
in the cannabis industry tend to fall into.

104
00:11:39,920 --> 00:11:48,200
Because there's people from all walks of life in the cannabis industry.

105
00:11:48,200 --> 00:11:56,480
So just to get back on track here, just to show you the actual work I was doing here.

106
00:11:56,480 --> 00:12:07,280
So I'll actually show you the data.

107
00:12:07,280 --> 00:12:10,560
So I did recently...

108
00:12:10,560 --> 00:12:24,240
Sorry, can I ask you in this data set, so professional and occupational license, are

109
00:12:24,240 --> 00:12:28,080
we talking about people who work for a certain business or are we talking about the businesses

110
00:12:28,080 --> 00:12:29,080
themselves?

111
00:12:29,080 --> 00:12:32,680
I'm looking at the data from right now.

112
00:12:32,680 --> 00:12:40,080
So in Colorado, you have to have a license to work in the cannabis industry.

113
00:12:40,080 --> 00:12:43,720
So these are employee specific licenses.

114
00:12:43,720 --> 00:12:49,600
So I think they're tied to a specific person.

115
00:12:49,600 --> 00:12:56,040
So yeah, is there somewhere where it says the company that they work in, if they are

116
00:12:56,040 --> 00:12:57,400
working in the company?

117
00:12:57,400 --> 00:13:03,040
I was looking at that and I couldn't find that because that would be...

118
00:13:03,040 --> 00:13:18,280
Like you kind of hit the nail on the head, that would be an incredibly interesting breakdown

119
00:13:18,280 --> 00:13:22,320
to look at firm size.

120
00:13:22,320 --> 00:13:23,320
Right?

121
00:13:23,320 --> 00:13:24,320
Yeah.

122
00:13:24,320 --> 00:13:30,240
I mean, in the Leafly graph that you had, the infographics, it was...

123
00:13:30,240 --> 00:13:33,480
Or maybe in one of your slides, you say you have...

124
00:13:33,480 --> 00:13:36,680
The company has between one and 10 employees or...

125
00:13:36,680 --> 00:13:37,920
Oh, yes.

126
00:13:37,920 --> 00:13:43,160
So now I was going to kind of show you some of the...

127
00:13:43,160 --> 00:13:46,760
This is some of the research that I was doing.

128
00:13:46,760 --> 00:13:52,920
So we're here talking about data science applied to the cannabis industry.

129
00:13:52,920 --> 00:14:03,560
And so the way you can really capitalize and make interesting observations is to combine

130
00:14:03,560 --> 00:14:05,360
data sets.

131
00:14:05,360 --> 00:14:12,280
So you can get pretty rich data out of one data set.

132
00:14:12,280 --> 00:14:18,600
But now if we start to combine data sets, we may have to make some approximations and

133
00:14:18,600 --> 00:14:25,160
estimations, but we can start to get some insights here.

134
00:14:25,160 --> 00:14:35,440
So this was a survey that was done in 2014.

135
00:14:35,440 --> 00:14:37,720
So this is going to be dated.

136
00:14:37,720 --> 00:14:43,200
And plus, I did this research in 2017.

137
00:14:43,200 --> 00:14:50,160
So the things I'm sharing with you are almost four years old.

138
00:14:50,160 --> 00:14:52,240
And so it would be worthwhile...

139
00:14:52,240 --> 00:14:59,040
And that's why I'm kind of working on them is because these are things I did four years

140
00:14:59,040 --> 00:15:06,040
ago and it may be worthwhile to touch them up with current data.

141
00:15:06,040 --> 00:15:11,120
So that's why I'm sharing it with you in case any of you want to work on this type of stuff

142
00:15:11,120 --> 00:15:12,120
as well.

143
00:15:12,120 --> 00:15:23,680
But long story short here, I found this work and wellbeing survey and I'll have to share

144
00:15:23,680 --> 00:15:27,880
this link after the presentation.

145
00:15:27,880 --> 00:15:35,880
But they found this breakdown where you'd see most company...

146
00:15:35,880 --> 00:15:42,320
The majority of companies have fewer than 30 employees, so you would really consider

147
00:15:42,320 --> 00:15:45,120
that a small business.

148
00:15:45,120 --> 00:15:57,520
And then there are a handful of large companies with greater than 50 employees.

149
00:15:57,520 --> 00:16:08,560
But this is probably not the typical breakdown that you would see in a typical industry.

150
00:16:08,560 --> 00:16:11,480
Was somebody going to say something?

151
00:16:11,480 --> 00:16:20,880
But I'm just conjecturing, but I would imagine that in other industries, you may see employees

152
00:16:20,880 --> 00:16:24,000
typically working for large companies.

153
00:16:24,000 --> 00:16:26,120
But that's just a conjecture.

154
00:16:26,120 --> 00:16:31,520
I'd actually have to look at the data.

155
00:16:31,520 --> 00:16:39,240
So just to kind of keep moving on here with some of these other stats that I was breaking

156
00:16:39,240 --> 00:16:40,240
down.

157
00:16:40,240 --> 00:16:48,480
And before I get too far along here, I was just going to basically show you that if you

158
00:16:48,480 --> 00:16:56,400
go here to the GitHub, I've got all these stats.

159
00:16:56,400 --> 00:17:07,040
So here I'll put this link in the chat.

160
00:17:07,040 --> 00:17:14,740
And just to give you an idea of what this data looks like.

161
00:17:14,740 --> 00:17:25,820
This is old data, and so one of the subject matters of the day is refactoring, which is

162
00:17:25,820 --> 00:17:35,120
taking old code or not the best written code and rewriting it, cleaning it up, making it

163
00:17:35,120 --> 00:17:36,120
better.

164
00:17:36,120 --> 00:17:41,080
And in fact, that's often how you wind up with really, really good code, is by just

165
00:17:41,080 --> 00:17:43,840
cleaning up old code.

166
00:17:43,840 --> 00:17:51,440
So here we've got this mess of work that I did back in 2017.

167
00:17:51,440 --> 00:18:00,600
And so the idea is maybe it's time to refactor this, take this code, clean it up, get the

168
00:18:00,600 --> 00:18:08,200
data up to 2021, and it could be quite interesting.

169
00:18:08,200 --> 00:18:13,520
Long story short, we've got some cool data points here.

170
00:18:13,520 --> 00:18:23,360
So remember I was telling you, well, not that I was telling you, you can find these occupational

171
00:18:23,360 --> 00:18:35,000
licenses, this license data here.

172
00:18:35,000 --> 00:18:37,840
But you just have people's cities.

173
00:18:37,840 --> 00:18:51,240
So you have to actually do a bit of legwork to count the total number of employees in

174
00:18:51,240 --> 00:18:52,720
different places.

175
00:18:52,720 --> 00:18:59,520
And in fact, you may even have to go to the archives to get the actual month by month.

176
00:18:59,520 --> 00:19:05,280
So I'll start sharing some of these resources as I populate this to the recent date.

177
00:19:05,280 --> 00:19:13,600
But I'll have to share the links afterwards.

178
00:19:13,600 --> 00:19:21,520
But I've gone ahead here and month by month just created totals.

179
00:19:21,520 --> 00:19:29,120
So I've got total occupational licenses.

180
00:19:29,120 --> 00:19:31,880
There's a bunch of real cool data points.

181
00:19:31,880 --> 00:19:36,760
I probably wouldn't have formatted the keys like this today.

182
00:19:36,760 --> 00:19:40,240
Like snake case would probably be preferred.

183
00:19:40,240 --> 00:19:44,480
But this is what the data is.

184
00:19:44,480 --> 00:19:50,520
And so you've got all the licensees here.

185
00:19:50,520 --> 00:19:53,760
You have all the sales.

186
00:19:53,760 --> 00:20:01,800
And then you have some of these fields that I started to calculate, which are things like

187
00:20:01,800 --> 00:20:12,760
this is sort of occupational licenses per business license.

188
00:20:12,760 --> 00:20:23,760
So we don't know how many employees actually work at each particular business, which would

189
00:20:23,760 --> 00:20:30,600
be awesome because then we could get some really granular stats.

190
00:20:30,600 --> 00:20:37,720
But we can still just see the average number of licensees per license.

191
00:20:37,720 --> 00:20:46,720
And so what you see if you look at that number is that, in the first two years, it

192
00:20:46,720 --> 00:20:48,120
starts to rise.

193
00:20:48,120 --> 00:20:53,680
So that's an indication that you're moving from small to large businesses.

194
00:20:53,680 --> 00:21:02,720
But still, you know, eleven employees per license is quite small.
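As a rough sketch of the ratio described above, the average number of licensees per license is just occupational licenses divided by business licenses for each month. The monthly totals below are made up for illustration and are not the Colorado figures.

```python
def licensees_per_license(occupational_licenses, business_licenses):
    """Average number of occupational licensees per business license."""
    if business_licenses <= 0:
        raise ValueError("business_licenses must be positive")
    return occupational_licenses / business_licenses


# Made-up monthly totals: (month, occupational licenses, business licenses).
monthly_totals = [
    ("2014-01", 9000, 1500),
    ("2015-01", 16500, 2200),
    ("2016-01", 27500, 2500),
]
ratios = {m: licensees_per_license(o, b) for m, o, b in monthly_totals}
for month, ratio in ratios.items():
    print(month, round(ratio, 1))
```

A rising ratio over time would be consistent with the move from smaller to larger businesses mentioned above, although it only measures an average and says nothing about the size of any particular firm.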

195
00:21:02,720 --> 00:21:12,000
But OK, so we've got some good data here.

196
00:21:12,000 --> 00:21:19,480
You know, so now remember rule number one about data.

197
00:21:19,480 --> 00:21:21,560
Look at the data.

198
00:21:21,560 --> 00:21:28,920
So here is the growth of occupational licenses.

199
00:21:28,920 --> 00:21:34,200
So this is just how fast licenses are growing.

200
00:21:34,200 --> 00:21:45,200
And so as you would expect when Colorado first started in 2014, it looks like.

201
00:21:45,200 --> 00:21:52,040
Of course, there are a lot of people entering, so a high monthly growth rate of licenses, and

202
00:21:52,040 --> 00:21:56,440
then that gradually diminishes.
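A minimal sketch of the monthly growth rate calculation, assuming a simple list of month-end license totals. The totals are made up to show a growth rate that starts high and then tapers off.

```python
def monthly_growth_rates(totals):
    """Percent change from each month to the next."""
    return [
        (current - previous) / previous * 100
        for previous, current in zip(totals, totals[1:])
    ]


# Made-up month-end totals of occupational licenses.
license_totals = [1000, 1200, 1380, 1518]
growth = monthly_growth_rates(license_totals)
print([round(g, 1) for g in growth])
```

The result has one fewer entry than the input, because the first month has no prior month to compare against.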

203
00:21:56,440 --> 00:22:05,480
And so it would be really interesting to plot this out.

204
00:22:05,480 --> 00:22:13,040
You know, it would be real interesting to plot this out and see what's happening all

205
00:22:13,040 --> 00:22:18,240
the way up to, you know, 2021.

206
00:22:18,240 --> 00:22:22,640
So that is sort of

207
00:22:22,640 --> 00:22:26,880
the objective at hand.

208
00:22:26,880 --> 00:22:33,400
OK, and so.

209
00:22:33,400 --> 00:22:36,320
I was just going to show you a little bit of economics.

210
00:22:36,320 --> 00:22:40,840
I was sort of meaning to boil it down a little better.

211
00:22:40,840 --> 00:22:43,440
But.

212
00:22:43,440 --> 00:22:47,400
I'll touch on it, so don't let this scare you off or anything.

213
00:22:47,400 --> 00:22:53,080
But long story short is I was sort of going to do a little analysis here.

214
00:22:53,080 --> 00:22:57,280
And so.

215
00:22:57,280 --> 00:23:04,560
The way economic analysis really starts is you basically assume that.

216
00:23:04,560 --> 00:23:09,160
Things are produced given some sort of function.

217
00:23:09,160 --> 00:23:14,240
So right now, I'm basically saying, OK, so what's being produced?

218
00:23:14,240 --> 00:23:17,800
It's basically just cannabis flower.

219
00:23:17,800 --> 00:23:19,400
Right.

220
00:23:19,400 --> 00:23:26,040
So that's, you know, cannabis flower.

221
00:23:26,040 --> 00:23:28,840
And you know, it's dependent on something.

222
00:23:28,840 --> 00:23:29,840
Right.

223
00:23:29,840 --> 00:23:35,280
So I would just say, OK, well, that depends on the number of plants.

224
00:23:35,280 --> 00:23:44,920
Right.

225
00:23:44,920 --> 00:23:52,040
So basically we're just saying, you know, production is some sort of

226
00:23:52,040 --> 00:23:56,800
function of the number of plants.

227
00:23:56,800 --> 00:24:05,600
And what's nice is if you take the log of both sides, because you can apply any function

228
00:24:05,600 --> 00:24:15,720
to both sides, then you just get a nice linear function which you can estimate with a regression.

229
00:24:15,720 --> 00:24:28,840
So now you can just do an ordinary least squares regression of the log of flower, or sales,

230
00:24:28,840 --> 00:24:33,480
on the log of plants.
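The log step described above can be written out as follows. The symbol A for the constant term is an assumption, since the slides are not visible in the transcript; alpha is the exponent on plants, as mentioned in the talk.

```latex
% Production as a function of plants only (A is an assumed constant term).
Y_t = A K_t^{\alpha}
% Taking logs of both sides gives an equation that is linear in \log K_t.
\log Y_t = \log A + \alpha \log K_t
```

Here the slope of the regression line is alpha and the intercept is log A.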

231
00:24:33,480 --> 00:24:40,320
And you can, you know, you basically can estimate how.

232
00:24:40,320 --> 00:24:44,920
You know how productive the market is.

233
00:24:44,920 --> 00:24:54,680
And so this alpha is basically a measure of how productive the market is.

234
00:24:54,680 --> 00:25:03,000
If it's really close to zero, you're not going to have a very.

235
00:25:03,000 --> 00:25:05,120
It's going to be a really inefficient market.

236
00:25:05,120 --> 00:25:11,700
You're going to have to grow a ton of plants to just produce a tiny little bit of flower.

237
00:25:11,700 --> 00:25:19,000
But if alpha is close to one, then it's a very efficient market

238
00:25:19,000 --> 00:25:26,400
and you're getting a lot of flower for the number of plants that you're growing.

239
00:25:26,400 --> 00:25:36,120
So that is the simplest, you know, production function.
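A minimal sketch of estimating alpha with an ordinary least squares regression of log flower on log plants, using only the standard library. The data is synthetic, generated from a known alpha of 0.8 so the fit can be checked; it is not the Colorado data, and the function name is hypothetical.

```python
import math


def fit_log_log(plants, flower):
    """OLS of log(flower) on log(plants); returns (log_a, alpha)."""
    x = [math.log(k) for k in plants]
    y = [math.log(v) for v in flower]
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    alpha = sxy / sxx  # slope of the fitted line
    log_a = y_mean - alpha * x_mean  # intercept of the fitted line
    return log_a, alpha


# Synthetic data generated from Y = 2 * K ** 0.8, so the fit should recover alpha near 0.8.
plants = [100, 200, 400, 800, 1600]
flower = [2 * k ** 0.8 for k in plants]
log_a, alpha = fit_log_log(plants, flower)
print(round(alpha, 3), round(math.exp(log_a), 3))
```

With real data the points will not fall exactly on a line, so the estimate of alpha comes with noise that a statistics package would report as a standard error.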

240
00:25:36,120 --> 00:25:43,160
And so now we're going to introduce labor, right, because.

241
00:25:43,160 --> 00:25:46,000
You know, plants don't grow themselves.

242
00:25:46,000 --> 00:25:50,240
So here you just have plants and you have flower.

243
00:25:50,240 --> 00:25:57,080
OK, but now we're going to say, you know, we've got

244
00:25:57,080 --> 00:25:59,880
the same thing here.

245
00:25:59,880 --> 00:26:04,200
We've got our production.

246
00:26:04,200 --> 00:26:07,600
It depends on plants.

247
00:26:07,600 --> 00:26:12,440
But it also depends on labor.

248
00:26:12,440 --> 00:26:17,600
So you need plants or capital.

249
00:26:17,600 --> 00:26:22,400
So you know, K is essentially all physical goods.

250
00:26:22,400 --> 00:26:27,600
So just anything physical required to produce.

251
00:26:27,600 --> 00:26:32,440
And then L is any labor.

252
00:26:32,440 --> 00:26:34,800
So that is your managers.

253
00:26:34,800 --> 00:26:43,120
That is just your, you know, standard cultivators, your operators,

254
00:26:43,120 --> 00:26:49,320
all their time and energy, thought power, all that.

255
00:26:49,320 --> 00:26:55,200
And so, you know, we're basically saying, OK, you know, given that and, you know, the

256
00:26:55,200 --> 00:26:57,360
state and then.

257
00:26:57,360 --> 00:27:02,080
You know, you've got just things that are taken as given.

258
00:27:02,080 --> 00:27:03,080
So.

259
00:27:03,080 --> 00:27:10,280
So this is just, you know, your tech.

260
00:27:10,280 --> 00:27:14,360
You know, so that's just your tech.

261
00:27:14,360 --> 00:27:26,560
You know, and that's essentially your efficiency.

262
00:27:26,560 --> 00:27:29,680
So that's basically your efficiency of capital.

263
00:27:29,680 --> 00:27:38,240
And then, you know, beta is essentially your efficiency of.

264
00:27:38,240 --> 00:27:41,560
Of labor.

265
00:27:41,560 --> 00:27:54,520
So that's just sort of the mental framework, you know.

266
00:27:54,520 --> 00:27:58,720
So that's just the mental framework, sort of.

267
00:27:58,720 --> 00:28:00,800
OK, so.

268
00:28:00,800 --> 00:28:08,760
That's the framework for essentially why we're going to be plotting the data like this.

269
00:28:08,760 --> 00:28:14,680
And so basically, you know, this is three dimensions, right?

270
00:28:14,680 --> 00:28:16,600
So this is.

271
00:28:16,600 --> 00:28:18,600
Y, your production,

272
00:28:18,600 --> 00:28:22,640
K, you know, your capital, and L, your labor.

273
00:28:22,640 --> 00:28:24,720
So this is really three-dimensional.

274
00:28:24,720 --> 00:28:28,800
You can think about that as like a three dimensional dome.

275
00:28:28,800 --> 00:28:35,280
And so this is just two dimensions of that dome.

276
00:28:35,280 --> 00:28:40,560
And so here I've plotted.

277
00:28:40,560 --> 00:28:45,120
Occupational licenses, which would be our.

278
00:28:45,120 --> 00:28:46,720
Labor.

279
00:28:46,720 --> 00:28:52,880
Right, so this is basically, you know, LT down here.

280
00:28:52,880 --> 00:29:05,180
So right down here, we've got L. And then up here, we've got our production.

281
00:29:05,180 --> 00:29:08,880
So you know, as you would kind of expect.

282
00:29:08,880 --> 00:29:11,280
The more labor.

283
00:29:11,280 --> 00:29:12,280
That's in the market.

284
00:29:12,280 --> 00:29:17,680
You know, the more production you're going to have.

285
00:29:17,680 --> 00:29:28,240
And it's just sort of, you know, we're just beginning to sort of formalize a production

286
00:29:28,240 --> 00:29:29,240
function.

287
00:29:29,240 --> 00:29:31,680
OK, and so why?

288
00:29:31,680 --> 00:29:41,160
Well, the reason why is we're sort of trying to measure this curve here, right?

289
00:29:41,160 --> 00:29:42,600
Because.

290
00:29:42,600 --> 00:29:45,920
You know, we want to know.

291
00:29:45,920 --> 00:29:51,680
Because we want to know, OK, does that curve go like that?

292
00:29:51,680 --> 00:29:56,000
Does this curve go like that?

293
00:29:56,000 --> 00:29:57,280
Because.

294
00:29:57,280 --> 00:30:03,840
You know, right here, you would have, you know, highly efficient

295
00:30:03,840 --> 00:30:13,240
labor, and, you know, down here, you'd have really inefficient labor.

296
00:30:13,240 --> 00:30:21,640
You know, essentially what economic theory tells us is, you know, the higher your efficiency

297
00:30:21,640 --> 00:30:30,620
of labor, you know, the higher wage you can expect to get.

298
00:30:30,620 --> 00:30:36,000
And so here's a similar chart with.

299
00:30:36,000 --> 00:30:37,800
The medical industry.

300
00:30:37,800 --> 00:30:45,400
And so as you see here, you've got a real steep

301
00:30:45,400 --> 00:30:50,320
relation between licensees and retail revenue.

302
00:30:50,320 --> 00:30:52,320
And here.

303
00:30:52,320 --> 00:30:55,040
There's not quite a sharp relation.

304
00:30:55,040 --> 00:31:03,840
You've got fairly, fairly flat relation between the number of medical.

305
00:31:03,840 --> 00:31:07,720
Licenses and then medical revenue.

306
00:31:07,720 --> 00:31:14,840
And so all that would really mean is you would expect that the wage rate would actually be

307
00:31:14,840 --> 00:31:22,280
lower in the medical industry than it would be in the recreational industry.

308
00:31:22,280 --> 00:31:27,620
So that would just be a hypothesis that you could make

309
00:31:27,620 --> 00:31:32,000
From looking at this data and, you know, using economic theory.

310
00:31:32,000 --> 00:31:34,920
But, you know, it's nothing more than a hypothesis.

311
00:31:34,920 --> 00:31:41,240
You know, you'd actually have to look at the wage rates in the two

312
00:31:41,240 --> 00:31:45,040
industries.

313
00:31:45,040 --> 00:31:51,240
So and then here's here's just the actual economic theory behind it.

314
00:31:51,240 --> 00:31:58,640
So basically.

315
00:31:58,640 --> 00:32:02,640
You know, economic theory would suggest that.

316
00:32:02,640 --> 00:32:06,800
You know, the competitive wage would be the one where you're basically getting paid, you

317
00:32:06,800 --> 00:32:09,080
know, your marginal product.

318
00:32:09,080 --> 00:32:16,120
So what that says in simplest terms is.

319
00:32:16,120 --> 00:32:21,440
You're going to produce a certain amount of value for your work.

320
00:32:21,440 --> 00:32:29,160
You know, no one would pay you more than the value you generate.

321
00:32:29,160 --> 00:32:37,320
But if they're paying you less than the value you generate in a perfectly competitive market,

322
00:32:37,320 --> 00:32:43,840
somebody else would come along and say, hey, somebody's paying them less than the value

323
00:32:43,840 --> 00:32:51,040
they're producing, so I'm going to, you know, pay them just a little bit more.

324
00:32:51,040 --> 00:32:57,720
And so that would just keep happening until you're essentially paid the exact amount

325
00:32:57,720 --> 00:33:00,440
of value that you're producing.
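The competitive wage argument above can be sketched with the two-factor production function from earlier in the talk. This is a sketch under the assumption that the function takes the form below, with A as an assumed symbol for technology; the derivation is standard calculus rather than something shown in the transcript.

```latex
% Assumed two-factor production function.
Y_t = A K_t^{\alpha} L_t^{\beta}
% The competitive wage equals the marginal product of labor.
w_t = \frac{\partial Y_t}{\partial L_t}
    = \beta A K_t^{\alpha} L_t^{\beta - 1}
    = \beta \frac{Y_t}{L_t}
```

Under these assumptions the benchmark wage is beta times output per worker, so it can be computed from revenue and license counts alone once beta has been estimated.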

326
00:33:00,440 --> 00:33:02,520
So it's all sort of conceptual.

327
00:33:02,520 --> 00:33:09,840
And, you know, you would expect that, you know, in the real world, you wouldn't expect

328
00:33:09,840 --> 00:33:15,600
that anybody would be getting paid their competitive wage just because of market frictions and

329
00:33:15,600 --> 00:33:17,860
all sorts of factors.

330
00:33:17,860 --> 00:33:26,720
So this is an entirely theoretical wage, but it's just a benchmark that we can just use

331
00:33:26,720 --> 00:33:30,280
to, you know, start estimating.

332
00:33:30,280 --> 00:33:31,280
OK.

333
00:33:31,280 --> 00:33:36,960
You know, OK, we don't know people's wages.

334
00:33:36,960 --> 00:33:41,120
All we know is retail revenue and the licenses.

335
00:33:41,120 --> 00:33:50,080
You know, is there any way we can estimate their wages, you know, given that data?

336
00:33:50,080 --> 00:34:01,120
Well, if you go back to our production function here, so here we have Y_t and K_t.

337
00:34:01,120 --> 00:34:17,480
Basically we're just going to add, you know, L_t to the beta right there.

338
00:34:17,480 --> 00:34:23,880
And then that gets log-linearized down here.

339
00:34:23,880 --> 00:34:32,700
And that turns into, you know, beta log L_t.

340
00:34:32,700 --> 00:34:42,080
So we still have our linear equation where we're just going to take a regression of production,

341
00:34:42,080 --> 00:34:44,320
which I'll just call sales.

342
00:34:44,320 --> 00:34:46,040
So that'll just be revenue.

343
00:34:46,040 --> 00:34:52,720
So we're just going to be regressing revenue on some measure of capital, which I'm calling

344
00:34:52,720 --> 00:35:02,480
plants and then on labor, the log of labor, which I'm calling total occupational licenses.
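Written out, the regression described above would look like the following. The symbol A for the constant and the error term epsilon are assumptions added for completeness; Y is revenue, K is plants, and L is total occupational licenses, as stated in the talk.

```latex
% Production with capital and labor.
Y_t = A K_t^{\alpha} L_t^{\beta}
% Log-linearized form, estimated by ordinary least squares.
\log Y_t = \log A + \alpha \log K_t + \beta \log L_t + \varepsilon_t
```

The fitted coefficients on log plants and log licenses are the estimates of alpha and beta.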

345
00:35:02,480 --> 00:35:09,280
So you know, it's all just a crude approximation and just, you know, just seeing what we can

346
00:35:09,280 --> 00:35:10,280
make fit.

347
00:35:10,280 --> 00:35:22,160
It's just interesting to just sort of make this visual conceptual model of the industry.

348
00:35:22,160 --> 00:35:33,840
So we're saying, okay, the wage, we can probably estimate it to be the competitive wage.

349
00:35:33,840 --> 00:35:38,120
Also in the, so this is just more data.

350
00:35:38,120 --> 00:35:40,120
This probably should have belonged earlier.

351
00:35:40,120 --> 00:35:46,360
This is just a breakdown of actual positions held.

352
00:35:46,360 --> 00:35:48,520
Okay.

353
00:35:48,520 --> 00:36:00,640
So we've done the nitty gritty.

354
00:36:00,640 --> 00:36:03,600
Now we want to calculate this.

355
00:36:03,600 --> 00:36:04,960
Okay.

356
00:36:04,960 --> 00:36:20,160
So I'm going to real quick show you how that would be done.

357
00:36:20,160 --> 00:36:34,080
So you're going to import this data here where you've got total occupational licenses.

358
00:36:34,080 --> 00:36:37,200
And we also have total revenue.

359
00:36:37,200 --> 00:36:43,400
So that's basically, this is basically going to be our Y and then this is basically going

360
00:36:43,400 --> 00:36:46,400
to be our L.

361
00:36:46,400 --> 00:37:04,320
And I believe we have the number of plants.

362
00:37:04,320 --> 00:37:05,320
I think yes.

363
00:37:05,320 --> 00:37:09,360
So here I should have average cultivated plants.

364
00:37:09,360 --> 00:37:14,800
So let's, you know, read in that data.

365
00:37:14,800 --> 00:37:19,200
Hopefully everything's there.

366
00:37:19,200 --> 00:37:21,920
Number one rule about data.

367
00:37:21,920 --> 00:37:23,280
Look at the data.

368
00:37:23,280 --> 00:37:27,760
So this is total revenue.

369
00:37:27,760 --> 00:37:33,600
Oops.

370
00:37:33,600 --> 00:37:51,760
Here we have total plants, which will basically be our KT.

371
00:37:51,760 --> 00:37:54,320
So we've got our Y_t, K_t, and then our L_t.

372
00:37:54,320 --> 00:37:57,920
And so we've got our total labor.

373
00:37:57,920 --> 00:38:03,400
Also and this is what I was wanting to talk about, refactoring.

374
00:38:03,400 --> 00:38:10,080
This is old, bad code that needs to be refactored.

375
00:38:10,080 --> 00:38:15,680
So I wouldn't really actually recommend labeling your variables like this.

376
00:38:15,680 --> 00:38:22,160
So like for Python, you know, you should probably go, you know, total_labor.

377
00:38:22,160 --> 00:38:24,960
So this code needs to be refactored.

378
00:38:24,960 --> 00:38:28,080
So that is the name.

379
00:38:28,080 --> 00:38:30,360
That is the buzzword of the day.

380
00:38:30,360 --> 00:38:37,280
And so this code needs to be refactored.
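As a rough illustration of the renaming being described, here is a minimal sketch. The class name, field names, and numbers are hypothetical, not the speaker's actual code; the idea is just that descriptive snake_case names carry the meaning that single letters like Y, K, and L leave implicit.

```python
from dataclasses import dataclass


@dataclass
class MonthlyObservation:
    """One month of industry totals, named for what they measure."""

    total_revenue: float  # Y_t in the model
    total_plants: float   # K_t in the model
    total_labor: float    # L_t in the model (occupational licenses)

    def revenue_per_worker(self) -> float:
        """Revenue divided by the labor measure for this month."""
        return self.total_revenue / self.total_labor


# Hypothetical values, for illustration only.
observation = MonthlyObservation(
    total_revenue=30_000_000.0,
    total_plants=400_000.0,
    total_labor=12_000.0,
)
```

The math symbols stay in the comments, so the link back to the model is not lost.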

381
00:38:37,280 --> 00:38:50,440
Remember earlier I said it's real interesting to supplement your data with external data

382
00:38:50,440 --> 00:38:51,440
sources.

383
00:38:51,440 --> 00:39:03,760
Well, here I'm actually using the Federal Reserve's FRED API.

384
00:39:03,760 --> 00:39:10,960
And so I'm not certain I have this.

385
00:39:10,960 --> 00:39:13,960
Okay.

386
00:39:13,960 --> 00:39:25,680
So we'll need to install the FRED Python API.

387
00:39:25,680 --> 00:39:42,120
And so let's actually just, you know, see if we can't do that real quick.

388
00:39:42,120 --> 00:40:00,360
So I think we can just do...

389
00:40:00,360 --> 00:40:13,960
And so what I'm using here is basically the Federal Reserve has, you know, this would

390
00:40:13,960 --> 00:40:21,000
be the go to place if you just need monthly, yearly, you know, potentially even more frequently,

391
00:40:21,000 --> 00:40:26,080
but typically the not so frequent series.

392
00:40:26,080 --> 00:40:33,920
So you know, anything from, you know, so this is where a lot of economists get their data.

393
00:40:33,920 --> 00:40:42,320
Honestly I probably under utilize it because there is a bunch of awesome data here, but

394
00:40:42,320 --> 00:40:49,880
essentially right now pulling the...

395
00:40:49,880 --> 00:40:53,080
Right?

396
00:40:53,080 --> 00:40:55,680
Because we're estimating the wage rate.

397
00:40:55,680 --> 00:41:02,840
So it wouldn't be any fun to just pull the average wage data.

398
00:41:02,840 --> 00:41:10,200
But basically what I'm pulling here is just the...

399
00:41:10,200 --> 00:41:16,280
This is just the average number of hours a week that somebody would work.

400
00:41:16,280 --> 00:41:25,240
Because remember earlier I said, oh, I'm just going to conjecture that they work 40 hours per

401
00:41:25,240 --> 00:41:26,240
week.

402
00:41:26,240 --> 00:41:30,440
But then I started thinking, you know, that's not so realistic.

403
00:41:30,440 --> 00:41:37,280
And so what you can do is you can actually, you know, you can actually get the average

404
00:41:37,280 --> 00:41:39,880
hours worked per week.

405
00:41:39,880 --> 00:41:45,720
So you're just slowly, you know, getting your approximations closer and

406
00:41:45,720 --> 00:41:49,000
closer to realistic.

407
00:41:49,000 --> 00:41:55,400
So as you see, it's not, people aren't working 40 hours a week on average.

408
00:41:55,400 --> 00:42:03,480
It's typically, you know, it's close to, you know, 30, 33, 34 or so.
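A minimal sketch of pulling a weekly-hours series from FRED, using only the standard library rather than the Python wrapper installed in the talk. The series ID `AWHAETP` (average weekly hours of all employees, total private) is an assumption; the talk does not say which series was used. The helper names are hypothetical, and a FRED API key is required to actually fetch anything.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

FRED_OBSERVATIONS_URL = "https://api.stlouisfed.org/fred/series/observations"


def build_fred_url(series_id, api_key, start, end):
    """Build a request URL for one FRED series over a date range."""
    params = {
        "series_id": series_id,
        "api_key": api_key,
        "file_type": "json",
        "observation_start": start,
        "observation_end": end,
    }
    return FRED_OBSERVATIONS_URL + "?" + urlencode(params)


def parse_observations(payload):
    """Turn a decoded FRED JSON payload into {date: value}."""
    series = {}
    for observation in payload["observations"]:
        if observation["value"] != ".":  # FRED marks missing values with "."
            series[observation["date"]] = float(observation["value"])
    return series


def fetch_series(series_id, api_key, start, end):
    """Download and parse one series (needs network access and a valid key)."""
    with urlopen(build_fred_url(series_id, api_key, start, end)) as response:
        return parse_observations(json.load(response))
```

Calling `fetch_series("AWHAETP", your_key, "2014-01-01", "2015-12-31")` would return a dictionary of monthly values that can stand in for the flat 40-hour assumption.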

409
00:42:03,480 --> 00:42:04,480
Okay.

410
00:42:04,480 --> 00:42:17,080
And so this is old code and let's see if we can still get this to work.

411
00:42:17,080 --> 00:42:30,920
So it looks like we need numpy.

412
00:42:30,920 --> 00:42:38,440
And we need statsmodels.

413
00:42:38,440 --> 00:42:58,360
And so essentially what I'm doing here is just getting everything ready for this regression.

414
00:42:58,360 --> 00:43:07,160
I'm just getting the log of y, the log of capital, and the log of labor, as well as

415
00:43:07,160 --> 00:43:14,920
a constant, just to represent technology.
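A sketch of the regression setup described here, assuming a Cobb-Douglas form Y = A K^alpha L^beta, so that log Y = log A + alpha log K + beta log L. The data below is synthetic, generated from known parameters so the fit can be checked; it is not the Colorado data. This version uses NumPy's least-squares solver for the point estimates, where the talk uses statsmodels.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs = 29  # same number of observations as in the talk; values are synthetic

# Synthetic series built from known parameters (assumptions for illustration).
true_log_a, true_alpha, true_beta = 2.0, 0.3, 0.6
total_plants = rng.uniform(2e5, 6e5, n_obs)  # K_t
total_labor = rng.uniform(5e3, 2e4, n_obs)   # L_t
noise = rng.normal(0.0, 0.01, n_obs)
total_revenue = np.exp(
    true_log_a
    + true_alpha * np.log(total_plants)
    + true_beta * np.log(total_labor)
    + noise
)  # Y_t

# Design matrix: a constant (for log A), log K, and log L.
X = np.column_stack([np.ones(n_obs), np.log(total_plants), np.log(total_labor)])
y = np.log(total_revenue)

# Ordinary least squares on the log-linear model.
coefficients, *_ = np.linalg.lstsq(X, y, rcond=None)
log_a_hat, alpha_hat, beta_hat = coefficients
```

With statsmodels, `sm.OLS(y, X).fit()` on the same `X` and `y` should give the same coefficients, along with the standard errors needed later for confidence intervals.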

416
00:43:14,920 --> 00:43:19,240
And this is where we're going to start off-roading a little bit.

417
00:43:19,240 --> 00:43:24,960
I need to probably touch up this code.

418
00:43:24,960 --> 00:43:32,560
So I'm not even going to promise that it's going to work here, but we can at least give

419
00:43:32,560 --> 00:43:39,200
it a shot.

420
00:43:39,200 --> 00:43:43,280
See, it was even written in Python 2.

421
00:43:43,280 --> 00:43:51,800
So we're going to do a regression here.

422
00:43:51,800 --> 00:43:54,520
Okay.

423
00:43:54,520 --> 00:44:01,920
And so remember, we barely even have any data.

424
00:44:01,920 --> 00:44:03,520
Still we have some data.

425
00:44:03,520 --> 00:44:06,440
We can get a lot more data.

426
00:44:06,440 --> 00:44:12,500
So we only have, you know, 29 observations.

427
00:44:12,500 --> 00:44:15,760
So it would be much better to run this through 2021.

428
00:44:15,760 --> 00:44:21,260
And look, we're estimating these exact things.

429
00:44:21,260 --> 00:44:25,680
So we estimated our constant.

430
00:44:25,680 --> 00:44:34,320
So that's, so right, so we just estimated that term.

431
00:44:34,320 --> 00:44:51,880
So we just, we just estimated that, this whole term.

432
00:44:51,880 --> 00:45:07,480
And we estimated x1, which is alpha.

433
00:45:07,480 --> 00:45:20,800
So we write, oops.

434
00:45:20,800 --> 00:45:28,920
So we just estimated alpha.

435
00:45:28,920 --> 00:45:34,960
And we estimated that to be 0.28.

436
00:45:34,960 --> 00:45:51,120
And then we estimated beta to be 0.0684.

437
00:45:51,120 --> 00:46:01,800
And if you look at this, if you add these two coefficients together, well, it's not quite

438
00:46:01,800 --> 00:46:14,160
one, but essentially economic theory would suggest that alpha plus beta would equal one.

439
00:46:14,160 --> 00:46:21,360
So we're obviously leaving some explanation left on the table, right?

440
00:46:21,360 --> 00:46:27,200
There's still about 10% of efficiency that's not really being explained here.

441
00:46:27,200 --> 00:46:37,440
But, you know, we're still explaining, you know, a lot of, you know, alpha and beta,

442
00:46:37,440 --> 00:46:44,840
you know, but they're close to what they should be, you know, theoretically.

443
00:46:44,840 --> 00:46:53,560
And so, so remember, so this is just a measure of efficiency.

444
00:46:53,560 --> 00:46:58,640
So remember, the closer to one it is, the more efficient it is.

445
00:46:58,640 --> 00:47:09,200
So here we've got fairly efficient labor and then capital is not as efficient.

446
00:47:09,200 --> 00:47:17,560
And remember, if we go back here, so we now have beta.

447
00:47:17,560 --> 00:47:29,240
Remember, so beta is, you know, remember beta is like about 0.6 or so.

448
00:47:29,240 --> 00:47:35,120
So now we can actually calculate the competitive wage.

449
00:47:35,120 --> 00:47:43,560
So now at any given point in time, we can say, okay, what is the competitive wage?

450
00:47:43,560 --> 00:47:44,560
Right?

451
00:47:44,560 --> 00:47:57,320
So it's just going to be the total amount of sales divided by the total amount of labor,

452
00:47:57,320 --> 00:48:03,480
you know, times 0.6.
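A sketch of the wage calculation just described. Under the Cobb-Douglas form and perfect competition, the wage equals the marginal product of labor, which works out to beta times Y over L. Converting that monthly figure to an hourly one needs the average weekly hours; the 52/12 weeks-per-month factor, the function name, and the example inputs are assumptions for illustration, not figures from the talk.

```python
WEEKS_PER_MONTH = 52 / 12


def competitive_hourly_wage(total_revenue, total_labor, beta, weekly_hours):
    """Estimate an hourly wage as beta * Y / L, spread over hours worked.

    total_revenue: monthly sales (Y_t)
    total_labor: number of workers, proxied by occupational licenses (L_t)
    beta: estimated labor exponent from the regression
    weekly_hours: average hours worked per week
    """
    monthly_wage = beta * total_revenue / total_labor
    hours_per_month = weekly_hours * WEEKS_PER_MONTH
    return monthly_wage / hours_per_month


# Hypothetical inputs, for illustration only.
example_wage = competitive_hourly_wage(
    total_revenue=30_000_000.0,
    total_labor=12_000.0,
    beta=0.6,
    weekly_hours=34.0,
)
```

Applying the function to each month's totals gives a wage series over time.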

453
00:48:03,480 --> 00:48:26,520
So we now have the competitive wage or an estimate.

454
00:48:26,520 --> 00:48:30,400
So this is, let me really benchmark that there.

455
00:48:30,400 --> 00:48:37,560
This is, you know, essentially, you know, a complete estimate, you know, I mean, keep

456
00:48:37,560 --> 00:48:42,280
in mind all of the assumptions we've made along the way.

457
00:48:42,280 --> 00:48:53,160
So we first assumed that production is a function of plants and labor.

458
00:48:53,160 --> 00:49:03,200
Obviously, production is more than a function of plants, and simply measuring labor as total

459
00:49:03,200 --> 00:49:09,160
occupational licenses is also sort of a big mental stretch.

460
00:49:09,160 --> 00:49:13,720
So we did some mental gymnastics there.

461
00:49:13,720 --> 00:49:20,760
And just in fact, the assumption that this is what the production function looks like,

462
00:49:20,760 --> 00:49:25,760
that's also an enormous assumption.

463
00:49:25,760 --> 00:49:35,320
And then we assumed that everything's perfectly competitive.

464
00:49:35,320 --> 00:49:43,960
And you know, obviously, you know, there's a lot of factors, you know, that come into

465
00:49:43,960 --> 00:49:45,960
play there.

466
00:49:45,960 --> 00:49:53,880
And so the long story short, lots of considerations, but we're estimating, you know, a competitive

467
00:49:53,880 --> 00:50:05,000
wage, you know, is fluctuating somewhere between, you know, $9 and $11 per hour.

468
00:50:05,000 --> 00:50:10,600
And so, you know, like I said, this is an estimate.

469
00:50:10,600 --> 00:50:16,040
And we actually can put confidence intervals on that.

470
00:50:16,040 --> 00:50:31,480
And I think I've done that.

471
00:50:31,480 --> 00:50:34,520
Let's try to plot this all in one.

472
00:50:34,520 --> 00:50:44,640
So hold on, I'll be bringing this home here.

473
00:50:44,640 --> 00:50:51,920
All right.

474
00:50:51,920 --> 00:50:57,960
And so here, you know, you could plot this in a more fanciful way.

475
00:50:57,960 --> 00:51:04,160
But here, we're essentially saying, OK, you know, we think, you know, the real wage, you

476
00:51:04,160 --> 00:51:13,520
know, could be anywhere between, you know, $4 and, you know, $18 or so per hour.

477
00:51:13,520 --> 00:51:21,520
And so, you know, with our best guess at around, you know, $10 an hour.
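One simple way to get a band like this is to carry the uncertainty in beta through the wage formula, treating revenue, labor, and hours as fixed. This is a sketch under that assumption, not necessarily how the plot in the talk was produced. The standard error used below is hypothetical, and z = 1.96 is a normal approximation; with only 29 observations a t critical value would be somewhat wider.

```python
WEEKS_PER_MONTH = 52 / 12


def wage_interval(total_revenue, total_labor, beta, beta_std_error,
                  weekly_hours, z=1.96):
    """Return (lower, point, upper) hourly wage estimates.

    Only the uncertainty in beta is propagated; every other input is
    treated as known exactly.
    """
    hours_per_month = weekly_hours * WEEKS_PER_MONTH
    scale = total_revenue / (total_labor * hours_per_month)
    lower = (beta - z * beta_std_error) * scale
    upper = (beta + z * beta_std_error) * scale
    return lower, beta * scale, upper


# Hypothetical inputs, for illustration only.
lower, point, upper = wage_interval(
    total_revenue=30_000_000.0,
    total_labor=12_000.0,
    beta=0.6,
    beta_std_error=0.2,
    weekly_hours=34.0,
)
```

A wide standard error on beta translates directly into a wide wage band, which is one reason more observations would help.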

478
00:51:21,520 --> 00:51:26,900
And so, you know, this is sort of bringing, you know, bringing it home.

479
00:51:26,900 --> 00:51:39,680
So basically, given, you know, all we were given was this data set here, just the Colorado

480
00:51:39,680 --> 00:51:47,080
data publicly available, you know, as well as, you know, the Fred API.

481
00:51:47,080 --> 00:51:55,560
We were able to create monthly totals.

482
00:51:55,560 --> 00:52:10,360
Then we were able to use economic theory to estimate what a competitive wage may be.

483
00:52:10,360 --> 00:52:15,680
And then we estimated it and then we plotted it.

484
00:52:15,680 --> 00:52:22,600
And now, you know, now people can know that, hey, this is what, you know, this is what

485
00:52:22,600 --> 00:52:26,600
the competitive wage is in the cannabis industry.

486
00:52:26,600 --> 00:52:32,400
And I mean, if you're looking at like a competitive wage of, you know, up to, you know, $18 or

487
00:52:32,400 --> 00:52:45,520
so an hour, well, that may be why there are a lot of legal cannabis jobs, because you

488
00:52:45,520 --> 00:52:51,160
have a high, a high competitive wage.

489
00:52:51,160 --> 00:53:01,520
So, I think that brings us to the end of the presentation.

490
00:53:01,520 --> 00:53:10,120
I kind of wanted to save a little bit more time for questions, but, you know, that's

491
00:53:10,120 --> 00:53:12,480
some of the work I'm doing.

492
00:53:12,480 --> 00:53:21,760
And now the task is to rinse and repeat with all the data up to 2021.

493
00:53:21,760 --> 00:53:25,440
So, thanks.

494
00:53:25,440 --> 00:53:26,440
That was really cool.

495
00:53:26,440 --> 00:53:28,680
But I have another meeting I have to go to.

496
00:53:28,680 --> 00:53:29,680
Okay.

497
00:53:29,680 --> 00:53:30,680
So, I'll see you next week.

498
00:53:30,680 --> 00:53:31,680
Definitely.

499
00:53:31,680 --> 00:53:37,600
But I'll go ahead and conclude here and I'll get this recording up, get some of the material

500
00:53:37,600 --> 00:53:38,920
up.

501
00:53:38,920 --> 00:53:45,360
And I know it was a lot to take in, but it was even a lot for me to take in.

502
00:53:45,360 --> 00:53:48,200
I've sort of got to get refreshed on it.

503
00:53:48,200 --> 00:53:52,160
But, you know, like I said, that's what I'm working on.

504
00:53:52,160 --> 00:53:55,160
It's got a lot of cleaning up to do.

505
00:53:55,160 --> 00:53:59,120
But there's some interesting insights there to be had.

506
00:53:59,120 --> 00:54:00,120
Yeah, definitely.

507
00:54:00,120 --> 00:54:01,120
Thank you.

508
00:54:01,120 --> 00:54:03,120
Thank you for presenting that.

509
00:54:03,120 --> 00:54:05,320
You're very welcome, Alice.

510
00:54:05,320 --> 00:54:06,640
Yeah, there you go.

511
00:54:06,640 --> 00:54:07,880
So, thank you for coming.

512
00:54:07,880 --> 00:54:09,840
I hope you got something out of it.

513
00:54:09,840 --> 00:54:12,640
Do you have a little bit of time for a question?

514
00:54:12,640 --> 00:54:14,840
Definitely, by all means.

515
00:54:14,840 --> 00:54:17,960
If you were in, I don't know anything about economics.

516
00:54:17,960 --> 00:54:20,160
So that was a new piece for me.

517
00:54:20,160 --> 00:54:26,160
But if you were in a market where things are stabilized and you don't have that many more,

518
00:54:26,160 --> 00:54:33,440
like you have more license or less license, like it's very thin variation.

519
00:54:33,440 --> 00:54:36,440
And let's say the revenue is more or less the same.

520
00:54:36,440 --> 00:54:46,080
And in that plot, the reason you were able to plot revenue versus occupational license

521
00:54:46,080 --> 00:54:51,720
and to have this long line is because the market had been growing rapidly, right?

522
00:54:51,720 --> 00:54:52,720
Exactly.

523
00:54:52,720 --> 00:55:04,200
And so if you were in a stable market, you'd have a much smaller plot, per se.

524
00:55:04,200 --> 00:55:08,280
So, exactly, and you hit on something interesting there.

525
00:55:08,280 --> 00:55:17,280
So with statistics, variation is key if you're trying to get some good measurements.
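To make the point about variation concrete: in a simple regression, the standard error of the slope is the residual standard deviation divided by the square root of the summed squared deviations of x. The sketch below compares a hypothetical fast-growing market with a hypothetical stable one; both series and the residual noise level are made up for illustration.

```python
import math


def slope_standard_error(x_values, residual_std):
    """Standard error of an OLS slope in a one-regressor model."""
    mean_x = sum(x_values) / len(x_values)
    sum_squared_deviations = sum((x - mean_x) ** 2 for x in x_values)
    return residual_std / math.sqrt(sum_squared_deviations)


# Hypothetical log-labor series over 24 months.
growing_market = [8.0 + 0.1 * month for month in range(24)]
stable_market = [9.0 + 0.005 * month for month in range(24)]

growing_se = slope_standard_error(growing_market, residual_std=0.05)
stable_se = slope_standard_error(stable_market, residual_std=0.05)
```

With the same noise and the same number of months, the stable market's estimate is much less precise, because there is so little spread in x to learn from.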

526
00:55:17,280 --> 00:55:22,560
So it will be interesting to see.

527
00:55:22,560 --> 00:55:30,000
And that's why it's so worthwhile to extend this out to the current day is, okay, did

528
00:55:30,000 --> 00:55:34,480
the occupational licenses, did that level off?

529
00:55:34,480 --> 00:55:39,200
And if so, what's the effect on the market?

530
00:55:39,200 --> 00:55:41,480
Did the market keep growing?

531
00:55:41,480 --> 00:55:48,920
Because if the market keeps growing and then the occupational licenses level off, you'd

532
00:55:48,920 --> 00:55:53,080
expect those people maybe to get paid, maybe get paid more.

533
00:55:53,080 --> 00:55:57,400
Maybe they're becoming better at their jobs, becoming more efficient or something like

534
00:55:57,400 --> 00:55:58,400
that.

535
00:55:58,400 --> 00:55:59,400
Yeah.

536
00:55:59,400 --> 00:56:00,400
That's interesting.

537
00:56:00,400 --> 00:56:03,720
Yeah, that's what I'm thinking.

538
00:56:03,720 --> 00:56:06,000
And so, yeah, so it's exciting.

539
00:56:06,000 --> 00:56:13,480
And so I'd wanted to extend this out to 2021 for today, but there's a good bit of data

540
00:56:13,480 --> 00:56:15,360
wrangling ahead.

541
00:56:15,360 --> 00:56:21,520
So that's why I wanted to go ahead and share what I had with you today.

542
00:56:21,520 --> 00:56:26,560
And the data that you had for 2014 and 2015, you had in a spreadsheet.

543
00:56:26,560 --> 00:56:29,680
What was something that you got from a different data source?

544
00:56:29,680 --> 00:56:33,240
And then the links you shared with that.

545
00:56:33,240 --> 00:56:36,760
So you see the GitHub in the chat there?

546
00:56:36,760 --> 00:56:37,760
Yeah.

547
00:56:37,760 --> 00:56:44,620
So there is an Excel spreadsheet there.

548
00:56:44,620 --> 00:56:49,840
You may have to essentially clone the GitHub if you want.

549
00:56:49,840 --> 00:56:56,760
I could send you my email afterwards and then essentially just email you that data set if

550
00:56:56,760 --> 00:56:58,760
you would like.

551
00:56:58,760 --> 00:57:00,120
Yeah.

552
00:57:00,120 --> 00:57:05,040
For me right now, it's more about understanding where you... so I see the Excel.

553
00:57:05,040 --> 00:57:13,360
I mean, I see this CSV file on today's presentation data directory.

554
00:57:13,360 --> 00:57:23,320
But you're saying that to get data up until 2021, then you'd need to get the data

555
00:57:23,320 --> 00:57:25,320
actually from those websites.

556
00:57:25,320 --> 00:57:26,320
Exactly.

557
00:57:26,320 --> 00:57:30,840
And so here, let me put that in the chat here.

558
00:57:30,840 --> 00:57:40,200
But essentially, when I did it, I was looking at archived Excel versions.

559
00:57:40,200 --> 00:57:44,520
You may be able to do it with that API I was sharing earlier.

560
00:57:44,520 --> 00:57:50,280
But now that I'm starting to think about it, I think that may just be at least for the

561
00:57:50,280 --> 00:57:54,160
occupational licenses, that may only be a current snapshot.

562
00:57:54,160 --> 00:58:00,160
So I may have to share with you the links to the actual archives.

563
00:58:00,160 --> 00:58:06,040
So actually, that is sort of a gap.

564
00:58:06,040 --> 00:58:13,000
So I realized that in my preparation today that I was missing that data source.

565
00:58:13,000 --> 00:58:15,560
So I still have to scrounge that up.

566
00:58:15,560 --> 00:58:19,840
But I'll send that to you when I get it.

567
00:58:19,840 --> 00:58:23,640
Because I've got it somewhere.

568
00:58:23,640 --> 00:58:24,640
I've got the data.

569
00:58:24,640 --> 00:58:32,560
I just need to find out where I got that online again, so we can get it through 2021.

570
00:58:32,560 --> 00:58:41,480
And on the GitHub, you had a bunch of handouts.

571
00:58:41,480 --> 00:58:46,080
What were you wondering?

572
00:58:46,080 --> 00:58:48,440
Sorry, there's a little...

573
00:58:48,440 --> 00:58:50,920
I hear myself on your end.

574
00:58:50,920 --> 00:58:51,920
Sorry.

575
00:58:51,920 --> 00:58:52,920
Ah, here.

576
00:58:52,920 --> 00:58:59,640
I can mute myself when I'm speaking.

577
00:58:59,640 --> 00:59:08,520
I was saying in the... so I'm on the March 10 repository of your GitHub.

578
00:59:08,520 --> 00:59:11,080
And there were a bunch of handouts.

579
00:59:11,080 --> 00:59:16,040
And I was curious about mapping basics with Bokeh.

580
00:59:16,040 --> 00:59:20,280
Like you put the links to articles.

581
00:59:20,280 --> 00:59:22,720
What was the idea for you there?

582
00:59:22,720 --> 00:59:23,840
Oh, yes.

583
00:59:23,840 --> 00:59:31,760
So part of what I'm doing here is I'm just sort of keeping track of just interesting

584
00:59:31,760 --> 00:59:38,920
data science slash cannabis data that I come across on a week-to-week basis.

585
00:59:38,920 --> 00:59:41,520
I don't know if it's helpful or not.

586
00:59:41,520 --> 00:59:47,080
But I just kind of thought that I just was starting to just accumulate things for the

587
00:59:47,080 --> 00:59:48,080
group.

588
00:59:48,080 --> 00:59:53,800
And I just kind of say like, hey, these are things I found in the past week.

589
00:59:53,800 --> 00:59:55,920
I think they're interesting.

590
00:59:55,920 --> 00:59:56,920
They're worth looking at.

591
00:59:56,920 --> 01:00:05,280
I'm just going to share them with the group because I've got, you know, 1,001 pies in

592
01:00:05,280 --> 01:00:06,280
the oven.

593
01:00:06,280 --> 01:00:13,440
And I may not get to all these awesome ideas and projects.

594
01:00:13,440 --> 01:00:20,120
And so I found that instead of just leaving them to get dusty, I'll just share them because

595
01:00:20,120 --> 01:00:25,640
like, for example, Charles was finding some use for the data.

596
01:00:25,640 --> 01:00:30,960
And so instead of them just getting dust, I'm just going to share them.

597
01:00:30,960 --> 01:00:33,040
And I'll get to them eventually.

598
01:00:33,040 --> 01:00:38,640
And then if you want to poke around at them for the time being, have at it.

599
01:00:38,640 --> 01:00:42,080
So that's great.

600
01:00:42,080 --> 01:00:48,320
So you have past week's presentations in that GitHub.

601
01:00:48,320 --> 01:00:52,500
So the actual videos, I'm still uploading online.

602
01:00:52,500 --> 01:00:59,920
So I'm basically creating a web page where I'll just have all the video archives.

603
01:00:59,920 --> 01:01:03,680
Like I said, it's on my to do list.

604
01:01:03,680 --> 01:01:07,000
And in fact, it's on my today's to do list.

605
01:01:07,000 --> 01:01:17,200
So you should check your messages on Meetup and I'll make sure to send you the links

606
01:01:17,200 --> 01:01:21,400
because those will be coming out the pipeline real soon.

607
01:01:21,400 --> 01:01:22,400
Yeah.

608
01:01:22,400 --> 01:01:29,240
Do you have some time to talk just in general about data science?

609
01:01:29,240 --> 01:01:30,240
I don't know.

610
01:01:30,240 --> 01:01:31,240
I don't want to take too much.

611
01:01:31,240 --> 01:01:34,240
Oh, I definitely do.

612
01:01:34,240 --> 01:01:36,960
I'm always happy to talk about data science.

613
01:01:36,960 --> 01:01:41,960
Would you want to like schedule like a one on one?

614
01:01:41,960 --> 01:01:42,960
Maybe.

615
01:01:42,960 --> 01:01:47,520
Maybe it'd be better if you have it in your schedule and we can chat.

616
01:01:47,520 --> 01:01:50,320
I'm an aspiring data scientist.

617
01:01:50,320 --> 01:01:51,320
Awesome.

618
01:01:51,320 --> 01:01:55,880
I'm pivoting from physics research.

619
01:01:55,880 --> 01:02:03,600
And I've enrolled into a data science boot camp that I'm doing in a couple of months.

620
01:02:03,600 --> 01:02:07,680
And one of the things that they ask us is to build a project.

621
01:02:07,680 --> 01:02:13,360
And so I'm kind of looking at different things that like different data sets that people

622
01:02:13,360 --> 01:02:16,080
haven't necessarily looked at a lot.

623
01:02:16,080 --> 01:02:19,800
And kind of his data seems like really rich.

624
01:02:19,800 --> 01:02:23,740
And actually, I don't know that that many people have looked at it.

625
01:02:23,740 --> 01:02:28,400
And so I'm thinking that that could become my project.

626
01:02:28,400 --> 01:02:34,840
If that was the case, then I would work on it probably like 10, 20 hours per week on

627
01:02:34,840 --> 01:02:35,840
it.

628
01:02:35,840 --> 01:02:42,440
And so I'm trying to see whether it's a good opportunity for me to start getting into.

629
01:02:42,440 --> 01:02:43,440
Definitely Alice.

630
01:02:43,440 --> 01:02:48,760
Well, I can definitely point you in the direction of some good resources, because like you said,

631
01:02:48,760 --> 01:02:56,720
it may be unparalleled, the amount of public data, because very few industries are actually

632
01:02:56,720 --> 01:03:02,200
regulated on such a granular level.

633
01:03:02,200 --> 01:03:08,480
So they are recording data at every stage.

634
01:03:08,480 --> 01:03:16,720
And the public has kind of opted towards a public data approach towards cannabis, because

635
01:03:16,720 --> 01:03:22,640
you just kind of want to let people know about it, know that it's going on, measure it.

636
01:03:22,640 --> 01:03:29,320
And it's new, so people were able to set up, in some cases, APIs.

637
01:03:29,320 --> 01:03:33,040
So there are some awesome sources.

638
01:03:33,040 --> 01:03:40,280
And like you said, it's basically a firehose of data.

639
01:03:40,280 --> 01:03:43,000
And people are just scrambling at it.

640
01:03:43,000 --> 01:03:45,360
And so there's a lot to be uncovered.

641
01:03:45,360 --> 01:03:46,360
OK.

642
01:03:46,360 --> 01:03:47,360
Yeah, yeah.

643
01:03:47,360 --> 01:03:50,880
It's really, really exciting.

644
01:03:50,880 --> 01:03:59,760
I'm going to try to comb through a little bit of your GitHub repo for the past presentations

645
01:03:59,760 --> 01:04:02,200
that you've done, kind of trying to understand.

646
01:04:02,200 --> 01:04:04,080
Right now, I can see there's a lot of data.

647
01:04:04,080 --> 01:04:09,560
I'm still a little bit unclear on what are the most interesting questions to focus on.

648
01:04:09,560 --> 01:04:11,920
And I guess that's your wheelhouse.

649
01:04:11,920 --> 01:04:12,920
Ooh.

650
01:04:12,920 --> 01:04:16,760
Well, the warning, the repository is a bit of a mess right now.

651
01:04:16,760 --> 01:04:17,760
It needs a bit of cleanup.

652
01:04:17,760 --> 01:04:21,480
I've been just posting things for the time being.

653
01:04:21,480 --> 01:04:24,880
But if you'd like, I could think of some things.

654
01:04:24,880 --> 01:04:33,640
And then when we speak, we could hammer out some good projects that you may want to poke

655
01:04:33,640 --> 01:04:34,640
around at.

656
01:04:34,640 --> 01:04:35,640
That sounds great.

657
01:04:35,640 --> 01:04:45,400
Do you know what your availability is this week or next week?

658
01:04:45,400 --> 01:04:48,280
I'm always available in the afternoon.

659
01:04:48,280 --> 01:04:54,120
And then on Fridays, I'm usually wide open.

660
01:04:54,120 --> 01:04:59,280
So how about tomorrow afternoon?

661
01:04:59,280 --> 01:05:00,280
Definitely.

662
01:05:00,280 --> 01:05:01,280
OK.

663
01:05:01,280 --> 01:05:04,520
Do you have a time that's best for you?

664
01:05:04,520 --> 01:05:05,520
2 PM.

665
01:05:05,520 --> 01:05:10,680
And tomorrow is Thursday.

666
01:05:10,680 --> 01:05:18,560
So I do have an engagement at 2:30 tomorrow.

667
01:05:18,560 --> 01:05:25,520
So if we can do Friday.

668
01:05:25,520 --> 01:05:30,080
How about I'm going to send you an email and you think of the best time for you.

669
01:05:30,080 --> 01:05:33,640
I'm going to give you my time windows when I'm free and then you pick the time.

670
01:05:33,640 --> 01:05:34,640
Awesome.

671
01:05:34,640 --> 01:05:35,640
Awesome.

672
01:05:35,640 --> 01:05:36,640
All right.

673
01:05:36,640 --> 01:05:37,640
Well, this is exciting, Alice.

674
01:05:37,640 --> 01:05:38,640
It was awesome meeting you.

675
01:05:38,640 --> 01:05:41,720
Can you put your email in the chat?

676
01:05:41,720 --> 01:05:42,720
Yes.

677
01:05:42,720 --> 01:05:50,680
So you can just reach me at just Keegan Ski at Gmail.

678
01:05:50,680 --> 01:05:51,680
All right.

679
01:05:51,680 --> 01:05:54,440
Well, it was great.

680
01:05:54,440 --> 01:05:58,440
I wanted to attend this like, I don't know, three, four weeks ago and I couldn't.

681
01:05:58,440 --> 01:06:01,120
I'm really glad I got to do that.

682
01:06:01,120 --> 01:06:05,600
Well, today ended up being an interesting presentation.

683
01:06:05,600 --> 01:06:11,080
There was a lot in there, but I thought it was an interesting one.

684
01:06:11,080 --> 01:06:13,400
But I think it still needs to get cleaned up.

685
01:06:13,400 --> 01:06:18,600
Well, you get the content and then you had the data science and the coding.

686
01:06:18,600 --> 01:06:21,960
And so that was a lot to put in like one hour.

687
01:06:21,960 --> 01:06:26,000
But I definitely like the approach and like knowing a little bit more background about

688
01:06:26,000 --> 01:06:28,040
like the economic science behind it.

689
01:06:28,040 --> 01:06:30,080
So that was really interesting to me.

690
01:06:30,080 --> 01:06:32,120
Oh, well, all right.

691
01:06:32,120 --> 01:06:33,600
I'll send you an email right away.

692
01:06:33,600 --> 01:06:34,600
Nice.

693
01:06:34,600 --> 01:06:36,160
I'm glad you got something out of it, Alice.

694
01:06:36,160 --> 01:06:37,960
So I'll look forward to speaking with you.

695
01:06:37,960 --> 01:06:38,960
All right.

696
01:06:38,960 --> 01:06:39,960
Bye.

697
01:06:39,960 --> 01:06:40,960
Have an awesome day.

698
01:06:40,960 --> 01:07:04,560
Thank you.

