1
00:00:00,000 --> 00:00:12,000
Welcome to the Cannabis Data Science Meetup Group for the 1st of September.

2
00:00:12,000 --> 00:00:14,000
I hope everybody's doing well.

3
00:00:14,000 --> 00:00:19,000
Crunching numbers here in Oklahoma again.

4
00:00:19,000 --> 00:00:29,000
So I've prepared some data that I can share with you essentially trying to look at the market competitiveness on a county by county basis here.

5
00:00:29,000 --> 00:00:39,000
As we do our national analysis, thought I would drill down here real quick since we have county level data.

6
00:00:39,000 --> 00:00:46,000
But before we dive into that, I guess, you know, welcome to the group, Katie.

7
00:00:46,000 --> 00:00:54,000
And of course, Heather is here. And then, Katie, if you wouldn't mind introducing yourself, if you want.

8
00:00:54,000 --> 00:00:59,000
You don't have to, but I'm just curious

9
00:00:59,000 --> 00:01:06,000
What avenue you're coming at the group from.

10
00:01:06,000 --> 00:01:11,000
Oh, it's okay. Always, always happy to have people listen.

11
00:01:11,000 --> 00:01:33,000
So without further ado, I'll share with you some of the work that I've been working on and we can maybe even spearhead and do some new development.

12
00:01:33,000 --> 00:01:40,000
Okay. So.

13
00:01:40,000 --> 00:01:43,000
We've started.

14
00:01:43,000 --> 00:01:46,000
As you can see, we've got a blank map here.

15
00:01:46,000 --> 00:01:54,000
So we're just now starting to essentially try to populate data.

16
00:01:54,000 --> 00:02:00,000
All the public cannabis data that we can scrounge up for each state.

17
00:02:00,000 --> 00:02:09,000
So, of course, the states, many states here in the east, the Eastern seaboard.

18
00:02:09,000 --> 00:02:12,000
As well as some here in the south.

19
00:02:12,000 --> 00:02:17,000
Don't permit cannabis.

20
00:02:17,000 --> 00:02:25,000
Some of them permit it neither medicinally nor recreationally, and then many states don't allow cannabis recreationally.

21
00:02:25,000 --> 00:02:39,000
So we're going to be collecting all the data that the states provide on recreational and/or medicinal cannabis.

22
00:02:39,000 --> 00:02:45,000
So, for example, Washington State, Oregon.

23
00:02:45,000 --> 00:02:50,000
You still have medicinal programs as well as recreational programs.

24
00:02:50,000 --> 00:03:00,000
So we'll try to track that data. Same for Colorado and then states like Oklahoma just have medicinal sales.

25
00:03:00,000 --> 00:03:04,000
And then, you know, we've got Illinois, Michigan.

26
00:03:04,000 --> 00:03:12,000
And then we'll we'll take a look at Maryland and we're looking at some of the northeastern states as well.

27
00:03:12,000 --> 00:03:15,000
So Maine has some good public data.

28
00:03:15,000 --> 00:03:19,000
We'll see if Massachusetts has any data.

29
00:03:19,000 --> 00:03:24,000
And then, you know, some of the southern states do have medicinal markets.

30
00:03:24,000 --> 00:03:27,000
So Florida has medicinal cannabis sales.

31
00:03:27,000 --> 00:03:30,000
So we'll see if there's any data there.

32
00:03:30,000 --> 00:03:34,000
And then, of course, we can't forget Alaska and Hawaii.

33
00:03:34,000 --> 00:03:42,000
I think Hawaii has medicinal and then Alaska has.

34
00:03:42,000 --> 00:03:47,000
They have permitted cannabis. I forget if it's recreational and/or medicinal.

35
00:03:47,000 --> 00:03:53,000
And then we may even look at a few other territories, essentially.

36
00:03:53,000 --> 00:04:03,000
So Puerto Rico has some sort of permitted cannabis, I believe.

37
00:04:03,000 --> 00:04:07,000
And then I was looking, and Guam has cannabis.

38
00:04:07,000 --> 00:04:11,000
So they may be worth looking at as well.

39
00:04:11,000 --> 00:04:19,000
So for now, we were going to drill down on Oklahoma.

40
00:04:19,000 --> 00:04:28,000
And so, pardon that this map is small.

41
00:04:28,000 --> 00:04:33,000
I'll be working to render this larger in the future.

42
00:04:33,000 --> 00:04:37,000
And so that's Texas County, Oklahoma.

43
00:04:37,000 --> 00:04:43,000
And so essentially, we'll want to look at a county by county level analysis.

44
00:04:43,000 --> 00:04:53,000
And so one measure of market competitiveness that was brought up at the cannabis conference was,

45
00:04:53,000 --> 00:05:02,000
OK, let's just look at the total number of licensees per capita.

46
00:05:02,000 --> 00:05:09,000
So that could give you an idea if there are any markets that may be underserved.

47
00:05:09,000 --> 00:05:22,000
So if there's a high number of licensees per county, then that county may be getting saturated.

48
00:05:22,000 --> 00:05:39,000
Whereas if there's a low number of licensees per county, then that county may be ripe for

49
00:05:39,000 --> 00:05:42,000
for trying to set up a new license.

50
00:05:42,000 --> 00:05:44,000
Or there may be a reason for that.

51
00:05:44,000 --> 00:05:50,000
It may be hard to get licensed in that county.

52
00:05:50,000 --> 00:05:53,000
So there could be a variety of factors.

53
00:05:53,000 --> 00:05:59,000
So what tools and data do we need?

54
00:05:59,000 --> 00:06:06,000
Well, first, I started to just go ahead and write.

55
00:06:06,000 --> 00:06:10,000
So first, I just posed our question.

56
00:06:10,000 --> 00:06:13,000
So, OK, we're looking at licensees per capita in Oklahoma.

57
00:06:13,000 --> 00:06:19,000
We would like to look at licensees per capita by county.

58
00:06:19,000 --> 00:06:28,000
Of course, our null hypothesis is there's just an even distribution of licensees per state.

59
00:06:28,000 --> 00:06:53,000
I mean, no one county is statistically significantly different from another.

60
00:06:53,000 --> 00:07:04,000
And we will reject this if we can conclude that any counties do have a statistically significantly different number of licensees than another.
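[Editor's sketch] The talk does not name a specific test. One common choice, assumed here, is a chi-square goodness-of-fit test in which each county's expected licensee count is proportional to its population. The counties, counts, and populations below are made up for illustration; 5.991 is the standard 5% critical value for 2 degrees of freedom.
```python
def chi_square_statistic(observed, populations):
    """Chi-square statistic against counts proportional to population."""
    total_observed = sum(observed.values())
    total_population = sum(populations.values())
    statistic = 0.0
    for county, count in observed.items():
        # Expected count under the null: same licensees per capita everywhere.
        expected = total_observed * populations[county] / total_population
        statistic += (count - expected) ** 2 / expected
    return statistic
# Hypothetical data for three counties (not real Oklahoma figures).
observed = {"A": 30, "B": 50, "C": 20}
populations = {"A": 10_000, "B": 10_000, "C": 20_000}
CRITICAL_VALUE_DF2_5PCT = 5.991  # chi-square, 2 degrees of freedom, alpha = 0.05
statistic = chi_square_statistic(observed, populations)
reject_null = statistic > CRITICAL_VALUE_DF2_5PCT
print(f"chi-square = {statistic:.2f}, reject null: {reject_null}")
```
With more counties the degrees of freedom (number of counties minus one) and the critical value change accordingly.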

61
00:07:04,000 --> 00:07:09,000
So that's a research question.

62
00:07:09,000 --> 00:07:15,000
So let's get to work.

63
00:07:15,000 --> 00:07:24,000
So first things first, we need the number of licensees in Oklahoma.

64
00:07:24,000 --> 00:07:41,000
I have, for Cannlytics, put together a script that collects data, public data, public cannabis data for Oklahoma.

65
00:07:41,000 --> 00:07:48,000
And so this is MIT licensed so anybody can use it for any purpose they wish.

66
00:07:48,000 --> 00:08:01,000
Just make sure to give a link back to Cannlytics as the copyright holder.

67
00:08:01,000 --> 00:08:14,000
However, this script essentially just collects data from various sources.

68
00:08:14,000 --> 00:08:16,000
So we're not looking at revenue at the moment.

69
00:08:16,000 --> 00:08:18,000
We're just looking at licensees.

70
00:08:18,000 --> 00:08:29,000
So essentially, Oklahoma provides a running list here of licensees.

71
00:08:29,000 --> 00:08:47,000
So essentially what the script does is, OK, just downloads all of these PDFs, parses this data into a usable format.

72
00:08:47,000 --> 00:08:53,000
So there's the raw data.

73
00:08:53,000 --> 00:09:13,000
And then you can use this script here and you can get a decent list of licensees in Oklahoma.

74
00:09:13,000 --> 00:09:24,000
So now we know, OK, we've got the licensee, their trade name.

75
00:09:24,000 --> 00:09:42,000
So a lot of companies may, not a lot, but a handful of companies may do business under a different name than their registered business, their licensed business name.

76
00:09:42,000 --> 00:09:50,000
So, for example, here we've got a license that's licensed under Donald P. Northcutt.

77
00:09:50,000 --> 00:09:55,000
However, they're going to be packaging as Big Bud.

78
00:09:55,000 --> 00:10:00,000
So you're going to know them at the store as Big Bud.

79
00:10:00,000 --> 00:10:06,000
So this is not uncommon in the cannabis industry.

80
00:10:06,000 --> 00:10:08,000
Here's Toe, Oklahoma.

81
00:10:08,000 --> 00:10:24,000
So, not to say what anyone's doing per se, but a lot of times you may see it essentially to help with banking.

82
00:10:24,000 --> 00:10:28,000
So it's still tough for cannabis businesses to get banked.

83
00:10:28,000 --> 00:10:35,000
So these people could be doing things for different reasons.

84
00:10:35,000 --> 00:10:44,000
However, it would probably be easier to get banked under agency properties versus Toe, Oklahoma.

85
00:10:44,000 --> 00:10:52,000
And that's just the tragic nature of the state of affairs.

86
00:10:52,000 --> 00:11:03,000
And so, OK, long story short, we have many companies here in Oklahoma.

87
00:11:03,000 --> 00:11:22,000
So, including every licensee that's plant touching in some way, you know, from waste disposal to laboratories to dispensaries to the growers themselves.

88
00:11:22,000 --> 00:11:25,000
And of course, processors and the whole shebang.

89
00:11:25,000 --> 00:11:34,000
There's almost 13,000 plant touching businesses in Oklahoma.

90
00:11:34,000 --> 00:11:41,000
So where are they? Are they all right outside of Oklahoma City?

91
00:11:41,000 --> 00:11:43,000
Are they all in the country?

92
00:11:43,000 --> 00:11:56,000
So we know what city they're in. However, actually, we can even look at the zip code.

93
00:11:56,000 --> 00:12:00,000
So I didn't even realize that we could get that granular.

94
00:12:00,000 --> 00:12:06,000
So for this analysis, we'll just be looking at the county.

95
00:12:06,000 --> 00:12:13,000
So we'll try to get a county by county count.

96
00:12:13,000 --> 00:12:19,000
So without further ado,

97
00:12:19,000 --> 00:12:23,000
let's try to do just that.

98
00:12:23,000 --> 00:12:38,000
So I'm just running a quick Python script here.

99
00:12:38,000 --> 00:12:51,000
OK, so essentially, we're just reading in the licensees data.

100
00:12:51,000 --> 00:13:00,000
So this is the data we just saw in Excel.

101
00:13:00,000 --> 00:13:04,000
So we can just do some sanity checks here.

102
00:13:04,000 --> 00:13:21,000
For example, let's look at OK, what are all of the license types that we did in fact get?

103
00:13:21,000 --> 00:13:29,000
So like we expected, we've got the growers, the labs, the waste disposals and dispensaries.

104
00:13:29,000 --> 00:13:34,000
And we also have processors and transporters.

105
00:13:34,000 --> 00:13:42,000
So you could find out how many of each there are.

106
00:13:42,000 --> 00:13:48,000
So if you wanted to do that, which I mean, why not?

107
00:13:48,000 --> 00:13:55,000
So I'm a little curious. OK, how many of each are there and what's the percent of the total?

108
00:13:55,000 --> 00:14:10,000
Because we can do a little exploratory data analysis since this is the first real good time that I've had to look at the licensees data here.

109
00:14:10,000 --> 00:14:13,000
So let's look at it together.

110
00:14:13,000 --> 00:14:16,000
So the way I would do this is OK.

111
00:14:16,000 --> 00:14:18,000
So let's look at each of these types.

112
00:14:18,000 --> 00:14:26,000
So for the license type.

113
00:14:26,000 --> 00:14:29,000
OK, so what do we want to do here?

114
00:14:29,000 --> 00:14:43,000
So we want to isolate data for that type.

115
00:14:43,000 --> 00:14:52,000
So let's just locate the licensees where the licensee's license type

116
00:14:52,000 --> 00:14:55,000
is equal to the license type.

117
00:14:55,000 --> 00:14:58,000
OK, not too bad.

118
00:14:58,000 --> 00:15:00,000
And then what do we want to do?

119
00:15:00,000 --> 00:15:04,000
Well, let's just print out some statistics here.

120
00:15:04,000 --> 00:15:07,000
So one statistic is a total.

121
00:15:07,000 --> 00:15:30,000
So so we could just say, OK, what's the total number of license type?

122
00:15:30,000 --> 00:15:38,000
That's just going to be the total number that we just found.

123
00:15:38,000 --> 00:15:47,000
And then I'm also curious, OK, what's that as a percentage of the total?

124
00:15:47,000 --> 00:15:51,000
So let's just say.

125
00:15:51,000 --> 00:15:56,000
OK, so what's that as a percent of the total?

126
00:15:56,000 --> 00:16:07,000
Well, that's just going to be the length divided by the number of all licensees.

127
00:16:07,000 --> 00:16:22,000
Then I'm just going to add an extra space here just so things print out decently.

128
00:16:22,000 --> 00:16:33,000
And then let's also maybe just format this decently.

129
00:16:33,000 --> 00:16:45,000
So let's round this, times 100, to two decimal places.

130
00:16:45,000 --> 00:16:52,000
OK, so now the numbers look a little cleaner here.
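[Editor's sketch] A minimal version of the tally described above, assuming the licensees are a list of records with a hypothetical "license_type" field. The sample rows are made up and are not the Oklahoma data.
```python
from collections import Counter
# Hypothetical sample records; the field name "license_type" is an assumption.
licensees = (
    [{"license_type": "Grower"}] * 7
    + [{"license_type": "Dispensary"}] * 2
    + [{"license_type": "Processor"}] * 1
)
def type_shares(records):
    """Return {license_type: (count, percent_of_total)}."""
    counts = Counter(record["license_type"] for record in records)
    total = sum(counts.values())
    # Percent of the total, rounded to two decimal places.
    return {t: (n, round(n / total * 100, 2)) for t, n in counts.items()}
shares = type_shares(licensees)
for license_type, (count, percent) in shares.items():
    print(f"Total {license_type}: {count} ({percent}%)")
# Labels and sizes in the shape a pie chart function would expect.
labels = list(shares)
sizes = [percent for _, percent in shares.values()]
```
The `labels` and `sizes` lists are collected in the same pass so they can be handed to a plotting library later.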

131
00:16:52,000 --> 00:16:56,000
OK, so this is interesting.

132
00:16:56,000 --> 00:17:07,000
We can start to, you know, these would be good numbers for a pie chart here.

133
00:17:07,000 --> 00:17:12,000
So that way you can actually see, OK, what are the pieces of the pie?

134
00:17:12,000 --> 00:17:16,000
For now, we'll just look at the raw numbers.

135
00:17:16,000 --> 00:17:19,000
So OK, so this is actually surprising.

136
00:17:19,000 --> 00:17:23,000
So I knew it was going to be a high percentage.

137
00:17:23,000 --> 00:17:30,000
For some reason, I was thinking higher, just my.

138
00:17:30,000 --> 00:17:35,000
My. What do you call it?

139
00:17:35,000 --> 00:17:39,000
My naive prior would have been higher.

140
00:17:39,000 --> 00:17:44,000
However, it's about 70 percent of all licensees are growers.

141
00:17:44,000 --> 00:17:48,000
About 12 percent are processors.

142
00:17:48,000 --> 00:17:53,000
That's interesting. I would have thought higher, but about 12 percent.

143
00:17:53,000 --> 00:17:58,000
That's about right. I would have thought about 20 percent, but.

144
00:17:58,000 --> 00:18:00,000
That sounds about right.

145
00:18:00,000 --> 00:18:03,000
Well, yeah, so maybe maybe that's more growers than I was expecting.

146
00:18:03,000 --> 00:18:08,000
So you have about 12 percent processors.

147
00:18:08,000 --> 00:18:16,000
Well, should we try to do a quick pie chart just for the heck of it?

148
00:18:16,000 --> 00:18:21,000
Since since we're crunching numbers here and that's what we do.

149
00:18:21,000 --> 00:18:29,000
So let's do a Matplotlib pie chart and see if it's not the hardest thing in the world.

150
00:18:29,000 --> 00:18:32,000
Because if we can get a pie chart.

151
00:18:32,000 --> 00:18:37,000
You know, thrown together.

152
00:18:37,000 --> 00:18:43,000
You know, in a couple of minutes, then.

153
00:18:43,000 --> 00:18:49,000
Also, I'm going to spin this down.

154
00:18:49,000 --> 00:18:52,000
Yeah, I think we could look at some of this data here.

155
00:18:52,000 --> 00:18:55,000
So let's let's look at a quick pie chart here.

156
00:18:55,000 --> 00:18:59,000
So right, because we're already looping through this.

157
00:18:59,000 --> 00:19:11,000
So all we have to do here is just collect these points that we're.

158
00:19:11,000 --> 00:19:13,000
We're trying to do, right?

159
00:19:13,000 --> 00:19:20,000
So we just have to collect the what's called what they're calling the sizes here.

160
00:19:20,000 --> 00:19:24,000
So the size of each piece of the pie.

161
00:19:24,000 --> 00:19:30,000
Right. So all we have to do.

162
00:19:30,000 --> 00:19:34,000
Is keep track of the proportion.

163
00:19:34,000 --> 00:19:39,000
Each license type makes up and look, we're already calculating the proportion.

164
00:19:39,000 --> 00:19:47,000
So that's easy enough.

165
00:19:47,000 --> 00:19:57,000
We don't really need to explode anything. In fact, we're trying to do the exact opposite.

166
00:19:57,000 --> 00:20:02,000
And then it looks like.

167
00:20:02,000 --> 00:20:13,000
They have their labels as a tuple, but we're going to try to do our labels as a list and see if that works.

168
00:20:13,000 --> 00:20:28,000
No one caught me. We need to append these not.

169
00:20:28,000 --> 00:20:50,000
OK.

170
00:20:50,000 --> 00:21:03,000
So let's see if we can't make a quick pie chart just to get a bit better visualization.

171
00:21:03,000 --> 00:21:05,000
OK. And look at that.

172
00:21:05,000 --> 00:21:08,000
So now we can actually get the data.

173
00:21:08,000 --> 00:21:24,000
So.

174
00:21:24,000 --> 00:21:28,000
So we've got the growers.

175
00:21:28,000 --> 00:21:36,000
Making up about 68 percent processors, 12 percent.

176
00:21:36,000 --> 00:21:40,000
The dispensaries, about 18 percent.

177
00:21:40,000 --> 00:21:43,000
So.

178
00:21:43,000 --> 00:21:46,000
That's quite interesting.

179
00:21:46,000 --> 00:21:50,000
About 30 percent here.

180
00:21:50,000 --> 00:21:52,000
Processors, dispensaries.

181
00:21:52,000 --> 00:21:55,000
And then look, just this tiny little sliver.

182
00:21:55,000 --> 00:22:07,000
You have about one percent of.

183
00:22:07,000 --> 00:22:15,000
So you have a very small number of these almost ancillary.

184
00:22:15,000 --> 00:22:20,000
Plant touching businesses so that still make the.

185
00:22:20,000 --> 00:22:25,000
Make the industry function at the end of the day. Right.

186
00:22:25,000 --> 00:22:29,000
So you need transporters to get products from point A to point B.

187
00:22:29,000 --> 00:22:35,000
In certain states like Oklahoma, you really need to pay attention to waste disposal.

188
00:22:35,000 --> 00:22:40,000
So there's waste disposal companies and you need your your quality control.

189
00:22:40,000 --> 00:22:49,000
So you've got the laboratories often doing the state mandated quality control tests.

190
00:22:49,000 --> 00:22:52,000
So. So that's the breakdown of licensees.

191
00:22:52,000 --> 00:22:55,000
Let's go ahead and move on.

192
00:22:55,000 --> 00:22:58,000
You know, trying not to get too bogged down in that.

193
00:22:58,000 --> 00:23:03,000
So we've got our number of licensees.

194
00:23:03,000 --> 00:23:08,000
And let's not lose sight of what we're after here.

195
00:23:08,000 --> 00:23:16,000
We're after the county by county breakdown.

196
00:23:16,000 --> 00:23:20,000
OK, let's get a couple of data points before we continue.

197
00:23:20,000 --> 00:23:24,000
So because we were trying to get the population.

198
00:23:24,000 --> 00:23:31,000
So a quick aside.

199
00:23:31,000 --> 00:23:38,000
The place you want to go if you want to get reliable.

200
00:23:38,000 --> 00:23:41,000
Credible.

201
00:23:41,000 --> 00:23:49,000
Economic data points for economic indicators such as population.

202
00:23:49,000 --> 00:23:55,000
The place you want to go is the Federal Reserve and FRED.

203
00:23:55,000 --> 00:23:59,000
So this is an economic data repository.

204
00:23:59,000 --> 00:24:03,000
Public open to the you know, it's free.

205
00:24:03,000 --> 00:24:09,000
And you can use it via API, which is what we'll do here momentarily.

206
00:24:09,000 --> 00:24:12,000
And.

207
00:24:12,000 --> 00:24:18,000
This is a trusted data source, so no one will be too hard on you.

208
00:24:18,000 --> 00:24:31,000
If you use the Fed's FRED data, you know, of course, they may say, oh, why did you use the not seasonally adjusted series, or this or that?

209
00:24:31,000 --> 00:24:42,000
However, they won't you know, they won't try to undermine the data source itself. So so that's why it's credible and reliable.

210
00:24:42,000 --> 00:24:44,000
So.

211
00:24:44,000 --> 00:24:48,000
But in that regard, I would still hedge the data in that.

212
00:24:48,000 --> 00:24:52,000
OK, the data is still probably not probably.

213
00:24:52,000 --> 00:24:56,000
It is an estimate of the population here.

214
00:24:56,000 --> 00:25:01,000
So, you know, it's probably not spot on to a T.

215
00:25:01,000 --> 00:25:05,000
However, you know, we can get a decent.

216
00:25:05,000 --> 00:25:09,000
Measure of the of the population here.

217
00:25:09,000 --> 00:25:12,000
And so, for example.

218
00:25:12,000 --> 00:25:19,000
Here we just have OK, here's just the total population in Oklahoma.

219
00:25:19,000 --> 00:25:23,000
Times 1000, so this is thousands of people.

220
00:25:23,000 --> 00:25:26,000
So.

221
00:25:26,000 --> 00:25:31,000
Just shy of four million.

222
00:25:31,000 --> 00:25:35,000
People in Oklahoma.

223
00:25:35,000 --> 00:25:40,000
And then you can get this on a county by county level basis.

224
00:25:40,000 --> 00:25:46,000
And so, for example, this is Le Flore County.

225
00:25:46,000 --> 00:25:53,000
Has almost 50,000 people.

226
00:25:53,000 --> 00:25:56,000
So.

227
00:25:56,000 --> 00:26:04,000
So long story short, we can go ahead and get those data points just using the, you know, the Federal Reserve.

228
00:26:04,000 --> 00:26:09,000
API.

229
00:26:09,000 --> 00:26:15,000
So, you know, we can get the population.

230
00:26:15,000 --> 00:26:17,000
Of Oklahoma.

231
00:26:17,000 --> 00:26:21,000
And then we can actually get the licensees per capita.

232
00:26:21,000 --> 00:26:29,000
So this number on its own may not be that

233
00:26:29,000 --> 00:26:31,000
Informative slash interesting.

234
00:26:31,000 --> 00:26:38,000
However, what would be interesting would be to calculate this number licensees per capita.

235
00:26:38,000 --> 00:26:44,000
For.

236
00:26:44,000 --> 00:26:49,000
For all of the states that way we can see, OK, which states.

237
00:26:49,000 --> 00:26:53,000
You know, how do the different states shake out?

238
00:26:53,000 --> 00:26:56,000
So, for example.

239
00:26:56,000 --> 00:27:04,000
Illinois, I think there's only, you know, a handful of dozen.

240
00:27:04,000 --> 00:27:06,000
So, you know.

241
00:27:06,000 --> 00:27:12,000
A couple dozen to maybe more licensees in the whole state.

242
00:27:12,000 --> 00:27:16,000
And so if you think about the population of Illinois.

243
00:27:16,000 --> 00:27:25,000
Then this number licensees per capita is going to be much smaller than it will be in Oklahoma.

244
00:27:25,000 --> 00:27:28,000
And so the scale will be interesting.

245
00:27:28,000 --> 00:27:32,000
So the scale there will be interesting.

246
00:27:32,000 --> 00:27:35,000
So.

247
00:27:35,000 --> 00:27:37,000
We can't.

248
00:27:37,000 --> 00:27:40,000
Well, yeah, we can't do state to state yet.

249
00:27:40,000 --> 00:27:44,000
So let's just let's look at county to county.

250
00:27:44,000 --> 00:27:47,000
And so.

251
00:27:47,000 --> 00:27:55,000
Basically, what I've done here is I've just gone ahead and gotten the Federal Reserve,

252
00:27:55,000 --> 00:28:00,000
that FRED code here, for looking up each county.

253
00:28:00,000 --> 00:28:03,000
And so this will be the first time I've done this.

254
00:28:03,000 --> 00:28:11,000
So long story short, let's try to look at the licensees per capita in each county.

255
00:28:11,000 --> 00:28:16,000
And ambitiously, we may try to plot this.

256
00:28:16,000 --> 00:28:20,000
It could be a wild one, but we're going to try.

257
00:28:20,000 --> 00:28:23,000
OK, so.

258
00:28:23,000 --> 00:28:29,000
Without further ado, let's start looping over these counties.

259
00:28:29,000 --> 00:28:34,000
So so we'll just kind of view this on the fly, right?

260
00:28:34,000 --> 00:28:37,000
So OK, so we want to do it for each county.

261
00:28:37,000 --> 00:28:38,000
We know that.

262
00:28:38,000 --> 00:28:43,000
So, for county in counties.

263
00:28:43,000 --> 00:28:46,000
Well, what do we want to do?

264
00:28:46,000 --> 00:28:55,000
We want to find all the licensees in that county.

265
00:28:55,000 --> 00:28:59,000
Then we want to get the population.

266
00:28:59,000 --> 00:29:01,000
Of that county.

267
00:29:01,000 --> 00:29:14,000
And then at that point, we just want to calculate the licensees per capita.

268
00:29:14,000 --> 00:29:18,000
In that county.

269
00:29:18,000 --> 00:29:21,000
And we want to keep track of this.

270
00:29:21,000 --> 00:29:25,000
So.

271
00:29:25,000 --> 00:29:32,000
I'm just going to create a dictionary here.

272
00:29:32,000 --> 00:29:41,000
We've got a dictionary of states and because just heads up that I kind of know this is how I want the data presented here in a second.

273
00:29:41,000 --> 00:29:44,000
So that's why I'm formatting it like this.

274
00:29:44,000 --> 00:29:48,000
So so we then do licensees per capita.

275
00:29:48,000 --> 00:29:52,000
OK, so first things first, we want all the licensees in that county.

276
00:29:52,000 --> 00:29:54,000
So.

277
00:29:54,000 --> 00:29:56,000
So the county licensees.

278
00:29:56,000 --> 00:29:58,000
Oh, well, that'll be simple enough.

279
00:29:58,000 --> 00:30:02,000
That's just going to be the licensees.

280
00:30:02,000 --> 00:30:06,000
Where the licensees.

281
00:30:06,000 --> 00:30:08,000
County.

282
00:30:08,000 --> 00:30:10,000
Correct.

283
00:30:10,000 --> 00:30:14,000
It's equal to that county.

284
00:30:14,000 --> 00:30:16,000
Simple enough.

285
00:30:16,000 --> 00:30:20,000
And we want to get the population of that county.

286
00:30:20,000 --> 00:30:25,000
Well, we know how to get the Fed's FRED data.

287
00:30:25,000 --> 00:30:30,000
So we've got the county population.

288
00:30:30,000 --> 00:30:37,000
And then we got the county code here.

289
00:30:37,000 --> 00:30:39,000
That's right.

290
00:30:39,000 --> 00:30:44,000
And I want to make sure this is equal equal to the county name.

291
00:30:44,000 --> 00:30:47,000
Because we've built a good county.

292
00:30:47,000 --> 00:30:50,000
I just call it pop graph.

293
00:30:50,000 --> 00:30:53,000
So.

294
00:30:53,000 --> 00:31:01,000
Long story short, we want to get this.

295
00:31:01,000 --> 00:31:04,000
OK.

296
00:31:04,000 --> 00:31:18,000
And then we would just want to calculate the licensees per capita in that county, which we did for the state. So we can use that same logic once again.

297
00:31:18,000 --> 00:31:23,000
So this is just going to be county licensees per capita.

298
00:31:23,000 --> 00:31:29,000
And that's just going to be.

299
00:31:29,000 --> 00:31:34,000
County licensees.

300
00:31:34,000 --> 00:31:38,000
Divided by the county population.

301
00:31:38,000 --> 00:31:43,000
Times a thousand, since these are in thousands of people.

302
00:31:43,000 --> 00:31:44,000
Cool.

303
00:31:44,000 --> 00:31:47,000
So we now should have that.

304
00:31:47,000 --> 00:31:50,000
And then.

305
00:31:50,000 --> 00:31:53,000
Basically, we want to keep track.

306
00:31:53,000 --> 00:32:02,000
Of the data and the way I'm doing it is in the dictionary of county.

307
00:32:02,000 --> 00:32:11,000
And then I actually want some variables here. And then basically I want to keep track of this as.

308
00:32:11,000 --> 00:32:18,000
Licensees per capita.

309
00:32:18,000 --> 00:32:23,000
That's how I'll keep track.

310
00:32:23,000 --> 00:32:27,000
Of this data.

311
00:32:27,000 --> 00:32:32,000
So.

312
00:32:32,000 --> 00:32:35,000
That should work.

313
00:32:35,000 --> 00:32:38,000
Let's just go ahead and.

314
00:32:38,000 --> 00:32:41,000
Print some.

315
00:32:41,000 --> 00:32:51,000
Print some data out here just so we can kind of see what we're doing as we're going. So we're just going to say OK.

316
00:32:51,000 --> 00:32:56,000
OK, what's the county.

317
00:32:56,000 --> 00:33:04,000
What's the population of the county.

318
00:33:04,000 --> 00:33:08,000
And.

319
00:33:08,000 --> 00:33:10,000
What's.

320
00:33:10,000 --> 00:33:20,000
The number of licensees.

321
00:33:20,000 --> 00:33:29,000
And then finally, what is the number of licensees per capita.

322
00:33:29,000 --> 00:33:41,000
So just go ahead and print that out for each county.

323
00:33:41,000 --> 00:33:48,000
And.

324
00:33:48,000 --> 00:33:59,000
I'm going to go ahead and just print a separating line here.

325
00:33:59,000 --> 00:34:10,000
And let's see what happens.

326
00:34:10,000 --> 00:34:23,000
We got an error and that's because I was not using this county name here. So let's just go ahead and define a new variable in the Pythonic way.

327
00:34:23,000 --> 00:34:27,000
County name.

328
00:34:27,000 --> 00:34:39,000
So simple enough.

329
00:34:39,000 --> 00:34:53,000
And we are cranking out some calculations here.

330
00:34:53,000 --> 00:34:55,000
Awesome.

331
00:34:55,000 --> 00:34:58,000
So we just printed out.

332
00:34:58,000 --> 00:35:01,000
Probably printed too many lines here.

333
00:35:01,000 --> 00:35:05,000
We just.

334
00:35:05,000 --> 00:35:15,000
I probably should have once again, print. Oh well, print county name. OK, so long story short, we've got our counties here.

335
00:35:15,000 --> 00:35:18,000
We've got the population.

336
00:35:18,000 --> 00:35:26,000
As of 2020. So keep in mind, this is the 2020 population. So all of these numbers are done with 2020 population.

337
00:35:26,000 --> 00:35:29,000
So.

338
00:35:29,000 --> 00:35:35,000
So we now have the number of licensees and then we also.

339
00:35:35,000 --> 00:35:38,000
So, for example.

340
00:35:38,000 --> 00:35:49,000
You can really start to compare these counties, right? And so this is where you may start to get a measure of competitiveness, right?

341
00:35:49,000 --> 00:35:57,000
Because, OK, Adair County has already issued 118 licenses.

342
00:35:57,000 --> 00:36:02,000
Atoka County has issued 76 licenses.

343
00:36:02,000 --> 00:36:05,000
However.

344
00:36:05,000 --> 00:36:16,000
Atoka may be slightly more saturated with licensees than Adair.

345
00:36:16,000 --> 00:36:31,000
Just slightly. And so this is where, OK, you may not necessarily just want to look at the total number of licenses issued per county.

346
00:36:31,000 --> 00:36:38,000
But you may actually want to measure them.

347
00:36:38,000 --> 00:36:41,000
Based on.

348
00:36:41,000 --> 00:36:46,000
On licensees per capita.
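[Editor's sketch] The comparison above, worked through in code. The license counts (118 and 76) come from the talk; the populations are placeholder values in thousands of people, assumed for illustration and not pulled from FRED. The dictionary layout and key names are also assumptions.
```python
# License counts as read out in the talk.
county_licensees = {"Adair": 118, "Atoka": 76}
# Placeholder populations in thousands of people (assumed, not FRED values).
county_population_thousands = {"Adair": 22.0, "Atoka": 13.8}
def licensees_per_capita(n_licensees, population_thousands):
    """Divide the license count by the head count (thousands times 1000)."""
    return n_licensees / (population_thousands * 1000)
county_data = {}
for county, n_licensees in county_licensees.items():
    population = county_population_thousands[county]
    county_data[county] = {
        "population": population * 1000,
        "total_licensees": n_licensees,
        "licensees_per_capita": licensees_per_capita(n_licensees, population),
    }
    print(county, county_data[county])
    print("-" * 40)
```
Under these assumed populations the county with fewer licenses ends up with the slightly higher per capita figure, which is the point being made about raw totals.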

349
00:36:46,000 --> 00:36:48,000
So.

350
00:36:48,000 --> 00:36:55,000
So we now have this data and so now we want to look at this data.

351
00:36:55,000 --> 00:37:03,000
So there's two ways we can do this, the easy and the hard way.

352
00:37:03,000 --> 00:37:13,000
So the easy way is just, if this just works with my JavaScript.

353
00:37:13,000 --> 00:37:26,000
If this just works with my JavaScript map here, if not, we'll have to make a map in Python.

354
00:37:26,000 --> 00:37:35,000
So I'm going to my try. Yes. So right now I just need to print.

355
00:37:35,000 --> 00:37:43,000
Right now I just need to print the licensees per capita.

356
00:37:43,000 --> 00:37:49,000
And then let's see if this data.

357
00:37:49,000 --> 00:37:54,000
Just works nicely.

358
00:37:54,000 --> 00:38:03,000
With

359
00:38:03,000 --> 00:38:08,000
this Oklahoma County map that I've been tinkering with.

360
00:38:08,000 --> 00:38:13,000
So

361
00:38:13,000 --> 00:38:19,000
hopefully

362
00:38:19,000 --> 00:38:31,000
this will be a bit of a problem.

363
00:38:31,000 --> 00:38:55,000
Okay, not the end of the world here. So real quick like I just need to format.

364
00:38:55,000 --> 00:39:07,000
The name where the spaces are replaced with underscores.

365
00:39:07,000 --> 00:39:18,000
Sort of a pain here and I'm sorry Fred that I'm about to make a bunch of more API calls, but that's what we're going to do.

366
00:39:18,000 --> 00:39:28,000
Okay.

367
00:39:28,000 --> 00:39:46,000
But for some reason I don't think we'll have a choropleth right out of the box.

368
00:39:46,000 --> 00:39:59,000
Unless we're somehow able to.

369
00:39:59,000 --> 00:40:01,000
Yeah.

370
00:40:01,000 --> 00:40:06,000
Okay.

371
00:40:06,000 --> 00:40:16,000
So I'll try to do this.

372
00:40:16,000 --> 00:40:23,000
This could be interesting. So long story short, we're going to try to visualize this data.

373
00:40:23,000 --> 00:40:31,000
Which is tricky right because it's state data, it's geographical data, but

374
00:40:31,000 --> 00:40:40,000
we're going to try here. So long story short, we are trying to color this map.

375
00:40:40,000 --> 00:40:49,000
So we're trying to color this with places with more licensees to be darker than the places with fewer licensees.
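[Editor's sketch] A simple linear color scale for that idea, written in Python for consistency with the rest of the session even though the map itself is JavaScript. The green palette, the lightness range, and the sample values are all assumptions.
```python
def shade(value, low, high):
    """Map a value in [low, high] to a hex green; higher values are darker."""
    if high == low:
        fraction = 0.0
    else:
        fraction = (value - low) / (high - low)
    # Clamp so out-of-range values still get a valid color.
    fraction = min(max(fraction, 0.0), 1.0)
    # Interpolate the green level from 230 (pale) down to 60 (dark).
    level = round(230 - fraction * (230 - 60))
    return f"#{level // 2:02x}{level:02x}{level // 2:02x}"
# Hypothetical licensees per capita values keyed by county.
per_capita = {"County_A": 0.002, "County_B": 0.004, "County_C": 0.006}
low, high = min(per_capita.values()), max(per_capita.values())
colors = {county: shade(value, low, high) for county, value in per_capita.items()}
print(colors)
```
The resulting hex strings could then be passed to whatever sets the fill color of each county shape.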

376
00:40:49,000 --> 00:40:54,000
So going to show you a little bit of web programming here.

377
00:40:54,000 --> 00:41:08,000
So we're going to just dump in our data, just for the time being. Normally you would get your data from the server, you know.

378
00:41:08,000 --> 00:41:16,000
Right. Normally you get that from your database. Okay, so we've got our data here.

379
00:41:16,000 --> 00:41:27,000
What we can do is we can actually add a little tooltip.

380
00:41:27,000 --> 00:41:47,000
Bear with me.

381
00:41:47,000 --> 00:41:59,000
So okay, first things first, let's just see if we can't just plot all this data here on the map.

382
00:41:59,000 --> 00:42:06,000
Okay, cool. So now we can at least see the numbers.

383
00:42:06,000 --> 00:42:14,000
However, we want to get them colored.

384
00:42:14,000 --> 00:42:27,000
Also, I wonder if we can do something like, I forget what it is off the top of my head, but

385
00:42:27,000 --> 00:42:33,000
I don't think it was that.

386
00:42:33,000 --> 00:42:41,000
Well, this is what Google's for.

387
00:42:41,000 --> 00:42:57,000
Oh, it's toFixed. Okay.

388
00:42:57,000 --> 00:43:03,000
All right. And let's give it a few more decimal places here.
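The map in the session is JavaScript, where toFixed sets the number of decimal places. A rough Python counterpart, assuming a hypothetical per capita value, might look like this:
```python
rate = 0.00234567  # hypothetical licensees-per-capita value
# With two decimal places a small rate collapses to zero; five keeps it readable.
label_short = f"{rate:.2f}"
label_long = f"{rate:.5f}"
print(label_short, label_long)
```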

389
00:43:03,000 --> 00:43:07,000
Awesome.

390
00:43:07,000 --> 00:43:14,000
So now we have licensees per capita

391
00:43:14,000 --> 00:43:16,000
by county in Oklahoma.

392
00:43:16,000 --> 00:43:22,000
And so we've got 15 minutes left. Let's see if we can't color these.

393
00:43:22,000 --> 00:43:27,000
So the

394
00:43:27,000 --> 00:43:50,000
Sometimes there's a proper way and an improper way to do things. So the proper way would be if somehow

395
00:43:50,000 --> 00:44:01,000
we could

396
00:44:01,000 --> 00:44:05,000
I wonder

397
00:44:05,000 --> 00:44:15,000
if this variable is going to be accessible right here.

398
00:44:15,000 --> 00:44:37,000
Doesn't look like it.

399
00:44:37,000 --> 00:44:42,000
Interesting.

400
00:44:42,000 --> 00:45:07,000
Okay, so we may have to resort.

401
00:45:07,000 --> 00:45:25,000
Okay.

402
00:45:25,000 --> 00:45:38,000
So the problem is, this is probably like much

403
00:45:38,000 --> 00:45:56,000
smaller than the number necessary to give us a good range.

404
00:45:56,000 --> 00:46:05,000
Okay, so this may end up waiting until later, or we can maybe try a Python choropleth real quick.

405
00:46:05,000 --> 00:46:17,000
Just trying to scale this a bit.
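One common way to handle values that are too small to give a good color range is min-max scaling before picking a shade. This is a sketch under assumptions, not the code from the session: the function names, the colors, and the sample figures are all hypothetical.
```python
def scale_to_unit_range(values: dict) -> dict:
    """Min-max scale so the smallest value maps to 0.0 and the largest to 1.0."""
    low, high = min(values.values()), max(values.values())
    span = high - low
    if span == 0:
        # All counties share one value, so there is no range to spread out.
        return {key: 0.0 for key in values}
    return {key: (value - low) / span for key, value in values.items()}
#
def shade_to_hex(shade: float) -> str:
    """Map 0.0 to white and 1.0 to dark green, so more licensees means darker."""
    light, dark = (255, 255, 255), (0, 68, 27)
    channels = [round(a + (b - a) * shade) for a, b in zip(light, dark)]
    return "#{:02x}{:02x}{:02x}".format(*channels)
# Hypothetical per capita figures, for illustration only.
per_capita = {"County_A": 0.0005, "County_B": 0.0020, "County_C": 0.0035}
colors = {k: shade_to_hex(s) for k, s in scale_to_unit_range(per_capita).items()}
print(colors)
```
Scaling first means the colors depend on where a county sits between the lowest and highest values, not on how tiny the raw per capita numbers are.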

406
00:46:17,000 --> 00:46:35,000
Last shot here.

407
00:46:35,000 --> 00:46:49,000
Hmm.

408
00:46:49,000 --> 00:47:06,000
Okay, so we may not be able to get this JavaScript working right this second. So let's see if we can't.

409
00:47:06,000 --> 00:47:22,000
I'm not confident that I can get a choropleth going here in 10 minutes in Python.

410
00:47:22,000 --> 00:47:37,000
Well, okay, here's the deal. We'll try it. We'll give it 10 minutes and see if we can't make a choropleth of this data in Oklahoma in Python. And then if not, I'll finish this later.

411
00:47:37,000 --> 00:47:46,000
I'll finish this plot later today and then post that link. So this is how you make a choropleth in Python.

412
00:47:46,000 --> 00:48:14,000
So

413
00:48:14,000 --> 00:48:21,000
Well, it may not actually be as hard as one may think.

414
00:48:21,000 --> 00:48:28,000
So let's see if we can use this handy dandy

415
00:48:28,000 --> 00:48:47,000
package and do this work for us, right, because we're not trying to reinvent the wheel here. Okay, so let's see what this yields us.

416
00:48:47,000 --> 00:48:50,000
Okay, do we just have our county names here?

417
00:48:50,000 --> 00:48:58,000
Okay, those look like the Oklahoma counties. So that's cool.

418
00:48:58,000 --> 00:49:10,000
We already have the licensees.

419
00:49:10,000 --> 00:49:22,000
And so what was this choropleth doing?

420
00:49:22,000 --> 00:49:36,000
So let's just grab all of this.

421
00:49:36,000 --> 00:49:42,000
And so we've got our county names, and we're just trying to make this choropleth.

422
00:49:42,000 --> 00:49:49,000
Okay, so we're, so it says, okay, so now we define the county shapes.

423
00:49:49,000 --> 00:49:57,000
And let's hope these are defined already through Bokeh.

424
00:49:57,000 --> 00:50:01,000
It looks like they are. That's awesome.

425
00:50:01,000 --> 00:50:09,000
Looks like we need a couple of Bokeh's plotting tools.

426
00:50:09,000 --> 00:50:24,000
That's okay with us.

427
00:50:24,000 --> 00:50:38,000
Okay, so the main thing we need are values. That actually shouldn't be the hardest thing to do.

428
00:50:38,000 --> 00:50:44,000
Maybe just partition our values.

429
00:50:44,000 --> 00:50:49,000
So, you know,

430
00:50:49,000 --> 00:50:54,000
let's just do it in the order that county names is in.

431
00:50:54,000 --> 00:51:04,000
So we basically want to define our values here.

432
00:51:04,000 --> 00:51:15,000
And so we'll want to account for the case where any of these counties don't have any licensees.

433
00:51:15,000 --> 00:51:26,000
So we want to get licensees per county.

434
00:51:26,000 --> 00:51:32,000
So we want to get the value for that county name.

435
00:51:32,000 --> 00:51:42,000
If we don't find anything,

436
00:51:42,000 --> 00:51:57,000
then we don't want it.

437
00:51:57,000 --> 00:52:02,000
And this is just because this is how we,

438
00:52:02,000 --> 00:52:06,000
well, how I formatted this data above.

439
00:52:06,000 --> 00:52:10,000
Okay, actually, I'm going to need to just do something here.

440
00:52:10,000 --> 00:52:14,000
Okay, so this just didn't work.

441
00:52:14,000 --> 00:52:23,000
I may revisit this later.

442
00:52:23,000 --> 00:52:26,000
Trying to do this on the quick. I think I can finish.

443
00:52:26,000 --> 00:52:31,000
Okay, so we've got our licensees per capita.

444
00:52:31,000 --> 00:52:40,000
And then let's just.

445
00:52:40,000 --> 00:52:45,000
Let's just do this simple, right? That's sort of the Candlesticks model.

446
00:52:45,000 --> 00:52:49,000
And I'm going to have to clean up all this code later.

447
00:52:49,000 --> 00:52:57,000
Okay, so we've got a much simpler dictionary now, which maps county name to value.

448
00:52:57,000 --> 00:53:04,000
So that should make these values much easier to get.

449
00:53:04,000 --> 00:53:12,000
Right now, our value is just going to be licensees per capita.

450
00:53:12,000 --> 00:53:16,000
Okay, much simpler.

451
00:53:16,000 --> 00:53:30,000
So we've got all our values.
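A minimal sketch of that lookup, assuming a list of county names in the order the map expects and a simple name-to-value dictionary. The variable names and numbers are hypothetical, and defaulting missing counties to 0.0 is an assumption:
```python
# Hypothetical stand-ins: the county order a map expects and a name-to-value lookup.
county_names = ["Adair", "Alfalfa", "Atoka"]
licensees_per_capita = {"Adair": 0.0031, "Atoka": 0.0027}
# Build the values in county_names order; counties with no data fall back to 0.0.
values = [licensees_per_capita.get(name, 0.0) for name in county_names]
print(values)
```
Keeping the values in the same order as the county names matters because plotting libraries typically pair shapes and values by position.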

452
00:53:30,000 --> 00:53:47,000
Okay, so this is the Oklahoma cannabis licensees per capita.

453
00:53:47,000 --> 00:53:54,000
Well, let's see what happens.

454
00:53:54,000 --> 00:54:00,000
We've got five minutes left to debug.

455
00:54:00,000 --> 00:54:04,000
If anything goes askew.

456
00:54:04,000 --> 00:54:08,000
And if not, I'll get the JavaScript up and running today.

457
00:54:08,000 --> 00:54:11,000
So thanks for bearing with.

458
00:54:11,000 --> 00:54:17,000
Thanks for bearing with me as we went through this exploration here, but you saw firsthand

459
00:54:17,000 --> 00:54:29,000
How we go about scrounging the licensees data, supplementing it with population data from the Federal Reserve.

460
00:54:29,000 --> 00:54:33,000
Calculating licensees per capita.

461
00:54:33,000 --> 00:54:36,000
And.

462
00:54:36,000 --> 00:54:43,000
Ideally, providing a visualization here.
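The per capita step described above can be sketched as follows, assuming licensee counts and populations are already keyed by county. The function name, the keys, and the figures are hypothetical and are not the session's data:
```python
def licensees_per_capita(licensees: dict, population: dict) -> dict:
    """Divide each county's licensee count by its population."""
    result = {}
    for county, count in licensees.items():
        residents = population.get(county)
        # Skip counties with missing or zero population rather than guessing.
        if residents:
            result[county] = count / residents
    return result
# Hypothetical counts and populations, for illustration only.
counts = {"County_A": 40, "County_B": 12, "County_C": 5}
populations = {"County_A": 20000, "County_B": 8000}
rates = licensees_per_capita(counts, populations)
print(rates)
```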

463
00:54:43,000 --> 00:54:55,000
And we've got an obscure error that I'm not certain I can debug right here on the fly.

464
00:54:55,000 --> 00:55:14,000
One can hope it's just some problem with these tooltips or something.

465
00:55:14,000 --> 00:55:25,000
But.

466
00:55:25,000 --> 00:55:47,000
So we unfortunately.

467
00:55:47,000 --> 00:56:15,000
I don't think it's the values, so there must be something.

468
00:56:15,000 --> 00:56:19,000
What.

469
00:56:19,000 --> 00:56:34,000
What is going so.

470
00:56:34,000 --> 00:56:48,000
So let's just try to district.

471
00:56:48,000 --> 00:57:13,000
Oh.

472
00:57:13,000 --> 00:57:34,000
Probably because we don't have a plot.

473
00:57:34,000 --> 00:57:44,000
That's so bizarre.

474
00:57:44,000 --> 00:58:13,000
So.

475
00:58:13,000 --> 00:58:21,000
So we may have to give it one shot here in Spyder, just

476
00:58:21,000 --> 00:58:25,000
Just to see if there's something just going on with the plotting.

477
00:58:25,000 --> 00:58:29,000
But otherwise, I may have to send

478
00:58:29,000 --> 00:58:34,000
the final output to you afterwards.

479
00:58:34,000 --> 00:58:49,000
Just a little bit of a letdown because I was real curious to see this data and talk about it here.

480
00:58:49,000 --> 00:59:17,000
Let's just do one quick Google search and then we may have to just.

481
00:59:17,000 --> 00:59:21,000
Oh, that's right. So it gives us.

482
00:59:21,000 --> 00:59:23,000
This HTML.

483
00:59:23,000 --> 00:59:31,000
Hold on. We may just have a plot here.

484
00:59:31,000 --> 00:59:36,000
That's bizarre.

485
00:59:36,000 --> 00:59:40,000
We may just have a plot.

486
00:59:40,000 --> 00:59:48,000
That would be bizarre if.

487
00:59:48,000 --> 00:59:53,000
Well, by golly.

488
00:59:53,000 --> 00:59:56,000
Something happened.

489
00:59:56,000 --> 01:00:10,000
I don't think this is quite what we wanted, is it? So this appears to be Oklahoma.

490
01:00:10,000 --> 01:00:18,000
With only some of the counties shaded in.

491
01:00:18,000 --> 01:00:26,000
And so.

492
01:00:26,000 --> 01:00:31,000
So long story short, going to need to touch this up a bit.

493
01:00:31,000 --> 01:00:37,000
But what you can see right off the bat here.

494
01:00:37,000 --> 01:00:45,000
So there's going to be a lot of cleanup that needs to be done. So thank you for bearing with this initial foray.

495
01:00:45,000 --> 01:00:48,000
But this is I mean, this is interesting right off the bat.

496
01:00:48,000 --> 01:00:51,000
So it's going to need to be cleaned up a bit.

497
01:00:51,000 --> 01:00:55,000
But as you can see.

498
01:00:55,000 --> 01:01:01,000
The counties that even permit cannabis licensees, it looks like, are limited.

499
01:01:01,000 --> 01:01:06,000
And so I want to get the actual numbers here and get them shaded in.

500
01:01:06,000 --> 01:01:18,000
But surprisingly, it looks like a wide swath of the licensees are in this.

501
01:01:18,000 --> 01:01:21,000
Eastern Crescent.

502
01:01:21,000 --> 01:01:23,000
Here.

503
01:01:23,000 --> 01:01:26,000
So.

504
01:01:26,000 --> 01:01:31,000
So I apologize that these maps were.

505
01:01:31,000 --> 01:01:39,000
Finicky and not behaving well. And so that's the thing with geographical data: it is

506
01:01:39,000 --> 01:01:48,000
A wrangling process to get these maps to work. And so I'm going to spend a bit of time today and get these cleaned up because this is a little embarrassing.

507
01:01:48,000 --> 01:01:52,000
And so for next week, I can point you at.

508
01:01:52,000 --> 01:02:02,000
I can point you to.

509
01:02:02,000 --> 01:02:05,000
State data trove.

510
01:02:05,000 --> 01:02:09,000
As we fill in state by state.

511
01:02:09,000 --> 01:02:15,000
And so and hopefully county by county.

512
01:02:15,000 --> 01:02:23,000
So it's going to be a little bit of a journey here getting all this data.

513
01:02:23,000 --> 01:02:26,000
Processed and presentable.

514
01:02:26,000 --> 01:02:29,000
However, we're on the right road.

515
01:02:29,000 --> 01:02:32,000
So.

516
01:02:32,000 --> 01:02:38,000
I did my best to present this data. We'll give it another go next week.

517
01:02:38,000 --> 01:02:43,000
So thanks for bearing with this ad hoc data presentation here.

518
01:02:43,000 --> 01:02:49,000
And I'm going to go ahead and wrap it up there.

519
01:02:49,000 --> 01:02:53,000
If there are any.

520
01:02:53,000 --> 01:03:02,000
Questions I can field a couple of questions before concluding.

521
01:03:02,000 --> 01:03:05,000
Yeah, I actually had a question.

522
01:03:05,000 --> 01:03:21,000
By all means. So, sorry about earlier. I was having trouble getting my microphone unblocked, but I'm pretty new to data science. I was just wondering why you decided to pick Python to examine your data.

523
01:03:21,000 --> 01:03:27,000
Good question. That's sort of my tool of choice. So.

524
01:03:27,000 --> 01:03:41,000
It's just what I'm most comfortable with. So I work with a couple of people and they like to use R, so that's their first go-to choice. So for me,

525
01:03:41,000 --> 01:03:53,000
I find it simple. It just generally reads like English. I find it powerful. Basically, I find if I want to do something with Python.

526
01:03:53,000 --> 01:04:02,000
And it can be done on a computer. I can do it, which is awesome because then like you saw today.

527
01:04:02,000 --> 01:04:18,000
The Oklahoma licensees data is in a bunch of PDFs. And so I found, okay, well, that's on a computer. I would normally download these PDFs and put them into Excel by hand.

528
01:04:18,000 --> 01:04:28,000
Manually. With Python, I can download all those PDFs, scrounge all the data, and get it into Excel with the click of a button.

529
01:04:28,000 --> 01:04:37,000
I'm sure you can do this with other programming languages. So I'm sure you could potentially do this with R.

530
01:04:37,000 --> 01:04:42,000
You could definitely do this with C or.

531
01:04:42,000 --> 01:04:51,000
Or who knows? I'm not well versed with programming languages, but, you know, there could be Java or

532
01:04:51,000 --> 01:05:02,000
Potentially PHP, what have you. So to me, I see them as tools, you know, means to an end. So

533
01:05:02,000 --> 01:05:14,000
I just picked Python because I knew I could. Well, I didn't quite get the job done, but I knew that, given more time, I could get the job done. So I knew I could get the job done with Python.

534
01:05:14,000 --> 01:05:24,000
So that's why I chose it. So does that answer your question? Yeah, for sure. Thank you.

535
01:05:24,000 --> 01:05:33,000
How about yourself? Do you have any go-to tools that you use for data science? So far, no.

536
01:05:33,000 --> 01:05:45,000
I'm going through the Thinkful program at the moment and I finished with Excel and then I'm pretty close to finishing SQL, but I haven't started with the other ones yet.

537
01:05:45,000 --> 01:05:54,000
OK, so the SQL will be OK. That's where OK, yes, we need to store our data in the database.

538
01:05:54,000 --> 01:06:03,000
Myself, I'm not the biggest SQL expert. A lot of the time I use NoSQL, Firebase.

539
01:06:03,000 --> 01:06:15,000
And so that's like JSON-like, dictionary-formatted data. For me, it's a bit more natural and

540
01:06:15,000 --> 01:06:33,000
I gravitate towards it. So my recommendation, of course, I'm biased, but my recommendation would be that you may want to try to install Python and try to do a little bit.

541
01:06:33,000 --> 01:06:48,000
Of course, try to figure out, OK, what are some of the repetitive tasks that you find yourself doing? Oh, I'm downloading this data set every week, or, oh,

542
01:06:48,000 --> 01:06:58,000
I keep searching for these journal articles or this or that and just find things that you do repeatedly and automate them.

543
01:06:58,000 --> 01:07:07,000
And so you could set up what's called a cron job, so that just has your Python script run automatically.

544
01:07:07,000 --> 01:07:13,000
So you could have your Python script run every Monday and.

545
01:07:13,000 --> 01:07:19,000
Get you data for the week, so. That's my personal recommendation.
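A sketch of what such a weekly script could look like. The crontab line in the comment uses the standard five-field cron format; the paths, the script name, and the placeholder task are assumptions:
```python
# Example crontab entry (paths are assumptions): run at 08:00 every Monday.
#   0 8 * * 1 /usr/bin/python3 /path/to/weekly_data.py
from datetime import date, timedelta
def week_start(today: date) -> date:
    """Return the Monday of the week containing `today`."""
    return today - timedelta(days=today.weekday())
def main() -> None:
    start = week_start(date.today())
    # Placeholder for the real work, such as downloading that week's data set.
    print(f"Collecting data for the week of {start.isoformat()}")
if __name__ == "__main__":
    main()
```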

546
01:07:19,000 --> 01:07:30,000
Python's got a place close to my heart. So maybe it'll find a place near your heart. Yeah, hopefully.

547
01:07:30,000 --> 01:07:38,000
I think we're learning that next, so OK, well, my recommendation is.

548
01:07:38,000 --> 01:07:44,000
Make it your own. So like a lot of times,

549
01:07:44,000 --> 01:07:52,000
You know, you get out of things what you put into them. And so I learned Python in a

550
01:07:52,000 --> 01:08:06,000
Computational economics course and it was one of those things where I just fell in love with it and I realized, oh wow, I can make really cool plots with Python.

551
01:08:06,000 --> 01:08:15,000
And so I just went out of my way to whenever I needed to make a good looking plot.

552
01:08:15,000 --> 01:08:21,000
I would do it with Python, no matter how hard or arduous.

553
01:08:21,000 --> 01:08:28,000
It was so. But it was worth it. It paid off because.

554
01:08:28,000 --> 01:08:33,000
The cool thing is, once you write the code, you have it so.

555
01:08:33,000 --> 01:08:38,000
Although it wasn't that clean, right? I was digging up this old

556
01:08:38,000 --> 01:08:47,000
Bokeh choropleth chart that we had done months and months ago, and so with a bit more tinkering, we could get it to work.

557
01:08:47,000 --> 01:08:55,000
But long story short, you know you can, and that's the importance of writing clean code: so you can just pick it back up months later and use it.

558
01:08:55,000 --> 01:09:00,000
That was not clean code; I could not pick it back up and use it. That was unfortunate.

559
01:09:00,000 --> 01:09:07,000
So long story short, you can write these awesome tools for yourself, which will.

560
01:09:07,000 --> 01:09:12,000
Be there in perpetuity and

561
01:09:12,000 --> 01:09:16,000
Sky's the limit, so that's my.

562
01:09:16,000 --> 01:09:20,000
Promo for Python. Thank you.

563
01:09:20,000 --> 01:09:23,000
Yes.

564
01:09:23,000 --> 01:09:30,000
On that note, future Pythonistas, I want to go ahead and wrap it up for today.

565
01:09:30,000 --> 01:09:33,000
Just don't want to just.

566
01:09:33,000 --> 01:09:38,000
Just, you know, stretch it too long, but

567
01:09:38,000 --> 01:09:46,000
Like I said, there's still more cleanup to be done here. So for next week we can have some real beautiful looking charts.

568
01:09:46,000 --> 01:09:54,000
And so that's what I'll be tinkering on, but you can at least I think it's important to see the process because it's not like.

569
01:09:54,000 --> 01:10:05,000
Just someone waves a magic wand or you know flashes their little curtain or what have you and then.

570
01:10:05,000 --> 01:10:14,000
Yeah, so people don't just wave a wand and things are made. There's tinkering to be done and there are mistakes to be made along the way.

571
01:10:14,000 --> 01:10:19,000
And so, as you saw, we've got some ugly-looking charts along the way. We're making some mistakes.

572
01:10:19,000 --> 01:10:23,000
We're figuring this out, but

573
01:10:23,000 --> 01:10:31,000
You'll see next week the sausage will hopefully we'll get some good visualizations out of it.

574
01:10:31,000 --> 01:10:36,000
So that's the idea.

575
01:10:36,000 --> 01:10:38,000
On that note.

576
01:10:38,000 --> 01:10:48,000
Thank you both for attending, for coming today, and definitely feel free to reach out if you have anything you want specifically covered in the coming weeks.

577
01:10:48,000 --> 01:11:04,000
And then next week we'll be moving on to more states and just trying to fill in this map and get all the public cannabis data that we can and make it available.

578
01:11:04,000 --> 01:11:10,000
All right, I'm going to end the recording here and see you both next week.

