1
00:00:00,000 --> 00:00:06,640
So let's talk about James, Dancing Greek Statues. Had a one-and-a-half-hour talk with him.

2
00:00:06,640 --> 00:00:09,600
Now he's a master of ControlNets inside of Stable Diffusion.

3
00:00:09,600 --> 00:00:30,800
And Elon Musk replied. James was like, we had to reply immediately back.

4
00:00:30,800 --> 00:00:34,160
Ladies and gentlemen, boys and girls, children of all ages, dogs, cats,

5
00:00:34,160 --> 00:00:40,320
robots and everybody in between, especially you first time viewers on YouTube.

6
00:00:40,320 --> 00:00:42,000
That's right. We got a live podcast.

7
00:00:42,000 --> 00:00:46,400
Everybody, welcome to HTTTA, How to Talk to AI.

8
00:00:46,400 --> 00:00:53,920
I am your host in the flesh, the Synthmind, Wes, and as always I am joined by the gush,

9
00:00:53,920 --> 00:01:00,800
the gracious, the globe trotting, the galvanizing force of nature herself.

10
00:01:00,800 --> 00:01:04,400
Never ghastly, always graceful, the Miss Goda Go herself.

11
00:01:04,400 --> 00:01:05,440
Gee, how are you?

12
00:01:05,440 --> 00:01:07,360
I'm great. Thank you so much.

13
00:01:07,360 --> 00:01:10,800
You know, I'm always giggling with a huge smile.

14
00:01:10,800 --> 00:01:14,320
You're amazing. Your introductions are fantastic.

15
00:01:14,320 --> 00:01:16,480
I'm excited about this video.

16
00:01:16,480 --> 00:01:18,080
Let's see how that goes.

17
00:01:18,080 --> 00:01:19,600
That's definitely something new.

18
00:01:20,080 --> 00:01:20,400
Yes.

19
00:01:20,400 --> 00:01:22,400
And not entirely new.

20
00:01:22,400 --> 00:01:24,960
You have a little more experience in this realm than me.

21
00:01:24,960 --> 00:01:26,960
Yeah, but it doesn't get easier.

22
00:01:26,960 --> 00:01:28,320
So right to that.

23
00:01:28,320 --> 00:01:29,920
Hey, let's jump right in.

24
00:01:29,920 --> 00:01:38,640
I recall a few months back, you did a video on Runway ML's Gen-1 text to image generation.

25
00:01:38,640 --> 00:01:42,320
Do you want to just fill people in on kind of like your experience using that?

26
00:01:42,320 --> 00:01:45,120
Because we have some news that Gen-2 is out now.

27
00:01:45,120 --> 00:01:47,760
Right. So I tested Gen-1.

28
00:01:47,760 --> 00:01:52,800
I don't know if we talked about it, but my experience was that it was definitely not

29
00:01:52,800 --> 00:01:57,120
easy with Gen-1, but it was also a beta version in Discord.

30
00:01:57,120 --> 00:02:03,440
So to get something consistent, like suddenly, for example, you're prompting something like

31
00:02:03,440 --> 00:02:07,200
a character and suddenly I would get a man with a beard.

32
00:02:07,200 --> 00:02:10,560
And I was like, is it trying to tell me something?

33
00:02:10,560 --> 00:02:18,080
But that being said, Gen-2 came out and that's definitely a little bit different right now.

34
00:02:18,080 --> 00:02:20,080
And everyone is now snapping fingers.

35
00:02:20,080 --> 00:02:21,440
It's so funny.

36
00:02:21,440 --> 00:02:22,240
It is pretty funny.

37
00:02:22,240 --> 00:02:27,920
Runway is offering on their website a little tutorial video on how to use Gen-1.

38
00:02:27,920 --> 00:02:31,600
And I think some of the flow, some of the content is even more compelling.

39
00:02:31,600 --> 00:02:35,440
And you see what's the difference because now I think on the paid version, because I

40
00:02:35,440 --> 00:02:41,600
tried free, I'm going to check out paid one too, but there's like sliders, the important

41
00:02:41,600 --> 00:02:45,680
part, this like connectivity with the footage that you prompt.

42
00:02:45,680 --> 00:02:50,160
If you see all these videos, they are all based on original footage.

43
00:02:50,160 --> 00:02:56,000
So you upload a video and then you have a slider on how much of that video it takes

44
00:02:56,000 --> 00:02:57,360
as a reference point.

45
00:02:57,360 --> 00:03:02,480
This is like in Stable Diffusion, you know, when you control opacity of how much image

46
00:03:02,480 --> 00:03:06,000
is influencing your final image versus text prompt.

47
00:03:06,000 --> 00:03:06,400
Okay.

48
00:03:06,400 --> 00:03:13,120
I think what is driving this type of results, but you can actually see faces and body

49
00:03:13,120 --> 00:03:19,280
structures or even fingers more accurate than if you just prompted; you have to have like

50
00:03:19,280 --> 00:03:20,720
a video as a reference.

51
00:03:20,720 --> 00:03:26,880
So image to image, video to video, something to kind of give the model.

52
00:03:26,880 --> 00:03:27,360
Yeah.

53
00:03:27,360 --> 00:03:28,320
Some, some lifting.

54
00:03:28,320 --> 00:03:28,960
And you know what?

55
00:03:28,960 --> 00:03:34,880
I will say in prior image generations where I'm looking for very specific details,

56
00:03:34,880 --> 00:03:39,920
someone's holding something a specific way, the lighting conditions need to be such a

57
00:03:39,920 --> 00:03:44,320
way. I know in Midjourney, Stable Diffusion, you could do an image sample in addition to

58
00:03:44,320 --> 00:03:45,120
your prompt.

59
00:03:45,120 --> 00:03:49,440
And oftentimes the model just goes, okay, I got it.

60
00:03:49,440 --> 00:03:53,040
That, that helps me, you know, not try to have to figure everything out.

61
00:03:53,040 --> 00:03:59,360
You know, it's interesting because with AI video, it's kind of this two paths of the

62
00:03:59,360 --> 00:04:04,800
ones which are using video as a reference of you and kind of overlaying whatever you

63
00:04:04,800 --> 00:04:05,440
want.

64
00:04:05,440 --> 00:04:12,800
And another one is this AI generated avatars where you take the image or the model trained

65
00:04:12,800 --> 00:04:16,960
on your realistic images, and then you feed the text.

66
00:04:16,960 --> 00:04:20,640
It's two different directions of the control of the output.

67
00:04:20,640 --> 00:04:25,120
I mean, it's, it seems like at some point you're going to have to like go, all right,

68
00:04:25,120 --> 00:04:32,000
I have to lean on the model to generate something compelling, but I also as a director, as

69
00:04:32,000 --> 00:04:38,160
producer, as the person creating this, I have a specific vision and I do need some deterministic

70
00:04:38,160 --> 00:04:40,720
lever in the results to achieve that.

71
00:04:40,720 --> 00:04:45,840
I wanted to share with you this kind of back-to-reality experience.

72
00:04:45,840 --> 00:04:48,960
So let's talk about James, Dancing Greek Statues.

73
00:04:49,840 --> 00:04:52,480
Had one and a half hour talk with him.

74
00:04:52,480 --> 00:04:54,880
He's an amazing artist and we were just geeking out.

75
00:04:54,880 --> 00:04:57,120
He was like, I didn't know that you're painting.

76
00:04:57,120 --> 00:04:58,400
He went on my Instagram.

77
00:04:58,400 --> 00:05:02,800
Check out the last two weeks' newsletters for a little bit more of James's work from episode

78
00:05:02,800 --> 00:05:06,800
nine and, and some of Goda's art that we just talked about last week.

79
00:05:06,800 --> 00:05:13,440
Right, so, so we kind of like bonded in this artistic style and he's just, yeah, brilliant

80
00:05:13,440 --> 00:05:18,000
and it's so amazing to hear, you know, that his work took off like that.

81
00:05:18,560 --> 00:05:21,840
But then I was showing his work to everybody.

82
00:05:21,840 --> 00:05:24,320
This is just how I was just like, you have to see this.

83
00:05:24,320 --> 00:05:29,680
And we showed it to our family, the parents, parents-in-law, we were like,

84
00:05:29,680 --> 00:05:32,640
we watched it and they're like, okay, it's nice animation.

85
00:05:32,640 --> 00:05:36,320
And I was like, you don't understand.

86
00:05:36,320 --> 00:05:38,320
This is not animation.

87
00:05:38,320 --> 00:05:40,320
Just Elon Musk replied.

88
00:05:41,120 --> 00:05:47,200
He was like, we had to reply immediately back, but we don't know if Elon Musk saw that,

89
00:05:47,920 --> 00:05:51,840
you know, his version because we did with Elon Musk face on it.

90
00:05:51,840 --> 00:05:52,240
Yep.

91
00:05:52,240 --> 00:05:54,000
You know, that's some good coverage.

92
00:05:54,000 --> 00:05:56,800
Unlike our little video, this is our, our pal James.

93
00:05:56,800 --> 00:05:58,240
We talked about it for a couple of weeks now.

94
00:05:58,240 --> 00:06:01,280
He's a master of ControlNets inside of Stable Diffusion and

95
00:06:01,280 --> 00:06:04,960
just so many possibilities, such, such fun stuff there.

96
00:06:04,960 --> 00:06:09,840
And as we say often on here, this, these kinds of things, they're only going to get better.

97
00:06:09,840 --> 00:06:11,680
This is the worst they'll ever be.

98
00:06:11,680 --> 00:06:13,120
And they're already compelling.

99
00:06:13,120 --> 00:06:16,880
His stuff has been viewed in the last two, three weeks, over 200 million times.

100
00:06:16,880 --> 00:06:21,680
But going back to this kind of, you know, stepping outside of the AI bubble,

101
00:06:21,680 --> 00:06:25,440
it is interesting to me how the outside world perceives these things,

102
00:06:25,440 --> 00:06:29,120
because when at the end of the day, there is going to be value attached to that.

103
00:06:29,120 --> 00:06:29,360
Right.

104
00:06:29,360 --> 00:06:29,760
Yeah.

105
00:06:29,760 --> 00:06:34,480
So while we are geeking out and we're like, oh my God, like I will just pay you as much as possible.

106
00:06:34,480 --> 00:06:36,240
I want something like that.

107
00:06:36,240 --> 00:06:43,360
The public sees this as we said, oh, nice animation as it's like some sort of just,

108
00:06:43,360 --> 00:06:45,360
I don't know, premiere pro package.

109
00:06:45,360 --> 00:06:46,320
Oh, yeah.

110
00:06:46,320 --> 00:06:47,040
That's some frame.

111
00:06:47,040 --> 00:06:48,720
That's the only frame of reference they have.

112
00:06:48,720 --> 00:06:49,200
I don't know.

113
00:06:49,200 --> 00:06:53,600
For me, that was kind of interesting insight because yeah, at the end of the day,

114
00:06:54,400 --> 00:06:58,640
these things will be generalized and actually made available.

115
00:06:58,640 --> 00:07:02,720
So, you know, not to minimize the quality and value of James's work.

116
00:07:02,720 --> 00:07:07,600
But I also, because of that, I saw, I think two apps popping up saying that,

117
00:07:07,600 --> 00:07:10,960
oh, just upload a video and we generate AI video.

118
00:07:10,960 --> 00:07:15,520
And it is just kind of, it is this visual effect applied to it.

119
00:07:15,520 --> 00:07:18,640
I really don't believe that they run the whole thing.

120
00:07:18,640 --> 00:07:23,120
Because when we talk to James, like if people think that this is just like

121
00:07:23,120 --> 00:07:26,560
one click prompt easy, so wrong.

122
00:07:26,560 --> 00:07:27,200
Not right now.

123
00:07:27,200 --> 00:07:35,760
No, it is so much work, so much iteration and it costs a lot of money just to run with things.

124
00:07:35,760 --> 00:07:40,400
No one has a concept of for every frame of those little clips of James,

125
00:07:40,400 --> 00:07:44,800
if it's running at 60 frames a second, which some of those are, because they're pretty smooth,

126
00:07:44,800 --> 00:07:49,760
you know, seven second video, that's hundreds and hundreds of frames that individually need

127
00:07:49,760 --> 00:07:51,600
to be rendered and stitched together.

128
00:07:51,600 --> 00:07:57,520
Unless you have a pretty beefy GPU or paying for some cloud server.

129
00:07:57,520 --> 00:07:59,200
Or both.

130
00:07:59,200 --> 00:08:00,080
Yeah, or both.

131
00:08:00,640 --> 00:08:03,520
That takes an inordinately long amount of time.

132
00:08:03,520 --> 00:08:06,000
So talking about GPUs.

133
00:08:06,000 --> 00:08:07,920
We could definitely talk about GPUs.

134
00:08:07,920 --> 00:08:11,840
The biggest transition to that, my video I did about Midjourney,

135
00:08:11,840 --> 00:08:16,000
where founder was talking about this two futures of AI.

136
00:08:16,000 --> 00:08:17,680
Did you, I hope you watched it.

137
00:08:17,680 --> 00:08:18,240
No, don't.

138
00:08:18,240 --> 00:08:20,240
Of course, of course I watched your video.

139
00:08:20,240 --> 00:08:22,400
Yes, of course, I am a natural professional.

140
00:08:23,200 --> 00:08:30,480
That's one of the older ones, but interesting thing he had this mention of GPUs and he was

141
00:08:30,480 --> 00:08:32,320
saying there's two realities.

142
00:08:32,320 --> 00:08:39,200
One is that to scale anything, we are just bound by physics as a first principle, just as a data

143
00:08:39,200 --> 00:08:43,200
centers and chips that we can produce physically.

144
00:08:43,200 --> 00:08:45,520
And it will take us seven years to scale.

145
00:08:45,520 --> 00:08:49,680
And that was a, that is a reason why Midjourney is not marketed.

146
00:08:49,680 --> 00:08:54,400
You are not going to see ads for Midjourney or like a, hey, there is an app for Midjourney

147
00:08:54,400 --> 00:08:58,880
because there's simply no ability to scale to sustain such user base.

148
00:08:58,880 --> 00:09:05,120
But when he talked about different path, if we invent new chips or new abilities or integrate

149
00:09:05,120 --> 00:09:08,480
neural nets and chips, it's out of my domain knowledge.

150
00:09:08,480 --> 00:09:12,720
But, and then he said that if this happens, then we scale in a year.

151
00:09:12,720 --> 00:09:13,120
Yeah.

152
00:09:13,120 --> 00:09:16,720
So think what the craziness what's happening right now.

153
00:09:16,720 --> 00:09:22,240
And then if there is next technological breakthrough in the physical department, the

154
00:09:22,240 --> 00:09:22,800
hardware.

155
00:09:23,680 --> 00:09:26,320
Well, speaking of hardware, let's talk.

156
00:09:26,320 --> 00:09:26,880
Yeah.

157
00:09:26,880 --> 00:09:31,040
Since we're talking chips and it is kind of wild to think about like, you know, that's why GPT-4

158
00:09:31,040 --> 00:09:34,080
is constrained in terms of like how fast it generates.

159
00:09:34,080 --> 00:09:35,920
It's literally the compute.

160
00:09:35,920 --> 00:09:40,960
So what's starting to take place is a lot of these big companies are completely rethinking

161
00:09:40,960 --> 00:09:42,400
how compute happens.

162
00:09:42,960 --> 00:09:46,560
You know, I'm sure people, while they may not know the ins and outs of it,

163
00:09:46,560 --> 00:09:52,240
you can conceptualize that all of our stuff, our websites, the apps be like clouds.

164
00:09:52,240 --> 00:09:54,240
They're running in these big data centers.

165
00:09:54,240 --> 00:09:58,640
These data centers are power hungry and inside of each one of these servers is essentially

166
00:09:58,640 --> 00:10:02,880
a different type of computer that is CPU centric, right?

167
00:10:02,880 --> 00:10:05,360
It has a central processing unit.

168
00:10:05,360 --> 00:10:11,040
So one of the things that gets inefficient about these is they spend as much time kind

169
00:10:11,040 --> 00:10:16,560
of moving around data as they do actually like computing, like doing computations.

170
00:10:16,560 --> 00:10:21,680
So that tends to tax and take longer for processes to take place.

171
00:10:21,680 --> 00:10:26,640
But with the onset of some of these generative AI technologies,

172
00:10:26,640 --> 00:10:32,160
Nvidia, who's the world's leading chip maker, has just put out, it's totally worth the YouTube

173
00:10:32,160 --> 00:10:32,560
watch.

174
00:10:32,560 --> 00:10:36,880
Even if you don't geek out a bunch of this hardware stuff, it is pretty incredible in

175
00:10:36,880 --> 00:10:38,480
terms of some of the things they demo.

176
00:10:38,480 --> 00:10:44,000
So this is the Grace Hopper AI supercomputer.

177
00:10:44,000 --> 00:10:52,400
It's basically a singular GPU, the size of a school bus, 150 miles of fiber optic cable

178
00:10:52,400 --> 00:10:53,200
inside of it.

179
00:10:53,200 --> 00:10:59,200
But this singular row of servers, granted it's a lot of servers, are GPU based.

180
00:10:59,200 --> 00:11:06,160
So it uses something to the tune of like a 44th of the amount of power that to achieve

181
00:11:06,160 --> 00:11:12,240
the same level of compute as this GPU server would, it would need 45 times more power in

182
00:11:12,240 --> 00:11:14,000
our conventional servers right now.

183
00:11:14,000 --> 00:11:19,200
So you can get way more cost effective bang for your buck, so to speak, train a generative

184
00:11:19,200 --> 00:11:25,120
image model, train a language model in fractions of the time, just because of how efficiently

185
00:11:25,120 --> 00:11:26,080
this can handle.

186
00:11:26,080 --> 00:11:32,320
Like this is a 140 terabytes of GPU compute power that is all running in series.

187
00:11:32,320 --> 00:11:34,080
So it works as one.

188
00:11:34,080 --> 00:11:38,320
These are the kind of things that when all these data center transformations happen,

189
00:11:38,320 --> 00:11:43,200
not only will we be able to use AI like we want to, these are the kind of things that

190
00:11:43,200 --> 00:11:48,800
you need to say, okay, computer, supercomputer, we've got 20 of these chained together, please

191
00:11:48,800 --> 00:11:50,320
cure all diseases for us.

192
00:11:50,320 --> 00:11:56,000
And they just can start processing these huge, inconceivably large problems.

193
00:11:56,000 --> 00:11:57,680
I was listening to you.

194
00:11:57,680 --> 00:12:02,320
This is fascinating, but I was checking the stock price of Nvidia.

195
00:12:02,320 --> 00:12:04,240
It definitely shot up.

196
00:12:04,240 --> 00:12:09,360
And what's the sad part is yesterday, I was like, I need to buy some shares of Nvidia.

197
00:12:09,360 --> 00:12:13,200
And then something happened and this is something because it's on mobile.

198
00:12:13,200 --> 00:12:15,760
So it's almost like, oh yeah, let me know.

199
00:12:15,760 --> 00:12:17,040
Now it sounds bad.

200
00:12:17,040 --> 00:12:21,360
It's not like you are just casually buying shares, but I think Nvidia is the place

201
00:12:21,360 --> 00:12:22,640
where it makes sense.

202
00:12:22,640 --> 00:12:30,880
They already control something like 85% of the GPU chip market, which is dangerous.

203
00:12:30,880 --> 00:12:32,480
It is, it totally is.

204
00:12:32,480 --> 00:12:36,160
And I think there's going to be antitrust stuff coming down the pike eventually that,

205
00:12:36,160 --> 00:12:40,480
you know, Microsoft had to face the same thing when they were, you know, this huge software

206
00:12:40,480 --> 00:12:43,040
company in the late nineties, early two thousands.

207
00:12:43,040 --> 00:12:47,280
Yeah, you're going to have this AI arms race that no doubt is going to emerge between these

208
00:12:47,280 --> 00:12:53,280
big players like Microsoft, like Google, like Meta, but they all still need this compute.

209
00:12:53,280 --> 00:12:59,520
So this huge new GPU-centric server that Nvidia's invented and put together, it has

210
00:12:59,520 --> 00:13:02,720
something like six of these chips inside of each server blade.

211
00:13:02,720 --> 00:13:07,840
The chips cost $200,000 each and there's over a thousand server blades inside of one of

212
00:13:07,840 --> 00:13:09,600
these exaflop machines.

213
00:13:09,600 --> 00:13:14,560
So that's not something that anybody other than like Fortune 50, Fortune 500 companies

214
00:13:14,560 --> 00:13:17,600
can even afford to have access to at this point.

215
00:13:17,600 --> 00:13:22,400
But considering we've already blown Moore's law out of the water with some of the, some

216
00:13:22,400 --> 00:13:27,200
of the level of innovations that are happening, you know, it's only going to get cheaper.

217
00:13:27,200 --> 00:13:33,600
I remember I saw this graph learning from history is always a good way to go and was

218
00:13:33,600 --> 00:13:38,560
looking at what happened with semiconductors and SaaS companies, like, you know, big tech

219
00:13:38,560 --> 00:13:43,760
that historically SaaS companies are the ones who win and not the hardware.

220
00:13:43,760 --> 00:13:44,560
Yeah.

221
00:13:44,560 --> 00:13:50,240
But I think the Nvidia case is interesting because as much as they are hardware controlling

222
00:13:50,240 --> 00:13:55,520
market, they are coming with their large language models like NeMo, right?

223
00:13:55,520 --> 00:14:02,960
So it's because we can really attack from all sides and do all this collaborations with

224
00:14:02,960 --> 00:14:04,000
the companies.

225
00:14:04,640 --> 00:14:11,520
It is even like scary to think when we live in a world where all compute is in one company's

226
00:14:11,520 --> 00:14:12,080
hand.

227
00:14:12,080 --> 00:14:13,120
Totally crazy thing.

228
00:14:13,120 --> 00:14:15,440
But I got two computers right here in front of me.

229
00:14:15,440 --> 00:14:17,600
I've got Nvidia GPUs in it.

230
00:14:17,600 --> 00:14:18,400
Right.

231
00:14:18,400 --> 00:14:20,480
Everything else is different and diversified.

232
00:14:20,480 --> 00:14:22,480
The GPUs are both Nvidia.

233
00:14:22,480 --> 00:14:23,280
That's interesting.

234
00:14:23,280 --> 00:14:24,400
I just got Mac.

235
00:14:24,400 --> 00:14:30,080
So remember we were talking about if to get Nvidia and this one has M2 and I'm very happy

236
00:14:30,080 --> 00:14:30,720
with it.

237
00:14:30,720 --> 00:14:35,600
But yeah, that's another thing like maybe Apple, especially with their, you know, Vision

238
00:14:35,600 --> 00:14:39,680
Pro and how much compute that but that's completely two ball games.

239
00:14:39,680 --> 00:14:43,280
Nvidia is on the whole other racetrack.

240
00:14:43,280 --> 00:14:50,640
And what's always made Mac run so efficiently is since they are developing the silicon and

241
00:14:50,640 --> 00:14:56,000
the software in that case, they are able to develop processes, system architectures that

242
00:14:56,000 --> 00:14:58,880
run super efficiently on their chips.

243
00:14:58,880 --> 00:15:02,080
So they're able to kind of optimize both things.

244
00:15:02,640 --> 00:15:09,120
You know, so in many instances, an M2 chip does some of the same things that a big beefy

245
00:15:09,120 --> 00:15:12,400
GPU could do for machine learning purposes.

246
00:15:12,400 --> 00:15:18,000
But it's only going to enable more creativity, more problems being solved.

247
00:15:18,000 --> 00:15:19,600
It's an exciting thing to think about.

248
00:15:19,600 --> 00:15:22,960
You know, on this kind of mass global power.

249
00:15:23,760 --> 00:15:29,680
Now with Sam Altman and, you know, the leaders of AI labs, Anthropic, Claude, they're on this

250
00:15:29,680 --> 00:15:32,000
world tour for some of them.

251
00:15:32,000 --> 00:15:38,320
And you see these pictures and every time we see it, Sam Altman meeting some leaders

252
00:15:38,320 --> 00:15:42,640
or presidents or prime ministers, it looks like he's a president now.

253
00:15:42,640 --> 00:15:44,720
Didn't it cost you?

254
00:15:44,720 --> 00:15:46,560
He wields a lot of power.

255
00:15:46,560 --> 00:15:51,200
Yeah, and when you look at this picture, you're like, who in this picture is actually the

256
00:15:51,200 --> 00:15:52,640
man of power now?

257
00:15:52,640 --> 00:15:53,280
Yeah.

258
00:15:53,280 --> 00:15:58,400
Speaking of power, one of the things that is an interesting kind of discussion point

259
00:15:58,400 --> 00:16:03,040
as we're talking about more efficient GPUs and chips and all the things that run behind

260
00:16:03,040 --> 00:16:08,720
the scenes to enable these really wonderful AI technological innovations.

261
00:16:08,720 --> 00:16:12,240
I didn't realize this week at the Good Castle came out one of the talks that

262
00:16:12,240 --> 00:16:18,160
Sam Altman's a big investor into fusion power research and he's even talked about it

263
00:16:18,160 --> 00:16:18,640
a little bit.

264
00:16:18,640 --> 00:16:22,960
Some of the recent talks this week, we'll link a few in the description on the newsletter,

265
00:16:22,960 --> 00:16:28,640
talked about how to achieve things like GPT-5, 6, 7, where you're going to need just so much compute,

266
00:16:28,640 --> 00:16:31,840
not only to train, but then to run it, let everybody run it.

267
00:16:31,840 --> 00:16:35,200
It's going to need fusion power to achieve.

268
00:16:35,200 --> 00:16:39,600
So, you know, getting to a point where it's a tenth of the cost of any energy and then

269
00:16:39,600 --> 00:16:44,800
we can make enough of them to go around the world is pretty incredible prospect.

270
00:16:44,800 --> 00:16:50,080
And, you know, it's been a promise of, I've heard it my whole life, fusion's 30 years away

271
00:16:50,080 --> 00:16:52,640
or 40 years away, you know, it's going to happen in the future.

272
00:16:52,640 --> 00:16:56,560
But a lot of things that I hear when it comes up, it's like five years away.

273
00:16:56,560 --> 00:16:57,200
Right.

274
00:16:57,200 --> 00:16:59,440
It kind of gives me an idea of it.

275
00:16:59,440 --> 00:17:03,120
What is Sam Altman not part of or invested in?

276
00:17:03,120 --> 00:17:07,520
Worldcoin, like scanning your eyeballs, universal basic income.

277
00:17:07,520 --> 00:17:10,320
And probably list goes on.

278
00:17:10,320 --> 00:17:11,680
It's incredible.

279
00:17:11,680 --> 00:17:14,080
The power dynamics are very interesting.

280
00:17:14,080 --> 00:17:17,920
He'll be our main ambassador to the AI overlords later on.

281
00:17:17,920 --> 00:17:23,520
But on the flip side, there is, you know, again, if we look at historically first movers

282
00:17:23,520 --> 00:17:29,120
in the market are not necessarily the ones leading the market at the end of the day.

283
00:17:29,120 --> 00:17:34,320
I'm kind of always running with that thought, but if it's not OpenAI, who is coming?

284
00:17:34,320 --> 00:17:34,960
Yeah.

285
00:17:34,960 --> 00:17:40,720
I will say though, just having, you know, worked and been exploring the space for six months.

286
00:17:40,720 --> 00:17:43,520
There is something about the first mover advantage.

287
00:17:43,520 --> 00:17:48,480
That's a little just kind of flipped on its head with some, some of these things,

288
00:17:48,480 --> 00:17:52,480
because people are inventing and creating so much and stitching these AI technologies

289
00:17:52,480 --> 00:17:56,320
together to form innovative new products and solutions.

290
00:17:56,320 --> 00:18:02,320
But at the end of the day, it's to solve very similar problems that people can kind of conceive

291
00:18:02,320 --> 00:18:05,200
of themselves that can go, Oh, I see that.

292
00:18:05,200 --> 00:18:09,840
I see that they're not using any, all using the same technology, all using the same language

293
00:18:09,840 --> 00:18:10,640
models.

294
00:18:10,640 --> 00:18:11,040
Okay.

295
00:18:11,040 --> 00:18:13,920
So it's really not a game about whose tech is better.

296
00:18:13,920 --> 00:18:15,440
It's about who can market it better.

297
00:18:15,440 --> 00:18:21,280
So in a lot of instances I've seen, you know, just as long as you're kind of there to the

298
00:18:21,280 --> 00:18:26,320
table first with an idea that might be enough in some instances.

299
00:18:26,320 --> 00:18:29,120
I'll give you an example as it relates to prompt engineering.

300
00:18:29,120 --> 00:18:31,280
You know, I happened to start selling products.

301
00:18:31,280 --> 00:18:35,680
I happened to start selling some prompts on PromptBase earlier this year.

302
00:18:35,680 --> 00:18:36,400
Right.

303
00:18:36,400 --> 00:18:41,760
It really didn't start taking off till like March or April, but what that's done because

304
00:18:41,760 --> 00:18:44,960
of things like Amazon and how we online shop before.

305
00:18:44,960 --> 00:18:45,760
Well, what do you do?

306
00:18:45,760 --> 00:18:48,720
The first thing you do when you get on Amazon, Hey, you're looking for a toothbrush.

307
00:18:48,720 --> 00:18:52,080
Hey, let me sort by the five stars with the most sales.

308
00:18:52,080 --> 00:18:52,880
Right.

309
00:18:52,880 --> 00:18:54,000
That's just like our condition.

310
00:18:54,000 --> 00:18:55,920
I want the best thing that's out there.

311
00:18:55,920 --> 00:18:57,200
I do that all the time.

312
00:18:57,200 --> 00:18:58,320
Read reviews.

313
00:18:58,320 --> 00:18:59,200
Right.

314
00:18:59,200 --> 00:19:04,240
Something like this, like I feel like in some instances I've unfairly benefited from that

315
00:19:04,240 --> 00:19:09,360
just because I had plenty of sales before this started taking off a little bit more.

316
00:19:09,920 --> 00:19:11,200
And what do people do?

317
00:19:11,200 --> 00:19:13,360
Hey, what's the top selling in this category?

318
00:19:13,360 --> 00:19:14,160
All right.

319
00:19:14,160 --> 00:19:19,680
You know, by virtue of being to the table first, that probably benefits me, you know,

320
00:19:19,680 --> 00:19:22,720
in that instance, it doesn't, it doesn't for other things.

321
00:19:22,720 --> 00:19:26,560
And I would hope that the creators of PromptBase or some of these marketplaces do things

322
00:19:26,560 --> 00:19:30,160
to encourage new creators to put themselves out there.

323
00:19:30,160 --> 00:19:35,680
That makes me think I had this dilemma when I made the first videos going about

324
00:19:35,680 --> 00:19:38,880
ChatGPT and learn prompting and prompt engineering.

325
00:19:38,880 --> 00:19:44,640
And I remember there was this moment I just made a video about how ChatGPT works

326
00:19:44,640 --> 00:19:46,640
and explaining, you know, the model behind it.

327
00:19:46,640 --> 00:19:48,160
And it was 3.5.

328
00:19:48,160 --> 00:19:51,680
I released this video, and in a week and a half, there is GPT-4.

329
00:19:51,680 --> 00:19:53,680
And I was like, oh, okay.

330
00:19:53,680 --> 00:19:58,080
So if I make a video about something which is happening, and this is the thing why

331
00:19:58,080 --> 00:20:03,680
I don't like to go news direction because our video is not relevant next week.

332
00:20:03,680 --> 00:20:04,000
Yeah.

333
00:20:04,000 --> 00:20:08,000
But, and I was thinking at the time, like, oh, I need to really, you know,

334
00:20:08,000 --> 00:20:11,520
think about videos which live long, very meaningful.

335
00:20:12,320 --> 00:20:18,400
The thing is that that video plus prompt engineering are getting views every single day.

336
00:20:19,040 --> 00:20:20,000
Like the most.

337
00:20:20,000 --> 00:20:24,560
And I'm like, in my world, I'm like, everyone knows already this.

338
00:20:24,560 --> 00:20:28,800
And I still read comments from people like, oh, I just discovered this.

339
00:20:28,800 --> 00:20:29,280
Wow.

340
00:20:29,280 --> 00:20:30,320
I didn't know this.

341
00:20:30,320 --> 00:20:34,480
I'm like, it feels like not three months, but three years ago.

342
00:20:34,480 --> 00:20:38,640
And that just makes me think this is why, you know, I was sharing this story,

343
00:20:38,640 --> 00:20:45,520
bringing these AI tools outside of the bubble and actually getting real world perspective.

344
00:20:45,520 --> 00:20:51,040
You know, you have to see how the real world actually adopts these things.

345
00:20:51,040 --> 00:20:53,360
It's still, still trickling out there.

346
00:20:53,360 --> 00:20:53,840
Yeah.

347
00:20:53,840 --> 00:20:59,920
And I think with ChatGPT, when I was doing, you know, the, the keynote in embassy,

348
00:20:59,920 --> 00:21:03,600
three people came to me saying, oh yeah, yeah, I know it.

349
00:21:03,600 --> 00:21:07,280
I tried it, but I'm so disappointed with the results.

350
00:21:07,280 --> 00:21:08,480
Just so basic.

351
00:21:08,480 --> 00:21:09,040
Yep.

352
00:21:09,040 --> 00:21:13,840
And after my presentation, they're like, oh, so it can not like, you can actually

353
00:21:13,840 --> 00:21:18,000
control it and make it interesting and it can write in different things.

354
00:21:18,000 --> 00:21:25,760
And I was like, to me, it was like, yeah, of course, but people had tried and left it off.

355
00:21:25,760 --> 00:21:28,240
And I think it also happens with image tools.

356
00:21:28,240 --> 00:21:30,800
You can get very quickly discouraged.

357
00:21:30,800 --> 00:21:35,280
And I think this learning prompt engineering, that's why I think it's relevant and it's

358
00:21:35,280 --> 00:21:36,320
going to be relevant.

359
00:21:36,320 --> 00:21:36,960
Definitely.

360
00:21:36,960 --> 00:21:37,280
Yeah.

361
00:21:37,280 --> 00:21:41,760
I mean, it kind of boggles my mind that some people don't necessarily see the

362
00:21:41,760 --> 00:21:42,800
possibilities of it.

363
00:21:42,800 --> 00:21:47,360
But I also, people come at this with different needs and different expectations, you know,

364
00:21:47,360 --> 00:21:52,000
like obviously I'm coming at it from someone who's an avid consumer of anything

365
00:21:52,000 --> 00:21:53,520
electronic or technological.

366
00:21:53,520 --> 00:21:55,520
Does one need eight keyboards?

367
00:21:55,520 --> 00:21:58,000
My wife would say no, but I say otherwise.

368
00:21:58,000 --> 00:21:58,400
Yeah.

369
00:21:58,400 --> 00:22:01,600
I mean, I remember the first time my dad tried ChatGPT.

370
00:22:01,600 --> 00:22:03,920
He's like, yeah, I put myself into it.

371
00:22:03,920 --> 00:22:05,920
Came up, didn't know anything about me.

372
00:22:05,920 --> 00:22:07,120
It's kind of milquetoast.

373
00:22:07,120 --> 00:22:11,520
And I'm like, this is not how large language models work.

374
00:22:11,520 --> 00:22:16,800
Let me tell you about how it's only trained up till September 2021 and about hallucinations,

375
00:22:16,800 --> 00:22:19,440
you know, so basic things you need to know.

376
00:22:19,440 --> 00:22:21,920
But like, but that's not on the front page at all.

377
00:22:21,920 --> 00:22:25,440
No, it's at the bottom, tiny letters.

378
00:22:25,440 --> 00:22:25,920
Yeah.

379
00:22:25,920 --> 00:22:27,760
This might be totally made up.

380
00:22:27,760 --> 00:22:28,240
Yeah.

381
00:22:28,240 --> 00:22:31,360
Not factual, wrong facts information.

382
00:22:31,360 --> 00:22:32,160
Oh yeah.

383
00:22:32,160 --> 00:22:33,440
That's the whole other thing.

384
00:22:33,440 --> 00:22:38,880
You know, we already talked about this, that there is certain level of responsibility when

385
00:22:38,880 --> 00:22:41,680
you drop a product like that to a mass market.

386
00:22:41,680 --> 00:22:42,080
Yeah.

387
00:22:42,080 --> 00:22:46,480
No information, what it is, how it is, nothing.

388
00:22:46,480 --> 00:22:48,720
It's just like, Hey, chatbot, that's it.

389
00:22:48,720 --> 00:22:54,800
But I think some of it is like, hey, let's let people play around with it and

390
00:22:54,800 --> 00:22:59,840
explore while the stakes are low, as opposed to have this AGI that's ready.

391
00:22:59,840 --> 00:23:03,840
And then it just, they drop it on us and there's this huge shift that needs to happen.

392
00:23:03,840 --> 00:23:08,640
There's probably some value in it slowly, gradually, but it's not like,

393
00:23:08,640 --> 00:23:14,560
granted how slowly can you say it's still the fastest growing app or service in history.

394
00:23:15,120 --> 00:23:20,480
But, you know, it's still probably some value in letting it kind of trickle down.

395
00:23:20,480 --> 00:23:20,800
Yeah.

396
00:23:20,800 --> 00:23:22,320
Who values from that?

397
00:23:22,320 --> 00:23:27,760
I mean, it's probably the altruism of our great ambassador overlord, Sam Altman,

398
00:23:28,560 --> 00:23:30,800
you know, and those kind of investors.

399
00:23:31,360 --> 00:23:33,680
I hope people learned one thing.

400
00:23:33,680 --> 00:23:35,520
Nothing on internet is free.

401
00:23:36,080 --> 00:23:36,400
Yeah.

402
00:23:36,400 --> 00:23:39,360
So if there is something free, it's not free.

403
00:23:39,360 --> 00:23:42,080
Like you are trading off different value.

404
00:23:42,080 --> 00:23:48,640
And I worked in big data analytics company, the amount people pay for data and especially

405
00:23:48,640 --> 00:23:51,840
investment banks, it's insane.

406
00:23:52,320 --> 00:23:54,560
It's millions upon millions.

407
00:23:55,120 --> 00:24:02,320
So I think that's what people don't realize that that just the volume of data from this type of

408
00:24:02,320 --> 00:24:08,960
usage, but what you said, it's a number one, like most trending website or something like that.

409
00:24:09,680 --> 00:24:11,120
Literally vertical growth.

410
00:24:11,120 --> 00:24:12,960
That's if you map it against anything else.

411
00:24:13,840 --> 00:24:17,200
No, but let's talk something prompting.

412
00:24:17,200 --> 00:24:17,360
Yes.

413
00:24:17,360 --> 00:24:22,000
Now that we're leading to that, I wanted to share something with you coming back to the,

414
00:24:22,640 --> 00:24:26,880
my past afterlife in architecture and design.

415
00:24:26,880 --> 00:24:27,360
Yeah.

416
00:24:27,360 --> 00:24:38,000
I saw this design piece, it's foldable sofa, which can fit in the envelope designed by AI.

417
00:24:38,000 --> 00:24:40,800
It's studio 10 in Copenhagen.

418
00:24:40,800 --> 00:24:47,520
I actually know the studio and I think I'm almost sure I've been there, but anyway,

419
00:24:47,520 --> 00:24:50,720
so it's developed closely with IKEA.

420
00:24:50,720 --> 00:24:56,160
So you see this modular system and it's around 10 kilos.

421
00:24:56,160 --> 00:25:02,160
But the interesting thing is that they went with first talking with ChatGPT, for example,

422
00:25:02,160 --> 00:25:07,200
one of them is saying, I actually wrote into ChatGPT, could a couch fit into an envelope?

423
00:25:07,200 --> 00:25:08,800
And of course it said no.

424
00:25:08,800 --> 00:25:14,080
And then we went to Midjourney and DALL-E and were spending like hundreds of prompts

425
00:25:14,080 --> 00:25:20,400
testing different words, terminology, and nothing worked really to achieve what they

426
00:25:20,400 --> 00:25:26,160
were looking for. And it was that you on a sofa, you're not front facing like to the TV,

427
00:25:26,160 --> 00:25:28,960
but you're actually sitting in front facing each other.

428
00:25:28,960 --> 00:25:35,600
And the key word which we discovered that actually led to that was conversation pit.

429
00:25:36,560 --> 00:25:37,520
Conversation pit.

430
00:25:38,080 --> 00:25:40,400
Right. So now we build it.

431
00:25:40,400 --> 00:25:47,520
So I would think that something like that is product of a machine learning or a labeling AI

432
00:25:47,520 --> 00:25:54,400
where it takes in tons of different designs and analyzes it and goes, this is the most ergonomic

433
00:25:54,400 --> 00:26:00,880
way to do this, but it's from a language set, generative language model, just kind of going,

434
00:26:00,880 --> 00:26:02,400
this is what I think it would look like.

435
00:26:02,400 --> 00:26:09,440
Yeah, but generative models is actually nothing new in AI space, especially when you look in a

436
00:26:09,440 --> 00:26:15,760
parametric design. So this, like even when I was doing my bachelor thesis, I was actually using

437
00:26:15,760 --> 00:26:22,080
these Grasshopper plugins in Rhino to do parametric design. And then you can play with some

438
00:26:22,080 --> 00:26:28,400
generative design pieces, but the diffusion models is different. That's something new.

439
00:26:28,400 --> 00:26:34,720
And the generative AI, you have, in a way you have more control because it's like more like

440
00:26:34,720 --> 00:26:38,720
real database. And I could be completely wrong. Like this is just, you know,

441
00:26:38,720 --> 00:26:45,840
I'm kind of surface sliding through this topic right now, but diffusion models in architecture

442
00:26:45,840 --> 00:26:52,720
for render amazing. But then you want to real design vector based because, you know, you need

443
00:26:53,360 --> 00:26:59,680
centimeter millimeter control of the dimensions. That's kind of a different realm. I'm keeping a

444
00:26:59,680 --> 00:27:05,600
close eye on what's happening in architecture space design, because, you know, imagine like

445
00:27:05,600 --> 00:27:09,200
if it goes that route explosion, another explosion.

446
00:27:09,200 --> 00:27:15,840
There's, there's a huge market. I've seen some things for people exploring it from prompting to

447
00:27:15,840 --> 00:27:22,320
some 3d design prompting to vectors. No one has cracked it yet, but you know, there's a few things

448
00:27:22,320 --> 00:27:26,240
I think we brought up on the podcast before we have people that are doing generative AI via

449
00:27:26,240 --> 00:27:32,000
prompting to create assets for computer games that can be used inside of Unreal Engine 5.

450
00:27:32,000 --> 00:27:37,760
I'll try to see if I can find them right now, but I know there's a company that does text prompt to

451
00:27:37,760 --> 00:27:43,760
3D STL model. So you can basically just type in a prompt and then have what you would need to

452
00:27:43,760 --> 00:27:48,720
3D print it, you know, throw it into the slicer and then, you know, put it through.

453
00:27:49,440 --> 00:27:56,320
Leonardo AI for anybody looking for a game development, like what the guys behind it,

454
00:27:56,320 --> 00:28:01,280
like I think I told you about Ethan, one of the co-founders, we were, you know, back in the days

455
00:28:01,280 --> 00:28:07,200
before Leonardo, we were just geeking out in the discord channels. It's just incredible. It's,

456
00:28:07,200 --> 00:28:14,640
I think they are reaching millions of users now in such short time. And that also makes me think

457
00:28:15,280 --> 00:28:22,880
a year ago, I made this video, which is who's going to build in the metaverse, AI or architects.

458
00:28:22,880 --> 00:28:28,480
And in that video, I made completely like just, you know, going down the rabbit hole of this idea.

459
00:28:28,480 --> 00:28:36,560
I was like, in the future, you just prompt and say, I want a desert with flying buildings made

460
00:28:36,560 --> 00:28:44,400
of diamonds, wherever space and boom, AI generates that 3d environment space for you. And now a year

461
00:28:44,400 --> 00:28:51,920
later, we have VR spaces generated by AI. That's all coming down the pike. Back then, I remember when

462
00:28:51,920 --> 00:28:57,520
I wrote it, I was like, Oh, probably people will be like, what she's talking about. It's crazy.

463
00:28:57,520 --> 00:29:04,800
Yeah. Well, I mean, every day, like I said, it must be what it felt like to see an airplane

464
00:29:04,800 --> 00:29:10,400
flying in the sky for the first time, something so divergent from your experiences up until this date.

465
00:29:10,400 --> 00:29:15,440
But like speaking of things that we haven't seen before, you know, a natural progression,

466
00:29:15,440 --> 00:29:19,840
we've talked about some text to video, we've talked about some of our friends that are,

467
00:29:19,840 --> 00:29:24,960
you know, some luminaries and thought leaders in the space. How long do you think it's going to be

468
00:29:24,960 --> 00:29:34,560
until we see a full on prompted movie? I think it's in the workings right now. Definitely. I mean,

469
00:29:34,560 --> 00:29:40,400
so this was set up question. You're like, come on, I got an agenda here. We gotta, we just gotta

470
00:29:40,400 --> 00:29:45,840
try to be like, you know, professional podcasters here. Yeah. Okay. So I was more thinking about

471
00:29:45,840 --> 00:29:54,320
like full length cinema, but of course this piece, The Frost, is already like, takes you a step

472
00:29:54,320 --> 00:30:00,640
back to think what is the future of animation and not even animation. You know, I think, look,

473
00:30:00,640 --> 00:30:05,280
make no mistake. There's still some uncanny valley. Our brains are so used to looking at

474
00:30:05,280 --> 00:30:11,600
images of, you know, not images, but just looking at other people. So anytime their face or hands

475
00:30:11,600 --> 00:30:16,480
are doing something that's not natural to what we're just used to seeing all the time, we notice it.

476
00:30:16,480 --> 00:30:22,800
So that's definitely apparent in these types of things too, but this is a 13 minute completely

477
00:30:22,800 --> 00:30:28,560
generated, you know, AI movie and we're going to include the link to it, but you know, it,

478
00:30:28,560 --> 00:30:34,800
you could follow the narrative structure easily. That's not lost at all. And in some ways, you know,

479
00:30:34,800 --> 00:30:39,280
some of the choices they've made because of the current technological limitations could just be

480
00:30:39,280 --> 00:30:45,120
stylistic ones that, you know, might be said, Hey, we made this deliberately, this kind of blocky

481
00:30:45,120 --> 00:30:50,640
chunky kind of movements and stuff like that. I'd still say like half of these shots, I think the,

482
00:30:50,640 --> 00:30:56,880
the human ones admittedly are hard, but like half of them are still like ones that, you know, I,

483
00:30:56,880 --> 00:31:00,960
I'd be like, all right, that'd be something I'd see in a movie.

484
00:31:00,960 --> 00:31:05,120
I want to kind of take the different camp on this one.

485
00:31:05,120 --> 00:31:05,600
Please do.

486
00:31:06,160 --> 00:31:13,360
Especially as a content creator and editor of my own videos, when you see the composition,

487
00:31:13,360 --> 00:31:20,400
the color grading, the transitions, because I, it was edited by a human, the creative decisions

488
00:31:20,400 --> 00:31:26,720
of the storytelling of the scenes, that central composition leading lines, all this matters

489
00:31:27,280 --> 00:31:34,800
were still decided by a human. Or if you tell me that this was completely just a prompt in the

490
00:31:34,800 --> 00:31:43,040
composition, the mood, like all these things are deliberately there, the color theory, and it's

491
00:31:43,040 --> 00:31:49,520
human who makes those decisions. But of course, for example, Midjourney makes incredible images

492
00:31:49,520 --> 00:31:55,680
and you can see this composition decisions just adapted. But do you think in this case was it too?

493
00:31:55,680 --> 00:32:01,120
I mean, I think it probably was in a lot of instances, but you know, you're absolutely

494
00:32:01,120 --> 00:32:05,760
right. The true power here is when you have someone with expertise and creativity,

495
00:32:05,760 --> 00:32:11,120
you know, working alongside AI. Right now, some of these things like this is not,

496
00:32:11,120 --> 00:32:16,240
they wouldn't pass for a Hollywood movie at this point, but it's not too far off.

497
00:32:16,240 --> 00:32:20,720
You know, it looks kind of like 80s or 90s special effects in some degree.

498
00:32:20,720 --> 00:32:27,200
So I'm not saying that it's not good, but I still want to think that final creative decisions

499
00:32:27,200 --> 00:32:35,440
are going to be made by a human. Or maybe if we go like what I did a year ago, and we say that,

500
00:32:35,440 --> 00:32:40,880
okay, in a year or two, the whole movies are going to be generated by AI. I think in Black

501
00:32:40,880 --> 00:32:48,320
Mirror episode, they teased this moment where you're watching kind of Netflix and seeing

502
00:32:48,320 --> 00:32:53,680
representation of yourself. And the character is like, oh, it's like, she's just like me talks like

503
00:32:53,680 --> 00:33:00,560
me looks like me. So where did they get this data from? Well, kind of in the same vein as, you know,

504
00:33:00,560 --> 00:33:07,440
what we're discussing here about creativity, and still needing to have that final creative decision.

505
00:33:07,440 --> 00:33:12,560
One thing I don't think we've brought up on the pod yet is we've touched on it a little bit in terms

506
00:33:12,560 --> 00:33:18,720
of some of the legal implications for these generative models, and what they've been trained on.

507
00:33:18,720 --> 00:33:23,520
And if folks aren't aware, there's, you know, it's going to be a fight in a lot of different ways.

508
00:33:23,520 --> 00:33:29,280
And the first one that's going to be waging the war, joining the battle, is Stable Diffusion.

509
00:33:29,280 --> 00:33:33,040
This article right here is back from February, but just in the past week,

510
00:33:33,040 --> 00:33:38,320
they released that this is going to trial. But I mean, Getty Images suing Stable Diffusion for

511
00:33:39,200 --> 00:33:47,280
the landmark amount of 1.8 trillion, the largest lawsuit ever for the 12 million images that they

512
00:33:47,280 --> 00:33:53,040
claim were in the sample database. And they're claiming damages of 150 grand per image.

513
00:33:53,680 --> 00:34:00,240
Yeah. And oh my God, we can dive into this because I was looking into this training data model,

514
00:34:00,240 --> 00:34:06,800
which was Stable Diffusion, but it's just LAION, it's called. And we will include in the newsletter,

515
00:34:06,800 --> 00:34:13,120
there is a link where you can actually upload and check if what is the chance that your image

516
00:34:13,120 --> 00:34:20,160
was used in the training data, but the amounts are staggering. And it's known that they did use

517
00:34:20,720 --> 00:34:26,960
this data. And it's just interesting that I read somewhere that Shutterstock and Getty

518
00:34:26,960 --> 00:34:34,480
Images were striking a deal with these to get paid. And it just looks to me like that deal maybe

519
00:34:34,480 --> 00:34:40,880
didn't go through. And with the sentiment from art community, they just went full on. If you go in a

520
00:34:40,880 --> 00:34:47,440
law case and you say like, yeah, none of the images, new produced images actually have exactly,

521
00:34:47,440 --> 00:34:53,920
exactly copy of your image. And it's a tricky argument they are using, you know, but there's

522
00:34:53,920 --> 00:35:00,880
an easy solution to that. What's that? Just develop like a coordinate system or metadata system or

523
00:35:00,880 --> 00:35:07,600
like, you know, licensing or copyright thing in the image and just define the radius that okay,

524
00:35:07,600 --> 00:35:14,000
there is a certain spectrum, everything is packed from nothing is like just true or not. So if my

525
00:35:14,000 --> 00:35:22,240
image and the spectrum, the images get close to my style, to, to my original image, I get a royalty.

526
00:35:22,240 --> 00:35:29,440
I mean, free open source model. It's not quite that simple because no, when you look at that,

527
00:35:29,440 --> 00:35:35,440
when you look at that globe there, right? That is an undirected network. What that means in network

528
00:35:35,440 --> 00:35:41,520
science context. It's not like you start with one thing and it makes these decisions linearly.

529
00:35:41,520 --> 00:35:48,000
The way these diffusion models work is they not only can identify what images are, they understand

530
00:35:48,000 --> 00:35:54,240
how they interact with each other. So it won't necessarily be this linear decision of, okay,

531
00:35:54,240 --> 00:35:58,240
we use a little bit from this sample image, then went over here and use some of this and came back

532
00:35:58,240 --> 00:36:02,960
to this, this neighborhood of cluster, you know, and on the way we have to fill in some stuff that

533
00:36:02,960 --> 00:36:09,600
we, we didn't have any samples for. It's more so, it just kind of does it all at once. So I think

534
00:36:09,600 --> 00:36:17,360
that still being able to track it in a way that's methodical where you can even say that like, hey,

535
00:36:17,360 --> 00:36:24,400
75% of this is, is the same. Well, all right, let's use that argument. So say that's the case.

536
00:36:24,400 --> 00:36:30,000
If I'm not mistaken, I am no legal expert, but I found this kind of surprising in my beloved country

537
00:36:30,000 --> 00:36:36,240
to the north, Canada, it's my understanding that for using a trademark logo, for example, if you

538
00:36:36,240 --> 00:36:42,480
make one change to that logo, change one color, add something that is now a derivative and it's

539
00:36:42,480 --> 00:36:50,880
a separate thing. Something is 75% the same, but 25% different. That's a derivative. Yeah.

540
00:36:50,880 --> 00:36:58,480
And this is absolutely true. And I'm looking at another law case, which actually this artist,

541
00:36:58,480 --> 00:37:05,920
Carol Highsmith, and this was in 2016. She sued back then Getty Images for 1 billion

542
00:37:06,480 --> 00:37:11,760
because they were using her images and actually send her, this is a funny story. She didn't win,

543
00:37:11,760 --> 00:37:18,800
but the funny story is they send her a bill for license, but she's using license of her own work

544
00:37:18,800 --> 00:37:24,880
illegally. And this is how she found out that they've been stealing her work for years and

545
00:37:24,880 --> 00:37:31,440
licensing and charging people. And now they tried to charge the author herself. Oops. Yeah. So what,

546
00:37:31,440 --> 00:37:39,360
so she went like 1 billion damages. Yeah. We need to have a lawyer on here. It's such

547
00:37:39,360 --> 00:37:44,080
interesting decisions. I'm hopeful, but also the people that are going to be making these decisions

548
00:37:44,080 --> 00:37:49,760
around the world are not of our generation or cohort. I mean, these are people that

549
00:37:49,760 --> 00:37:55,440
may not even be able to conceive of and understand and have an appreciation for some of these things.

550
00:37:55,440 --> 00:38:00,320
So, I mean, I don't envy them, but it's going to be very challenging kind of decisions and

551
00:38:00,320 --> 00:38:06,720
complexities to sort through. I'm just running in my mind about like all the network of legal.

552
00:38:06,720 --> 00:38:12,320
Even if I know a lawyer, I would be like, does this lawyer know the intricacies of this?

553
00:38:12,320 --> 00:38:13,520
We need an AI lawyer.

554
00:38:13,520 --> 00:38:19,840
This is so new. Yeah. It's funny. Like we have a contact in the patent department. So now I'm

555
00:38:19,840 --> 00:38:24,000
thinking that would be interesting conversation, but yeah, we'll be bringing up guests.

556
00:38:24,000 --> 00:38:30,480
We have a lot of, a lot more of that on the horizon in future weeks of filmed video,

557
00:38:30,480 --> 00:38:37,840
HTTTA. So I think with that, this is a good enough place to call it quits for the week.

558
00:38:37,840 --> 00:38:41,920
So for GoToGo, I am Wes the SynthMind saying happy prompting everybody.

559
00:38:42,560 --> 00:38:44,480
Happy prompting everybody.

560
00:38:45,680 --> 00:38:52,320
Thanks for listening to How to Talk to AI with your hosts, GoToGo and Wes the SynthMind.

561
00:38:53,200 --> 00:38:59,040
As always, you can check out the show notes and links at howtotalkto.ai.

562
00:38:59,040 --> 00:39:02,720
That's all for this week's episode. Happy prompting everyone.

