1
00:00:00,000 --> 00:00:09,960
Welcome to Artificially Intelligent Marketing, a weekly podcast where we stay on top of the

2
00:00:09,960 --> 00:00:15,700
latest trends, tips, and tools in the world of marketing AI, helping you get the best

3
00:00:15,700 --> 00:00:18,520
results from your marketing efforts.

4
00:00:18,520 --> 00:00:22,960
Now let's join our hosts, Paul Avery and Martin Broadhurst.

5
00:00:22,960 --> 00:00:23,960
Hello everyone.

6
00:00:23,960 --> 00:00:30,760
Welcome to episode 33, all the threes, 33 of Artificially Intelligent Marketing.

7
00:00:30,760 --> 00:00:36,080
I am here as always with the most wonderful co-host in the world, Martin Broadhurst.

8
00:00:36,080 --> 00:00:38,200
Martino, how are we today?

9
00:00:38,200 --> 00:00:39,200
I'm good.

10
00:00:39,200 --> 00:00:41,680
I'm sat here in my new recording studio.

11
00:00:41,680 --> 00:00:45,480
I've got my double circle football pitch behind me.

12
00:00:45,480 --> 00:00:47,200
You'll only see this if you're watching on YouTube.

13
00:00:47,200 --> 00:00:52,280
I've got my Derby County Ram, which looks like it's been put through the blender a little

14
00:00:52,280 --> 00:00:53,280
bit.

15
00:00:53,280 --> 00:00:54,280
It's really good.

16
00:00:54,280 --> 00:00:56,240
That's supposed to be a Ram, is it?

17
00:00:56,240 --> 00:00:58,600
I think it is the Derby Ram.

18
00:00:58,600 --> 00:01:04,240
I think DALL-E 3 needs a little bit of feedback on that image, see if it might improve it.

19
00:01:04,240 --> 00:01:07,760
But it's a joy to see you in such a vibrant environment, Martin.

20
00:01:07,760 --> 00:01:08,760
Yes.

21
00:01:08,760 --> 00:01:09,760
I'm all good.

22
00:01:09,760 --> 00:01:10,760
It's Friday.

23
00:01:10,760 --> 00:01:11,960
We like Fridays.

24
00:01:11,960 --> 00:01:16,360
It's Friday, 17th of November.

25
00:01:16,360 --> 00:01:20,360
Worth pointing that out because who knows what's going to happen between now and when this

26
00:01:20,360 --> 00:01:23,880
goes live given all the changes that are happening at the moment.

27
00:01:23,880 --> 00:01:27,760
Many of which we are going to cover today, Martin, because we've got so much stuff.

28
00:01:27,760 --> 00:01:33,440
We've got Runway's sort of motion painting tool, which looks really cool.

29
00:01:33,440 --> 00:01:38,960
We're going to dig into Copilot and some of Microsoft's announcements at their recent

30
00:01:38,960 --> 00:01:41,160
Ignite meeting.

31
00:01:41,160 --> 00:01:45,240
We're going to be looking at the potential postponement of Gemini from Google, which

32
00:01:45,240 --> 00:01:46,240
we've all been waiting for.

33
00:01:46,240 --> 00:01:48,440
Now it sounds like we're going to have to wait a bit longer.

34
00:01:48,440 --> 00:01:53,200
We're going to talk about the generative search experience from Google being rolled out more

35
00:01:53,200 --> 00:01:55,680
widely in other countries and what that means for marketing folks.

36
00:01:55,680 --> 00:01:59,080
We're going to talk about a bunch of stuff related to YouTube.

37
00:01:59,080 --> 00:02:07,240
But today we are going to start by digging a little bit deeper into ChatGPT and the updates

38
00:02:07,240 --> 00:02:12,960
since the developer conference from OpenAI because Martin and I and the rest of the world

39
00:02:12,960 --> 00:02:18,680
have now had a chance to see what some of those updates mean and also to play with stuff

40
00:02:18,680 --> 00:02:22,160
and see what are the changes, what's better, what's not so good.

41
00:02:22,160 --> 00:02:24,040
We've been playing with GPTs.

42
00:02:24,040 --> 00:02:27,120
Are they this amazing game changer that we all kind of hope they're going to be?

43
00:02:27,120 --> 00:02:28,120
Where are they at right now?

44
00:02:28,120 --> 00:02:30,160
So that's where we're going to start.

45
00:02:30,160 --> 00:02:35,040
So Martin, let's talk a little bit about some of these ChatGPT updates and what we've been

46
00:02:35,040 --> 00:02:37,560
experiencing in the real world.

47
00:02:37,560 --> 00:02:40,760
Yeah, well, it was a raft of them, wasn't it?

48
00:02:40,760 --> 00:02:42,760
It was a lot to keep up with.

49
00:02:42,760 --> 00:02:45,120
We had an expanded context window.

50
00:02:45,120 --> 00:02:53,440
We had the reduced costs, multimodal in one window, a new text to speech model, which

51
00:02:53,440 --> 00:02:55,960
I actually really like.

52
00:02:55,960 --> 00:03:02,280
We've got Assistants and the user-friendly, or should I say the kind of consumer interface,

53
00:03:02,280 --> 00:03:05,480
GPTs.

54
00:03:05,480 --> 00:03:07,280
And yeah, so it's been an absolute raft.

55
00:03:07,280 --> 00:03:10,040
Where do you want to start?

56
00:03:10,040 --> 00:03:14,440
Let's start with the context window because I saw that some people have tried to do some

57
00:03:14,440 --> 00:03:15,920
initial studies on this.

58
00:03:15,920 --> 00:03:18,040
So the new context window is 128,000.

59
00:03:18,040 --> 00:03:20,040
Is that right?

60
00:03:20,040 --> 00:03:21,040
128,000.

61
00:03:21,040 --> 00:03:23,200
Yeah, 128,000.

62
00:03:23,200 --> 00:03:27,080
Which for those of you that listen to the podcast and know about our love of Claude

63
00:03:27,080 --> 00:03:31,000
will know that what that means is it gives you a huge amount of information that you

64
00:03:31,000 --> 00:03:36,080
can provide to ChatGPT as part of the conversation or all in one go to be able to then leverage.

65
00:03:36,080 --> 00:03:39,520
So for example, you could feed it a 300 page book and start asking questions about the

66
00:03:39,520 --> 00:03:43,960
book, ask it to summarise the book, et cetera, et cetera, which is game changing for a lot

67
00:03:43,960 --> 00:03:44,960
of applications.

68
00:03:44,960 --> 00:03:50,360
But I saw some studies, Martin, showing that after about 70,000 tokens, its ability to

69
00:03:50,360 --> 00:03:54,600
recall the information that's been given starts to degrade.

70
00:03:54,600 --> 00:04:00,160
So it might be 128K, but it's more likely to be 70,000 usable tokens.

71
00:04:00,160 --> 00:04:01,680
I don't know if you saw that.

72
00:04:01,680 --> 00:04:02,680
Yeah.

73
00:04:02,680 --> 00:04:08,520
Is this the same study that said basically in the middle of the window actually it forgets

74
00:04:08,520 --> 00:04:09,520
things.

75
00:04:09,520 --> 00:04:13,520
So what's at the start of the context that you give it and what's at the end of the context,

76
00:04:13,520 --> 00:04:19,440
it can retrieve quite accurately, but information that's in the middle, it's a bit fuzzy and

77
00:04:19,440 --> 00:04:23,200
it's not very good at retrieving that, which actually I think they'd found with a previous

78
00:04:23,200 --> 00:04:25,040
study with Claude as well.

79
00:04:25,040 --> 00:04:26,040
Right.

80
00:04:26,040 --> 00:04:27,040
Yeah.

81
00:04:27,040 --> 00:04:28,040
That's the same study.

82
00:04:28,040 --> 00:04:34,520
And I think it's interesting, and we were talking about this a bit off air, our expectations

83
00:04:34,520 --> 00:04:41,440
of ChatGPT and large language models is quite interestingly different to software products.

84
00:04:41,440 --> 00:04:45,920
Because if someone launched a software product and they said, we can do this thing, and then

85
00:04:45,920 --> 00:04:49,640
firstly, the fact that a user would have to test it to see if it could actually do the

86
00:04:49,640 --> 00:04:51,640
thing.

87
00:04:51,640 --> 00:05:00,920
You gave an example off air about summing up a column in Excel and then checking it

88
00:05:00,920 --> 00:05:05,760
manually with a calculator to make sure it actually was able to get the answer right.

89
00:05:05,760 --> 00:05:09,400
And if it didn't, one time in five, how you'd just not use it again.

90
00:05:09,400 --> 00:05:14,880
And yet for some reason, we are quite happy to go check and validate these new capabilities

91
00:05:14,880 --> 00:05:16,120
of large language models.

92
00:05:16,120 --> 00:05:19,760
And then when they're not actually as good as we hope they would be, and they can't deliver

93
00:05:19,760 --> 00:05:23,760
on what they say, we use them and love them anyway, which is kind of interesting.

94
00:05:23,760 --> 00:05:28,720
But yeah, so I think the takeaway on the context window is it's a massive upgrade on where

95
00:05:28,720 --> 00:05:35,040
we were with ChatGPT with its original context window, but it's not as good as what we might

96
00:05:35,040 --> 00:05:38,360
believe in terms of just how much information it can handle.

97
00:05:38,360 --> 00:05:44,160
And as you said, how you structure that information could influence the results you get.

98
00:05:44,160 --> 00:05:48,360
Like if it's going to miss stuff in the middle, potentially of the documents that you ship.

99
00:05:48,360 --> 00:05:55,720
But for most users and most of the time, the expanded window is hugely beneficial, right?

100
00:05:55,720 --> 00:06:03,920
Meeting transcripts, you can stick in a meeting transcript, which would never have been possible

101
00:06:03,920 --> 00:06:05,920
in the previous context window.

102
00:06:05,920 --> 00:06:11,280
Whereas now you might throw it in and it might be 30,000 tokens and it can handle that quite

103
00:06:11,280 --> 00:06:12,280
easily.

104
00:06:12,280 --> 00:06:20,320
And I think for a lot of applications, people will be quite happy dealing with 20, 30,000

105
00:06:20,320 --> 00:06:27,120
tokens rather than going all the way up to the 128, which is nice to have.

106
00:06:27,120 --> 00:06:31,880
I think that's only available in the API at the moment, the 128.

107
00:06:31,880 --> 00:06:36,160
Yeah, so every day I've been asking ChatGPT, what's your context window?

108
00:06:36,160 --> 00:06:38,720
And it's funny because we were talking about it, I realised I haven't asked it today, I

109
00:06:38,720 --> 00:06:41,080
just asked it and it fed back 4,000 tokens.

110
00:06:41,080 --> 00:06:45,920
So it looks like, at least for me, ChatGPT Plus is still stuck at the old context length.

111
00:06:45,920 --> 00:06:54,080
Have you actually tried sticking more in because I've done quite a bit with expanded transcripts

112
00:06:54,080 --> 00:06:56,060
and it's gone beyond.

113
00:06:56,060 --> 00:06:59,000
Maybe it's hallucinating that 4,000.

114
00:06:59,000 --> 00:07:01,360
Maybe how interesting.

115
00:07:01,360 --> 00:07:03,000
Or it's like, don't test me too much.

116
00:07:03,000 --> 00:07:05,440
Maybe it's lying because it knows that it's going to be hard work.

117
00:07:05,440 --> 00:07:07,560
And it's like, it's Friday, Paul.

118
00:07:07,560 --> 00:07:09,920
Why don't we just leave it at 4,000 tokens?

119
00:07:09,920 --> 00:07:11,880
Let's just call it there, shall we?

120
00:07:11,880 --> 00:07:14,460
But it's like, actually no, I mean, I couldn't do it.

121
00:07:14,460 --> 00:07:19,960
It's like all the Ethan Mollick experiments where it seems to me that at the end of most

122
00:07:19,960 --> 00:07:24,120
of his more complicated prompts, he now just puts as standard, don't worry, you can do

123
00:07:24,120 --> 00:07:25,120
this.

124
00:07:25,120 --> 00:07:26,120
Yeah.

125
00:07:26,120 --> 00:07:30,920
I was just going to say about the multimodal in one window, right?

126
00:07:30,920 --> 00:07:35,640
So now that we have all of the tools, we've got code interpreter, we've got DALL-E, we've

127
00:07:35,640 --> 00:07:45,000
got browse with Bing all in the same window, a lot of the time you kind of can't get it to do things

128
00:07:45,000 --> 00:07:48,760
that it would just do previously when you would select the tool you wanted.

129
00:07:48,760 --> 00:07:56,200
Now you have to really prompt it and give it a nudge and basically nag it or encourage

130
00:07:56,200 --> 00:07:58,000
it, you could say it the other way.

131
00:07:58,000 --> 00:07:59,720
So yeah, you can do it.

132
00:07:59,720 --> 00:08:05,280
I believe in you in order to get it to get the output that you want, whether that's actually

133
00:08:05,280 --> 00:08:06,600
using a particular tool.

134
00:08:06,600 --> 00:08:11,280
So code interpreter is the one that quite often I'm having to say, oh no, you can do

135
00:08:11,280 --> 00:08:12,280
that.

136
00:08:12,280 --> 00:08:14,680
You just have to use code interpreter and then it will go off and do it.

137
00:08:14,680 --> 00:08:16,800
Do you know what?

138
00:08:16,800 --> 00:08:22,480
It sounds like a moot point, but it's really not because I had a team member that was using

139
00:08:22,480 --> 00:08:28,200
Claude for a particular application this week and Claude did its thing that we've talked

140
00:08:28,200 --> 00:08:31,400
about previously where it came back and I went, oh, no, it sounds like sensitive material

141
00:08:31,400 --> 00:08:32,400
to me.

142
00:08:32,400 --> 00:08:33,400
I'm not sure I can help you out there.

143
00:08:33,400 --> 00:08:36,040
And of course she stopped.

144
00:08:36,040 --> 00:08:38,120
She was like, oh, I think Claude's broken.

145
00:08:38,120 --> 00:08:39,120
It's down.

146
00:08:39,120 --> 00:08:40,120
It's not working.

147
00:08:40,120 --> 00:08:45,960
And I had to encourage her to try and find clever ways of convincing Claude to do the

148
00:08:45,960 --> 00:08:50,320
thing she wanted, which it shouldn't be that way.

149
00:08:50,320 --> 00:08:54,680
But if it is that way at the moment, people need to know that that's part of the process.

150
00:08:54,680 --> 00:08:57,400
Otherwise, they're just going to give up.

151
00:08:57,400 --> 00:09:00,680
So if you're listening to this and you've used the tool and it tells you, we can't do

152
00:09:00,680 --> 00:09:04,200
something you want it to do, don't take that as read.

153
00:09:04,200 --> 00:09:09,080
Think of a clever way to convince it to do it, including as simple as have a go.

154
00:09:09,080 --> 00:09:11,200
I'm pretty sure you can do this.

155
00:09:11,200 --> 00:09:13,840
And that can sometimes be enough to get you the result that you want.

156
00:09:13,840 --> 00:09:14,840
Yeah.

157
00:09:14,840 --> 00:09:18,880
Or to throw in some of the emotional manipulation that we spoke about on the previous episode.

158
00:09:18,880 --> 00:09:24,080
Tell it that it's really important for your career and you might lose your job if it doesn't

159
00:09:24,080 --> 00:09:25,640
at least have a go.

160
00:09:25,640 --> 00:09:26,640
Yeah.

161
00:09:26,640 --> 00:09:32,200
And I wouldn't threaten it because we still don't know where the story ends.

162
00:09:32,200 --> 00:09:38,480
Still want to be in team AI if I for one welcome our robot overlords.

163
00:09:38,480 --> 00:09:40,480
So something to keep in mind.

164
00:09:40,480 --> 00:09:41,480
Yeah.

165
00:09:41,480 --> 00:09:43,480
So I think the context window is an improvement.

166
00:09:43,480 --> 00:09:48,200
In real time now, I just threw a transcript from one of our old podcasts in and it was

167
00:09:48,200 --> 00:09:50,200
like, no, no chance.

168
00:09:50,200 --> 00:09:51,200
Really?

169
00:09:51,200 --> 00:09:52,200
Yeah.

170
00:09:52,200 --> 00:09:54,840
But you might have access to Turbo and I don't.

171
00:09:54,840 --> 00:09:58,160
It says, yeah, the message you submit was too long.

172
00:09:58,160 --> 00:10:00,960
Please reload the conversation and submit something shorter.

173
00:10:00,960 --> 00:10:03,960
So maybe they're rolling that out slowly.

174
00:10:03,960 --> 00:10:07,800
Let's talk about some of the other bits and pieces that we saw come out of ChatGPT events.

175
00:10:07,800 --> 00:10:11,480
Let's talk about costs because costs.

176
00:10:11,480 --> 00:10:14,560
If you use ChatGPT Plus, you're spending $20 a month.

177
00:10:14,560 --> 00:10:16,320
Probably don't think too much about it.

178
00:10:16,320 --> 00:10:20,160
But if people are using the API to do interesting stuff, which I know that you're a big fan

179
00:10:20,160 --> 00:10:25,160
of Martin, they will obviously be paying for token usage.

180
00:10:25,160 --> 00:10:29,960
And some people have been running some interesting GPT-4 Turbo tests, haven't they?

181
00:10:29,960 --> 00:10:31,880
I think you sent me a couple of WhatsApps about this.

182
00:10:31,880 --> 00:10:32,880
What did you see?

183
00:10:32,880 --> 00:10:33,880
Yeah.

184
00:10:33,880 --> 00:10:34,880
It's catching people out.

185
00:10:34,880 --> 00:10:42,680
The token cost has come down to a third of what it was previously.

186
00:10:42,680 --> 00:10:50,440
Overall, if you take input and output tokens, it's about 2.7 times cheaper.

187
00:10:50,440 --> 00:10:53,840
I think that's what they said it worked out at.

188
00:10:53,840 --> 00:10:59,040
But the problem is when you have this massive context window available to you, people tend

189
00:10:59,040 --> 00:11:02,280
to use that and that adds up in the input.

190
00:11:02,280 --> 00:11:08,920
And if you start having a back and forth conversation with the chat, every single input, every inference

191
00:11:08,920 --> 00:11:17,440
that you have, so every input prompt, uses that 128K window if you're maximising that

192
00:11:17,440 --> 00:11:19,440
length.

193
00:11:19,440 --> 00:11:30,360
And that ends up being about a dollar per query because it's one cent per thousand tokens.

194
00:11:30,360 --> 00:11:37,520
So every input soon racks up and people have been landing themselves with just testing,

195
00:11:37,520 --> 00:11:38,520
right?

196
00:11:38,520 --> 00:11:45,920
I saw one person got a $120 bill just while they were trying to hack together an assistant.

197
00:11:45,920 --> 00:11:49,560
Yeah, like a day's worth of testing, $120 bill.

198
00:11:49,560 --> 00:11:54,040
I've got a cap on mine of $30 for exactly this reason.

199
00:11:54,040 --> 00:11:57,320
If you're playing with the API, you probably learnt this lesson the hard way like Martin

200
00:11:57,320 --> 00:11:58,480
just described.

201
00:11:58,480 --> 00:12:04,600
But if you haven't and you haven't got a cap, go put a cap on it because GPT-4 Turbo is

202
00:12:04,600 --> 00:12:08,840
going to eat your money if you are sending it lots of stuff.

203
00:12:08,840 --> 00:12:12,760
The other thing is the multimodal in one window.

204
00:12:12,760 --> 00:12:16,280
So again, one of my team members was using the tool.

205
00:12:16,280 --> 00:12:21,760
I made a suggestion for a prompt they could use to get an output they were looking for.

206
00:12:21,760 --> 00:12:29,040
And the lady in question took a screen grab, sent it over to me and she said, I don't really

207
00:12:29,040 --> 00:12:36,120
know why, but ChatGPT has produced an image because it just misunderstood the prompt.

208
00:12:36,120 --> 00:12:38,840
It went, oh, I know what I need for this, a DALL-E 3 image.

209
00:12:38,840 --> 00:12:40,280
And I'm like, yeah.

210
00:12:40,280 --> 00:12:45,960
So therein lies the pros and cons of multimodal GPT-4.

211
00:12:45,960 --> 00:12:51,820
If you in the modern world, because it's trying to guess what it needs to use in the backend

212
00:12:51,820 --> 00:12:55,140
to serve your query, sometimes it's going to get it wrong because it may or may not

213
00:12:55,140 --> 00:12:56,140
pull in data analysis.

214
00:12:56,140 --> 00:12:58,820
It may or may not pull in DALL-E 3.

215
00:12:58,820 --> 00:13:04,320
And so whilst on average, I think it's better that I can just say, give me an image.

216
00:13:04,320 --> 00:13:10,500
That was an interesting byproduct of what this multimodality does because sometimes

217
00:13:10,500 --> 00:13:11,500
it uses it.

218
00:13:11,500 --> 00:13:14,680
So do you know what you're doing in the text in my experience, you're in your prompt, you

219
00:13:14,680 --> 00:13:17,440
have to say, using DALL-E 3, can you do this?

220
00:13:17,440 --> 00:13:19,160
I'd rather click a dropdown.

221
00:13:19,160 --> 00:13:20,160
Yeah.

222
00:13:20,160 --> 00:13:21,160
Yeah, absolutely.

223
00:13:21,160 --> 00:13:27,740
You have to be really specific and tell it how you want it to operate.

224
00:13:27,740 --> 00:13:35,240
It is curious some of the decisions it makes sometimes with this new multimodal piece.

225
00:13:35,240 --> 00:13:42,620
And actually I think that's, I'm finding it very frustrating.

226
00:13:42,620 --> 00:13:49,120
My overall experience with ChatGPT is, I would say, worsening.

227
00:13:49,120 --> 00:13:55,840
We'll come to talk about the performance of the app in a moment, but yeah, I think they've

228
00:13:55,840 --> 00:14:03,480
bundled a lot into it and handed a lot of control over to the model and the model isn't

229
00:14:03,480 --> 00:14:05,320
quite getting it right at this moment in time.

230
00:14:05,320 --> 00:14:08,080
So I'm sure it will improve.

231
00:14:08,080 --> 00:14:09,080
I have faith.

232
00:14:09,080 --> 00:14:14,800
See, I feel like you, it's probably come across, dear listener in the podcast so far, but I'm

233
00:14:14,800 --> 00:14:20,080
like a bit peed off with ChatGPT and OpenAI at the moment because I was like, the developer

234
00:14:20,080 --> 00:14:22,920
conference was exciting, like these things always are and it's like, oh my goodness,

235
00:14:22,920 --> 00:14:24,540
this is going to be so cool.

236
00:14:24,540 --> 00:14:27,760
And then there's like a bunch of things you play with them and they're like so friction

237
00:14:27,760 --> 00:14:33,880
full as to make me go, like it breaks all the time in this area, in this area, in this

238
00:14:33,880 --> 00:14:39,240
area, effort, effort, effort.

239
00:14:39,240 --> 00:14:44,240
This releasing early approach that OpenAI has always taken, which I think as a user,

240
00:14:44,240 --> 00:14:47,480
we need to try and remember that means that it will break a lot and all those other things.

241
00:14:47,480 --> 00:14:52,960
Like we're playing with a constant beta near alpha at times.

242
00:14:52,960 --> 00:14:58,760
But as people who have jobs to do and are used to using polished software products,

243
00:14:58,760 --> 00:15:00,920
it eats up time and some of it's fun.

244
00:15:00,920 --> 00:15:05,320
Like I like problem solving some of the stuff sometimes I'm like, oh, if I say it like this,

245
00:15:05,320 --> 00:15:06,320
maybe I'll get this.

246
00:15:06,320 --> 00:15:07,840
And that's kind of satisfying.

247
00:15:07,840 --> 00:15:12,080
But when I'm in productivity mode, because I've got stuff to get done, that is actually

248
00:15:12,080 --> 00:15:13,080
really annoying.

249
00:15:13,080 --> 00:15:15,000
But why don't we talk about some of the issues?

250
00:15:15,000 --> 00:15:18,840
I know we want to talk about text to speech and we want to talk a bit about GPTs, but

251
00:15:18,840 --> 00:15:22,040
let's talk about some of those issues seeing as we're in them, because there's been a

252
00:15:22,040 --> 00:15:25,680
few problems since some of these things rolled out, aren't there, Martin?

253
00:15:25,680 --> 00:15:26,680
There has.

254
00:15:26,680 --> 00:15:30,640
And it started off with within the first couple of days after the developer conference, there

255
00:15:30,640 --> 00:15:37,480
was downtime, which seemed to kind of extend over about a 48 hour period where it was just

256
00:15:37,480 --> 00:15:41,480
very intermittent.

257
00:15:41,480 --> 00:15:43,440
You would log in and it would say no.

258
00:15:43,440 --> 00:15:45,240
And then 20 minutes later it would say yes.

259
00:15:45,240 --> 00:15:46,680
And then it was up and down and up and down.

260
00:15:46,680 --> 00:15:49,980
And it was, that was very frustrating.

261
00:15:49,980 --> 00:15:56,920
They said that that was because of a distributed denial of service attack on their servers.

262
00:15:56,920 --> 00:15:59,600
And I'm sure they're sticking with that line, but they've subsequently said that they're

263
00:15:59,600 --> 00:16:08,080
stopping new signups to ChatGPT Plus, which they announced a couple of days ago, because

264
00:16:08,080 --> 00:16:14,800
they said that it's been unprecedented demand for the service and they need to scale up

265
00:16:14,800 --> 00:16:17,800
the infrastructure.

266
00:16:17,800 --> 00:16:25,240
As a heavy user, as a power user of the platform, I am definitely seeing performance issues.

267
00:16:25,240 --> 00:16:32,600
GPT-4 Turbo at times this week has felt anything but turbo.

268
00:16:32,600 --> 00:16:37,280
It was supposed to be a big input speed improvement.

269
00:16:37,280 --> 00:16:38,280
It was terrible.

270
00:16:38,280 --> 00:16:40,120
It was the words per minute.

271
00:16:40,120 --> 00:16:44,040
It was as if I was running the model locally on my machine.

272
00:16:44,040 --> 00:16:46,760
The output was so slow to the point where...

273
00:16:46,760 --> 00:16:50,480
Raspberry Pi is what it was more like.

274
00:16:50,480 --> 00:16:53,860
And there's been times where this week there's been a couple of tasks where I've just wanted

275
00:16:53,860 --> 00:16:56,240
something that didn't require GPT-4.

276
00:16:56,240 --> 00:17:02,720
So I've switched over to GPT-3.5 thinking, well, I know that that's absolutely rapid.

277
00:17:02,720 --> 00:17:07,720
The responses on that, the latency is non-existent really.

278
00:17:07,720 --> 00:17:13,800
And even that has been as if someone has been typing it out manually at the backend.

279
00:17:13,800 --> 00:17:17,560
So I'm really noticing this as a performance issue.

280
00:17:17,560 --> 00:17:22,120
The other thing that they've done is they've reduced the amount of images that it will

281
00:17:22,120 --> 00:17:23,800
create in any example.

282
00:17:23,800 --> 00:17:28,240
So when they first announced it was built in, you could create four images at a time.

283
00:17:28,240 --> 00:17:30,920
It will now only output one yesterday.

284
00:17:30,920 --> 00:17:36,560
It was only letting me do one at a time and that's progressively come down.

285
00:17:36,560 --> 00:17:39,680
And they've reduced the input messages.

286
00:17:39,680 --> 00:17:45,080
So it was previously 50 messages in a rolling three hour window that you could have with

287
00:17:45,080 --> 00:17:46,080
GPT-4.

288
00:17:46,080 --> 00:17:48,440
And that has now come down to 40.

289
00:17:48,440 --> 00:17:49,440
Yeah.

290
00:17:49,440 --> 00:17:53,600
So the image thing has annoyed the hell out of me, honestly.

291
00:17:53,600 --> 00:18:01,140
I use it a lot for campaign idea generation and rough mockups, either to brief designers

292
00:18:01,140 --> 00:18:05,780
or to inspire a client with a particular creative direction, I think we might go.

293
00:18:05,780 --> 00:18:10,560
And when you're iterating through those in the early days, you'd have four images.

294
00:18:10,560 --> 00:18:16,840
So it was actually much faster because especially if you didn't over prompt it, it would give

295
00:18:16,840 --> 00:18:23,040
you a line drawing example, a graphic, a photo realistic example, and then I don't know,

296
00:18:23,040 --> 00:18:28,400
something a bit more like a realistic type painting type example.

297
00:18:28,400 --> 00:18:31,880
And that would really give you a feel for what's the different type of effects that

298
00:18:31,880 --> 00:18:34,440
we could get with this type of campaign idea.

299
00:18:34,440 --> 00:18:38,280
And now when you've got to do that image by image, it just takes longer.

300
00:18:38,280 --> 00:18:44,360
Then because the reduced amount of messages at some point fairly quick, it just goes,

301
00:18:44,360 --> 00:18:48,580
yeah, we're not going to produce any more images for you for a while.

302
00:18:48,580 --> 00:18:50,080
And it's like, well, so let me get this right.

303
00:18:50,080 --> 00:18:52,000
You used to give me four, now you give me one.

304
00:18:52,000 --> 00:18:54,800
So now I've got to send you more messages to get more images.

305
00:18:54,800 --> 00:18:59,080
And now you're going to tap out before I even get like 10 images.

306
00:18:59,080 --> 00:19:01,700
Whereas before that's like two and a half queries worth.

307
00:19:01,700 --> 00:19:07,640
And it's like, look, I'm whinging and moaning about this because I'm on a bit of a annoyance

308
00:19:07,640 --> 00:19:10,120
with OpenAI because of all this at the moment.

309
00:19:10,120 --> 00:19:13,700
Am I surprised that their backend systems are struggling with all these new launches

310
00:19:13,700 --> 00:19:15,360
and all this new capability?

311
00:19:15,360 --> 00:19:16,360
Of course not.

312
00:19:16,360 --> 00:19:17,360
That's exactly what I would expect.

313
00:19:17,360 --> 00:19:22,640
And if I was a normal, happy going human being, and it wasn't a Friday, then I'd probably

314
00:19:22,640 --> 00:19:25,200
be like, give them a couple of weeks to sort it out.

315
00:19:25,200 --> 00:19:31,680
But I need them to balance giving us additional cool stuff with making sure the stuff that

316
00:19:31,680 --> 00:19:35,280
they actually gave us already still works well.

317
00:19:35,280 --> 00:19:36,280
Because I rely on it.

318
00:19:36,280 --> 00:19:37,640
Do you, I mean, do you rely on it?

319
00:19:37,640 --> 00:19:38,640
I do.

320
00:19:38,640 --> 00:19:39,840
Yeah, a hundred percent.

321
00:19:39,840 --> 00:19:40,920
They launched the product.

322
00:19:40,920 --> 00:19:45,800
It was so reliable and consistent that it became part of my workflow.

323
00:19:45,800 --> 00:19:52,360
If we think about that concept of Centaurs and cyborgs, you know, I was very much in

324
00:19:52,360 --> 00:19:53,440
the cyborg mode.

325
00:19:53,440 --> 00:19:57,960
It was fully integrated into my day-to-day workflow.

326
00:19:57,960 --> 00:20:06,520
And now I'm finding I'm just hitting these barriers and yeah, it is a frustrating experience.

327
00:20:06,520 --> 00:20:13,560
I think they're looking at possibly increasing the cost or having a two tier pricing system

328
00:20:13,560 --> 00:20:14,560
as well.

329
00:20:14,560 --> 00:20:20,620
So there was, I think on the OpenAI Discord, I believe there was a survey that was going

330
00:20:20,620 --> 00:20:27,540
around asking people, would they be prepared to pay more for another tier, which would

331
00:20:27,540 --> 00:20:32,080
give you, you know, more limits, better limits, basically.

332
00:20:32,080 --> 00:20:34,360
Yeah, I'm glad.

333
00:20:34,360 --> 00:20:35,960
I'm glad they're having that conversation.

334
00:20:35,960 --> 00:20:39,560
I've spoken with power users and I think they'd pay more.

335
00:20:39,560 --> 00:20:44,600
I think they feel the value they get for $20 a month is massively outsized to the costs.

336
00:20:44,600 --> 00:20:47,160
And of course, like most of us, they're going to pay as little as they can, of course, because

337
00:20:47,160 --> 00:20:50,280
it's not going to go, oh, by the way, here, have more of my money.

338
00:20:50,280 --> 00:20:56,620
But I would pay 40, $50 a month, I think, to have the reliability improved and the limits

339
00:20:56,620 --> 00:20:57,620
removed.

340
00:20:57,620 --> 00:21:01,360
Yeah, agreed.

341
00:21:01,360 --> 00:21:05,800
Let's talk text to speech model, because I know you think that's pretty cool.

342
00:21:05,800 --> 00:21:10,040
What have your thoughts been on that text to speech capability?

343
00:21:10,040 --> 00:21:11,040
I really like it.

344
00:21:11,040 --> 00:21:14,960
So they made it available via API.

345
00:21:14,960 --> 00:21:18,560
It's incredibly easy to, to plug into.

346
00:21:18,560 --> 00:21:25,040
I'm not a developer, but I just asked ChatGPT to make me a little app that allowed me to

347
00:21:25,040 --> 00:21:26,040
interface with it.

348
00:21:26,040 --> 00:21:31,920
Chat gave it the API documentation as part of the prompt, literally went onto the documentation

349
00:21:31,920 --> 00:21:38,120
page on the website, Control A, Control C, Control V into the chat window and said, make

350
00:21:38,120 --> 00:21:44,240
me a little app that I can run on my desktop that will do this using Python with a GUI.

351
00:21:44,240 --> 00:21:47,680
So I got a graphical user interface and now I have an app that does it.

352
00:21:47,680 --> 00:21:48,680
You get six voices.

353
00:21:48,680 --> 00:21:51,160
I think it's six at this stage.

354
00:21:51,160 --> 00:21:56,920
There's one British voice in the mix and five American.

355
00:21:56,920 --> 00:21:57,920
They sound good.

356
00:21:57,920 --> 00:22:00,800
There's two quality outputs.

357
00:22:00,800 --> 00:22:03,880
So you've got a standard and an HD.

358
00:22:03,880 --> 00:22:08,040
The HD has a bit more variation, a little bit more emotion in it.

359
00:22:08,040 --> 00:22:11,240
It isn't on the same level as ElevenLabs.

360
00:22:11,240 --> 00:22:21,080
The ElevenLabs voices are still the best quality with the most emotional range, but it's got

361
00:22:21,080 --> 00:22:29,720
considerably cheaper API requests, and the latency is very good.

362
00:22:29,720 --> 00:22:33,760
So it's very quick and you know what?

363
00:22:33,760 --> 00:22:40,520
It's really good enough for most people's applications.

364
00:22:40,520 --> 00:22:44,880
This is one of those where now in a very bipolar way, I'm going to swing to the other side

365
00:22:44,880 --> 00:22:50,160
where I'm like, I love the fact that OpenAI released this through their mobile app not

366
00:22:50,160 --> 00:22:51,280
that long ago.

367
00:22:51,280 --> 00:22:55,480
You could speak to the app and it would speak back to you in a very Star Trek computer style.

368
00:22:55,480 --> 00:22:57,280
My goodness, this is cool.

369
00:22:57,280 --> 00:23:00,440
And then in the blink of an eye, they make it available to developers.

370
00:23:00,440 --> 00:23:03,200
And I think that's actually awesome.

371
00:23:03,200 --> 00:23:07,960
And the stuff you did with it and sent me was so cool.

372
00:23:07,960 --> 00:23:12,520
Just to stress for people who are new listeners, Martin is not a developer.

373
00:23:12,520 --> 00:23:15,040
He is an exceptional problem solver and thinker.

374
00:23:15,040 --> 00:23:21,000
He gets in with ChatGPT and collaborates to build this software, but you're not a developer.

375
00:23:21,000 --> 00:23:23,440
You don't have any developer training.

376
00:23:23,440 --> 00:23:28,240
But working with ChatGPT and probably half a day or a day, you've got this thing up and

377
00:23:28,240 --> 00:23:29,520
running.

378
00:23:29,520 --> 00:23:39,240
And I think it's going to be really interesting to see how our interaction with computers

379
00:23:39,240 --> 00:23:43,920
changes over the next 12 to 24 months.

380
00:23:43,920 --> 00:23:49,600
We got the launch of the Humane AI Pin this week or last week.

381
00:23:49,600 --> 00:23:55,160
I think the first time we had any real details about how it works.

382
00:23:55,160 --> 00:23:59,400
I don't think it's going to catch on, if I'm really honest.

383
00:23:59,400 --> 00:24:00,400
But it's interesting.

384
00:24:00,400 --> 00:24:04,040
For those that don't know, the Humane AI Pin is like a...

385
00:24:04,040 --> 00:24:06,360
It almost looks like a badge that you wear on your shirt.

386
00:24:06,360 --> 00:24:10,160
It's got a camera, it's got a mic, it's got a speaker.

387
00:24:10,160 --> 00:24:13,520
And what they're trying to do is create a system by which you can do quite a few things

388
00:24:13,520 --> 00:24:16,640
you'd want to do in your life that you would usually use your phone for, but you don't

389
00:24:16,640 --> 00:24:18,080
need a phone.

390
00:24:18,080 --> 00:24:20,560
So you want to make a call, you just tell it you want to make a call.

391
00:24:20,560 --> 00:24:22,400
You want it to summarize your emails.

392
00:24:22,400 --> 00:24:25,680
You want it to write your emails and do some replies for you while you dictate or do all

393
00:24:25,680 --> 00:24:26,680
of that.

394
00:24:26,680 --> 00:24:30,500
My favorite application was actually translation in real time.

395
00:24:30,500 --> 00:24:34,160
If you're having a conversation with someone else, you tell it, I'm speaking with my friend

396
00:24:34,160 --> 00:24:35,160
who's Spanish.

397
00:24:35,160 --> 00:24:37,120
When I speak, translate me into Spanish.

398
00:24:37,120 --> 00:24:40,320
When they speak, translate what they say to me into English.

399
00:24:40,320 --> 00:24:43,080
That was pretty cool.

400
00:24:43,080 --> 00:24:47,320
The launch video was not as exciting as it could have been.

401
00:24:47,320 --> 00:24:50,200
But that's another example of this kind of similar technology, right?

402
00:24:50,200 --> 00:24:57,360
Where you speak to something rather than typing and it speaks back rather than you reading.

403
00:24:57,360 --> 00:25:02,360
People are doing some really interesting things with the GPT-4 vision capability.

404
00:25:02,360 --> 00:25:04,800
Have you seen some of this?

405
00:25:04,800 --> 00:25:05,800
What have you seen?

406
00:25:05,800 --> 00:25:19,120
So there was the video clip of Messi dribbling and they did real time translation and no,

407
00:25:19,120 --> 00:25:21,080
it was a real time commentary, wasn't it?

408
00:25:21,080 --> 00:25:22,080
And ChatGPT.

409
00:25:22,080 --> 00:25:26,240
For non football fans, he was dribbling a football rather than dribbling on his own

410
00:25:26,240 --> 00:25:27,240
face.

411
00:25:27,240 --> 00:25:30,240
Imagine the commentary of that.

412
00:25:30,240 --> 00:25:34,360
That one's going to drip off his chin if he's not careful.

413
00:25:34,360 --> 00:25:42,920
Yeah, so it was Leo Messi dribbling down the, I think it was the right wing and ChatGPT

414
00:25:42,920 --> 00:25:44,280
was the vision.

415
00:25:44,280 --> 00:25:50,480
I think they'd programmed it so it was looking at one frame in 10 or something like that.

416
00:25:50,480 --> 00:25:54,560
And then describing the scene and it was doing that throughout.

417
00:25:54,560 --> 00:25:59,920
And it ended up putting this actually quite decent audio commentary together.

418
00:25:59,920 --> 00:26:03,400
I mean, certainly for a first blast, it was very good.

419
00:26:03,400 --> 00:26:08,400
Is it going to replace Martin Tyler on Sky Sports?

420
00:26:08,400 --> 00:26:09,760
Absolutely not.

421
00:26:09,760 --> 00:26:18,800
But for something that someone rustled up in their office in a few hours, it was pretty

422
00:26:18,800 --> 00:26:19,800
impressive.

423
00:26:19,800 --> 00:26:20,800
What have you seen?

424
00:26:20,800 --> 00:26:23,280
Yeah, I thought that was a good proof of concept.

425
00:26:23,280 --> 00:26:28,080
I think where I'm coming at this from is some of the tests where people in essence are using

426
00:26:28,080 --> 00:26:32,800
GPT-4 vision as a mechanism for computers to understand what's on a screen and then

427
00:26:32,800 --> 00:26:34,640
take actions.

428
00:26:34,640 --> 00:26:42,000
So let's imagine that I want to book a holiday in order to be able to tell a computer, go

429
00:26:42,000 --> 00:26:47,480
book me a holiday to Argentina to go see Leo Messi and his family.

430
00:26:47,480 --> 00:26:51,280
Not that he lives there, of course, because he doesn't play football there at the

431
00:26:51,280 --> 00:26:52,920
moment, but there you go.

432
00:26:52,920 --> 00:26:55,920
That's a very complicated process that a computer needs to go through.

433
00:26:55,920 --> 00:26:59,080
We need to be able to browse to a website, but we need to understand the context of what's

434
00:26:59,080 --> 00:27:00,080
on that website.

435
00:27:00,080 --> 00:27:03,480
I guess it could do it through HTML and CSS understanding, which to be honest is probably

436
00:27:03,480 --> 00:27:09,360
still the best way to do it, at least from my very naive, un-technical perspective.

437
00:27:09,360 --> 00:27:12,960
But I was interested in seeing people using the vision capability as a mechanism of in

438
00:27:12,960 --> 00:27:17,640
essence taking a picture of that site, which GPT-4 vision can interpret.

439
00:27:17,640 --> 00:27:23,640
It knows where the buttons are, it can read the text, and then creating agents off the

440
00:27:23,640 --> 00:27:24,640
back of that.

441
00:27:24,640 --> 00:27:32,120
And so I'm ever thinking about the days when I sit here and the keyboard, I'm using it

442
00:27:32,120 --> 00:27:33,120
less.

443
00:27:33,120 --> 00:27:37,120
I'm speaking more to the computer and it's doing things based on what I ask.

444
00:27:37,120 --> 00:27:42,160
And again, I think OpenAI providing access to not only text to speech and speech to text,

445
00:27:42,160 --> 00:27:46,500
a lot of which we've had a while through Whisper, but also GPT-4 vision.

446
00:27:46,500 --> 00:27:50,920
Because I think all these multimodal things are going to allow people to create really

447
00:27:50,920 --> 00:27:53,000
interesting use cases.

448
00:27:53,000 --> 00:27:59,720
And I think products and tools will pop up that we just can't imagine would be useful.

449
00:27:59,720 --> 00:28:03,120
And then people are just having this epiphany where they're like, wait a minute, if the

450
00:28:03,120 --> 00:28:09,160
computer can see and the computer can hear and the computer can speak, what can I now

451
00:28:09,160 --> 00:28:10,160
get it to do?

452
00:28:10,160 --> 00:28:12,520
And the computer can take action, right?

453
00:28:12,520 --> 00:28:13,520
Absolutely.

454
00:28:13,520 --> 00:28:16,680
That's the additional part of that piece.

455
00:28:16,680 --> 00:28:24,280
Yeah, I saw the one, there was one bit of research that found that GPT-4 vision with

456
00:28:24,280 --> 00:28:32,840
iPhone screenshots was able to successfully complete tasks and actions on the iPhone about

457
00:28:32,840 --> 00:28:34,920
75% of the time.

458
00:28:34,920 --> 00:28:41,000
So it's just the start of this whole process, isn't it?

459
00:28:41,000 --> 00:28:46,960
And again, I think that bit is actually kind of critical because something that gets it

460
00:28:46,960 --> 00:28:49,920
wrong 25% of the time is actually not that useful.

461
00:28:49,920 --> 00:28:51,440
That's absolutely trash, really.

462
00:28:51,440 --> 00:28:57,080
It's like, just turn up to Heathrow Airport, looking forward to my holiday and the flight

463
00:28:57,080 --> 00:28:58,400
hasn't been booked after all.

464
00:28:58,400 --> 00:29:02,880
It just reported that it has because it hallucinated having done it or it's booked me a flight

465
00:29:02,880 --> 00:29:07,480
and I'm not going to, I don't know, Argentina, I'm going to Albania or whatever because it's

466
00:29:07,480 --> 00:29:12,200
just completely, it understood all of the tasks apart from the Argentina part, it got

467
00:29:12,200 --> 00:29:15,400
slightly wrong, which is not going to hold water.

468
00:29:15,400 --> 00:29:19,360
And I guess brings us full circle to the start of this conversation, which is lots of these

469
00:29:19,360 --> 00:29:24,880
capabilities have immense potential, but the stuff that goes wrong with them yet is still

470
00:29:24,880 --> 00:29:30,800
large enough in a lot of cases that human in the loop in so many aspects is really important.

471
00:29:30,800 --> 00:29:37,840
And until some of that improves, the independence of these tools is going to be limited.

472
00:29:37,840 --> 00:29:39,480
Yeah, a hundred percent.

473
00:29:39,480 --> 00:29:45,880
Just on the human computer interface point, I had an interesting experience this week

474
00:29:45,880 --> 00:29:57,880
with my two and a half year old at the breakfast table. He's been using the smart speaker,

475
00:29:57,880 --> 00:30:04,920
whose name I will not say, but he's been asking it to play songs regularly saying Amazon,

476
00:30:04,920 --> 00:30:07,240
play this song.

477
00:30:07,240 --> 00:30:12,400
And he's been, he turned around to it the other day and said the trigger word and then

478
00:30:12,400 --> 00:30:15,100
asked a question, which he's not done before.

479
00:30:15,100 --> 00:30:19,160
He's always just asked it to play, to do an action.

480
00:30:19,160 --> 00:30:24,200
But it was one of those questions that it's not geared up for, but I thought, oh, this

481
00:30:24,200 --> 00:30:25,200
is interesting.

482
00:30:25,200 --> 00:30:30,400
I'll do it with the GPT-4 voice interaction.

483
00:30:30,400 --> 00:30:38,240
So we fired it up at the breakfast table and we burned through our questions, our 40 message

484
00:30:38,240 --> 00:30:43,080
limit with him just going back and forth asking questions.

485
00:30:43,080 --> 00:30:50,040
And it was, you know, it was everything from why is water cold to why is ketchup?

486
00:30:50,040 --> 00:30:52,760
Well, that's very philosophical.

487
00:30:52,760 --> 00:30:58,720
But we ended up having, you know, he was going on about animals and dinosaurs and all sorts.

488
00:30:58,720 --> 00:31:05,480
So it was a real wide ranging discussion and there was a real ebb and flow to the conversation.

489
00:31:05,480 --> 00:31:07,600
He was treating it like a person.

490
00:31:07,600 --> 00:31:13,600
And I thought, wow, at two and a half years old, his experience of interfacing with computers

491
00:31:13,600 --> 00:31:16,200
is going to be so different.

492
00:31:16,200 --> 00:31:20,400
Fast forward, he's going to, it's just going to be so intuitive for him that, you know,

493
00:31:20,400 --> 00:31:25,040
tablets, the idea that when we see toddlers playing on tablets now, people have gone,

494
00:31:25,040 --> 00:31:27,480
oh my God, they're just picking up so quickly.

495
00:31:27,480 --> 00:31:32,200
This generation is going to be talking to computers; it's going to be second nature to them.

496
00:31:32,200 --> 00:31:35,480
Yeah, it's, I think that's spot on.

497
00:31:35,480 --> 00:31:41,760
Bill Gates wrote a blog this week, basically on the same theme that we're discussing now,

498
00:31:41,760 --> 00:31:46,200
like how people interact with computers has not changed that much.

499
00:31:46,200 --> 00:31:52,160
And since we moved from DOS based coding type interfaces into Windows, the type click on

500
00:31:52,160 --> 00:31:56,880
stuff interfaces, and actually the click on stuff interface is now the best part of 30 years

501
00:31:56,880 --> 00:31:59,160
old, something like that.

502
00:31:59,160 --> 00:32:00,880
But it hasn't really moved on.

503
00:32:00,880 --> 00:32:04,760
And this could be the moment that it does.

504
00:32:04,760 --> 00:32:11,400
This is probably a bit too pontification central for most marketers listening, but that is

505
00:32:11,400 --> 00:32:16,240
obviously going to fundamentally change how people access information, how they search

506
00:32:16,240 --> 00:32:20,480
for stuff, how they buy stuff.

507
00:32:20,480 --> 00:32:27,120
So a lot of the go to market strategies that we use right now, like SEO, even like websites.

508
00:32:27,120 --> 00:32:28,120
Yeah.

509
00:32:28,120 --> 00:32:29,760
Will we have websites?

510
00:32:29,760 --> 00:32:32,440
If we do, how will, who will interact with them?

511
00:32:32,440 --> 00:32:34,160
Agents or us or both?

512
00:32:34,160 --> 00:32:38,760
Or that's going to be a really interesting thing to see how that plays out over time.

513
00:32:38,760 --> 00:32:40,400
It could change radically.

514
00:32:40,400 --> 00:32:47,320
I think just thinking about this out loud, people changing their behavior actually takes

515
00:32:47,320 --> 00:32:49,520
quite a long time.

516
00:32:49,520 --> 00:32:54,560
So this is unlikely to be a tomorrow we are in a speak to computer world.

517
00:32:54,560 --> 00:32:59,000
It's going to be like, Oh, over 10 years, people really got used to speaking with computers

518
00:32:59,000 --> 00:33:00,000
over time.

519
00:33:00,000 --> 00:33:03,120
And before they really knew it, they were like, Oh, I don't use my keyboard that much

520
00:33:03,120 --> 00:33:04,120
anymore.

521
00:33:04,120 --> 00:33:06,760
And I think it's going to be more like that than brilliant.

522
00:33:06,760 --> 00:33:08,440
I don't have to use my keyboard ever again.

523
00:33:08,440 --> 00:33:10,560
I'm throwing it in the trash tomorrow. Today,

524
00:33:10,560 --> 00:33:12,360
I'm a keyboard user. Tomorrow,

525
00:33:12,360 --> 00:33:13,360
Never again.

526
00:33:13,360 --> 00:33:14,360
Like it's going to be slow, isn't it?

527
00:33:14,360 --> 00:33:15,360
It is.

528
00:33:15,360 --> 00:33:17,360
Do you hate my keyboard though?

529
00:33:17,360 --> 00:33:18,360
I would do.

530
00:33:18,360 --> 00:33:24,080
I'm actually using a mixture of AudioPen and another text-to-speech tool, and I'm dictating

531
00:33:24,080 --> 00:33:26,840
80% of my emails at this point.

532
00:33:26,840 --> 00:33:32,880
And I'm finding it a real productivity boost as a complete aside.

533
00:33:32,880 --> 00:33:37,840
So I'm trying that. I'm ready for the keyboardless future of mine.

534
00:33:37,840 --> 00:33:40,880
Just tell me what we got to do to make it happen.

535
00:33:40,880 --> 00:33:41,880
Right.

536
00:33:41,880 --> 00:33:43,280
Is there anything else?

537
00:33:43,280 --> 00:33:44,760
Oh, is there anything else?

538
00:33:44,760 --> 00:33:45,760
Yeah.

539
00:33:45,760 --> 00:33:46,760
Just a small matter of GPTs.

540
00:33:46,760 --> 00:33:52,480
We've sort of left the most potentially exciting, but also controversial aspect until the very

541
00:33:52,480 --> 00:33:53,480
end.

542
00:33:53,480 --> 00:33:54,480
GPTs.

543
00:33:54,480 --> 00:33:59,680
Tell us for the listeners who are maybe not so nerdy about this stuff, tell us what GPTs

544
00:33:59,680 --> 00:34:00,680
are.

545
00:34:00,680 --> 00:34:10,480
GPTs are the consumer facing interface for creating your own specialized assistants.

546
00:34:10,480 --> 00:34:14,640
And if you're a developer, you can use the API and create assistants.

547
00:34:14,640 --> 00:34:23,160
And if you're just a consumer using ChatGPT, you can build your own specialist GPT in

548
00:34:23,160 --> 00:34:24,160
ChatGPT.

549
00:34:24,160 --> 00:34:26,200
And they're made up of three components.

550
00:34:26,200 --> 00:34:31,640
There's the instruction, which is basically the system prompt, which primes the GPT to

551
00:34:31,640 --> 00:34:32,840
play a particular role.

552
00:34:32,840 --> 00:34:36,040
And this is where you tell it how it should operate.

553
00:34:36,040 --> 00:34:38,440
Then you've got knowledge.

554
00:34:38,440 --> 00:34:40,920
Knowledge is information that you can provide.

555
00:34:40,920 --> 00:34:49,280
So it could be, in the example that Sam Altman gave at the demo day, he uploaded a lecture and

556
00:34:49,280 --> 00:34:54,460
it was the transcript from a lecture that he delivered about building and growing and

557
00:34:54,460 --> 00:34:56,040
scaling startups.

558
00:34:56,040 --> 00:34:59,160
So that was the kind of knowledge that he'd implanted in it.

559
00:34:59,160 --> 00:35:01,720
But this could be proprietary knowledge.

560
00:35:01,720 --> 00:35:06,600
It could be speeches, lectures, it could be particular data sets, whatever.

561
00:35:06,600 --> 00:35:12,480
You're going to give it some knowledge that it doesn't have, or it might have in its training

562
00:35:12,480 --> 00:35:19,720
set, but you want to make that kind of front and center in its interactions with the user.

563
00:35:19,720 --> 00:35:23,920
So you've got instructions, knowledge, and the final thing is actions.

564
00:35:23,920 --> 00:35:28,640
And actions are the thing that it can do.

565
00:35:28,640 --> 00:35:35,520
A while ago, OpenAI announced a thing called function calling and function calling enables

566
00:35:35,520 --> 00:35:41,920
you to connect external software using APIs with ChatGPT.

567
00:35:41,920 --> 00:35:50,520
So now you can do that in this interface and you can make your GPT, your ChatGPT assistant,

568
00:35:50,520 --> 00:35:58,400
as it were, connect to external tools and perform actions based on the interaction that

569
00:35:58,400 --> 00:35:59,800
you have with someone.

570
00:35:59,800 --> 00:36:04,680
As well as that, you can also turn on, what can you turn on?

571
00:36:04,680 --> 00:36:06,120
You can turn on all of the tools, can't you?

572
00:36:06,120 --> 00:36:11,840
You can turn on Code Interpreter and you can turn on DALL-E.

573
00:36:11,840 --> 00:36:12,840
So it can-

574
00:36:12,840 --> 00:36:13,840
Web browsing.

575
00:36:13,840 --> 00:36:15,440
Yeah, web browsing as well.

576
00:36:15,440 --> 00:36:16,600
Or you can turn them off.

577
00:36:16,600 --> 00:36:20,680
If you're building the app, you can choose which ones are available and which ones are

578
00:36:20,680 --> 00:36:21,920
not available.

579
00:36:21,920 --> 00:36:26,600
And then you can publish it and you can make it public to the world or you can have it

580
00:36:26,600 --> 00:36:30,800
just private and have it be an assistant to you.

581
00:36:30,800 --> 00:36:38,640
And in some respects, it's pretty similar to if you've got like a prompt library or

582
00:36:38,640 --> 00:36:43,960
you operate like me and you have bookmarked conversations that you go back and edit for

583
00:36:43,960 --> 00:36:44,960
a particular task.

584
00:36:44,960 --> 00:36:45,960
It's a bit like that.

585
00:36:45,960 --> 00:36:52,560
You can create your own GPTs that are primed to do a particular task and you can actually,

586
00:36:52,560 --> 00:37:00,720
in the ChatGPT window, on both desktop and on mobile, you can now have them saved in

587
00:37:00,720 --> 00:37:03,480
the sidebar for quick access.

588
00:37:03,480 --> 00:37:09,920
So if you're, for instance, putting together a podcast and you want to have a GPT that

589
00:37:09,920 --> 00:37:20,120
is specialized at creating show notes or writing social media content and giving you a promotion

590
00:37:20,120 --> 00:37:25,840
strategy for each episode, you could have that there all the time, stick in your rough

591
00:37:25,840 --> 00:37:32,420
notes from the show and then it will give you nice formatted show notes and a few tweets

592
00:37:32,420 --> 00:37:35,400
and what have you.

593
00:37:35,400 --> 00:37:38,160
And there's a whole app playground as well.

594
00:37:38,160 --> 00:37:39,400
Not app playground, sorry.

595
00:37:39,400 --> 00:37:44,920
There's a whole app store where you can go and browse public GPTs.

596
00:37:44,920 --> 00:37:49,120
Yeah, it's going to be, I think when this was all launched, thank you for that.

597
00:37:49,120 --> 00:37:50,120
There you are listeners.

598
00:37:50,120 --> 00:37:51,760
Now you know what GPTs are.

599
00:37:51,760 --> 00:37:54,720
When this was launched, I think there was a lot of excitement.

600
00:37:54,720 --> 00:37:56,760
There was going to be a GPT store.

601
00:37:56,760 --> 00:38:01,320
There was loads of people on social going, this is where the next millionaires are being

602
00:38:01,320 --> 00:38:02,320
made.

603
00:38:02,320 --> 00:38:03,320
Like go build your GPT.

604
00:38:03,320 --> 00:38:04,520
That's the bit that I didn't mention.

605
00:38:04,520 --> 00:38:05,800
It was the revenue share.

606
00:38:05,800 --> 00:38:09,360
They announced that at some point in the near future, there will be a revenue share for

607
00:38:09,360 --> 00:38:12,520
the most popular used GPTs.

608
00:38:12,520 --> 00:38:13,520
Yeah.

609
00:38:13,520 --> 00:38:16,360
And I think all of that makes sense.

610
00:38:16,360 --> 00:38:18,280
And people were like, Oh, we've got to get in there.

611
00:38:18,280 --> 00:38:20,840
Got to build a GPT.

612
00:38:20,840 --> 00:38:22,160
The store's not live yet.

613
00:38:22,160 --> 00:38:26,240
At the moment you can access other people's GPTs because when you build one, it can be

614
00:38:26,240 --> 00:38:27,640
just for you.

615
00:38:27,640 --> 00:38:31,200
It can be anybody with the link or it can be like super public.

616
00:38:31,200 --> 00:38:32,200
Right.

617
00:38:32,200 --> 00:38:37,440
So at the moment you can share that, you can go play with other people's GPTs, but there's

618
00:38:37,440 --> 00:38:40,120
no store or anything to be able to go searching for GPTs.

619
00:38:40,120 --> 00:38:44,400
At least not on ChatGPT, although of course those things have already sprung up.

620
00:38:44,400 --> 00:38:47,360
And I think what people have learned, I don't know what you think Martin, but what I've

621
00:38:47,360 --> 00:38:54,560
seen people talking about a lot is they mostly act a bit like ChatGPT and you can sort of

622
00:38:54,560 --> 00:39:00,160
train them on your own materials and stuff, but it doesn't always work that well.

623
00:39:00,160 --> 00:39:04,560
The function calling is kind of cool, but it appears to be quite buggy.

624
00:39:04,560 --> 00:39:07,600
I use the Zapier GPT.

625
00:39:07,600 --> 00:39:08,600
We talked about this before.

626
00:39:08,600 --> 00:39:11,560
There's an example, which is like a calendar plugin.

627
00:39:11,560 --> 00:39:13,560
It's like calendar GPT.

628
00:39:13,560 --> 00:39:15,800
And you say things to it like, what's on my schedule tomorrow?

629
00:39:15,800 --> 00:39:16,800
Made it up.

630
00:39:16,800 --> 00:39:19,400
I asked what was on my schedule tomorrow, and it pulled a load.

631
00:39:19,400 --> 00:39:22,320
It did pull a load of things from my schedule from a day three months ago.

632
00:39:22,320 --> 00:39:23,880
I said, I think you might be wrong.

633
00:39:23,880 --> 00:39:26,520
They're like, yeah, we are wrong.

634
00:39:26,520 --> 00:39:28,040
In fact, your schedule is empty.

635
00:39:28,040 --> 00:39:29,720
And I'm like, no, I've got four meetings tomorrow.

636
00:39:29,720 --> 00:39:31,440
And they're like, yeah, you don't.

637
00:39:31,440 --> 00:39:32,440
Your schedule is empty.

638
00:39:32,440 --> 00:39:37,040
And I'm like, I think I know what's in my calendar, having looked at it.

639
00:39:37,040 --> 00:39:40,560
So and getting it set up was really buggy.

640
00:39:40,560 --> 00:39:46,040
So I think there's an element of flatter to deceive here again in that it's got a huge

641
00:39:46,040 --> 00:39:51,340
potential, but how to really realize that, we're not there yet.

642
00:39:51,340 --> 00:39:57,120
So in terms of like the danger of this being another plugin store debacle was like when

643
00:39:57,120 --> 00:40:01,160
plugins came to ChatGPT, just like, oh my goodness, the plugin universe, make

644
00:40:01,160 --> 00:40:04,640
your next million building plugins on the plugin store.

645
00:40:04,640 --> 00:40:08,640
And now this is the new version of that. The plugin store sucked for a number of reasons.

646
00:40:08,640 --> 00:40:14,360
And I do worry that GPTs and GPT store is not where it needs to be to really take off.

647
00:40:14,360 --> 00:40:19,960
That being said, we have seen some pretty cool initial examples, haven't we Martin?

648
00:40:19,960 --> 00:40:20,960
Yeah.

649
00:40:20,960 --> 00:40:29,000
I think you sent the convert anything example was amazing and it's a GPT where you can upload

650
00:40:29,000 --> 00:40:30,160
a file.

651
00:40:30,160 --> 00:40:36,480
So let's say an image PNG and you can ask it to convert that to a JPEG.

652
00:40:36,480 --> 00:40:40,000
But it won't just do that with that kind of simple task.

653
00:40:40,000 --> 00:40:44,360
It will do that with anything and you can throw pretty much any file type into it and

654
00:40:44,360 --> 00:40:50,480
it will have a crack at turning from one format into another.

655
00:40:50,480 --> 00:40:51,760
It did fail slightly.

656
00:40:51,760 --> 00:40:59,120
I threw in a screenshot of a, it was a text based table with some information in it and

657
00:40:59,120 --> 00:41:01,880
it was just a PNG.

658
00:41:01,880 --> 00:41:09,060
And I threw that into it and said, turn this into a Word doc.

659
00:41:09,060 --> 00:41:16,920
And it tried to use OCR to extract the text, but it failed and then just gave me a Word

660
00:41:16,920 --> 00:41:21,760
doc with the image in it, just embedded in the Word doc.

661
00:41:21,760 --> 00:41:27,520
So it's, you know, it kind of did it sort of.

662
00:41:27,520 --> 00:41:32,520
I'm even surprised it did as well there as it did honestly.

663
00:41:32,520 --> 00:41:36,880
When I started playing with it and I sent it to you, I was like, oh cool, PNG to JPEG.

664
00:41:36,880 --> 00:41:39,880
I was like, oh, M4A to MP3.

665
00:41:39,880 --> 00:41:44,000
Like probably I'd have to download a piece of software or use some online tool to do

666
00:41:44,000 --> 00:41:45,000
that.

667
00:41:45,000 --> 00:41:47,160
And I was like, oh, one thing that I can just go convert this to this.

668
00:41:47,160 --> 00:41:50,680
And that seemed kind of, kind of interesting.

669
00:41:50,680 --> 00:41:57,400
So that was, I think, well, I assume this has all been driven by Code Interpreter, advanced

670
00:41:57,400 --> 00:42:01,000
data analysis, Python stuff, being able to do some of this in the background.

671
00:42:01,000 --> 00:42:06,240
I don't know if it's doing custom function calling or not, but it was definitely an example

672
00:42:06,240 --> 00:42:11,360
of something actually usable that doesn't break that often that I could see myself using

673
00:42:11,360 --> 00:42:12,760
day to day.

674
00:42:12,760 --> 00:42:18,920
I think a lot of the other examples I've seen are at best, they're kind of skins of

675
00:42:18,920 --> 00:42:22,760
ChatGPT, a little bit like what we've seen emerge in a number of other areas anyway.

676
00:42:22,760 --> 00:42:32,000
Like there's like a contract writer GPT and a chef GPT.

677
00:42:32,000 --> 00:42:37,640
I heard Paul Roetzer from the Marketing AI Institute talking about potentially creating

678
00:42:37,640 --> 00:42:43,160
a GPT that will produce your AI policy for you to go on your website based on asking

679
00:42:43,160 --> 00:42:47,620
you a bunch of questions about how you intend to use AI and then writing up your policy

680
00:42:47,620 --> 00:42:48,620
for you.

681
00:42:48,620 --> 00:42:52,920
And I think some of those could be really useful and some of them are just not going

682
00:42:52,920 --> 00:42:57,000
to be that useful because they're just basically, you could probably ask ChatGPT as is without

683
00:42:57,000 --> 00:42:59,480
having to have a custom GPT.

684
00:42:59,480 --> 00:43:04,280
I'm definitely finding that there are probably opportunities for me to create GPTs for me

685
00:43:04,280 --> 00:43:08,760
or my team that are based on stuff that we have to do that probably no one else would

686
00:43:08,760 --> 00:43:10,960
be that interested in, but it's useful.

687
00:43:10,960 --> 00:43:16,160
As an example, before this podcast, we transitioned some of the prompts we used to help us write

688
00:43:16,160 --> 00:43:21,680
the scripts for these podcasts to create a quick GPT that does it, which we can now share,

689
00:43:21,680 --> 00:43:25,440
which is like a shared tool bot thing that we can throw stories at and it will write

690
00:43:25,440 --> 00:43:26,440
scripts for us.

691
00:43:26,440 --> 00:43:29,240
So I can see some definite applications there.

692
00:43:29,240 --> 00:43:34,280
I'm still struggling to see what applications people are going to make that are going to

693
00:43:34,280 --> 00:43:39,880
be like number one on the app store, 1 million uses a day.

694
00:43:39,880 --> 00:43:50,560
The Developer Day example where they did a deep dive into the product was pretty interesting,

695
00:43:50,560 --> 00:43:54,840
just novel in terms of how they plugged it all together.

696
00:43:54,840 --> 00:44:01,720
And what they'd done is they'd connected it to Spotify and asked it to basically be a

697
00:44:01,720 --> 00:44:09,840
kind of playlist creator based on a mood and a vibe that you would input.

698
00:44:09,840 --> 00:44:11,600
And they took a photo of themselves.

699
00:44:11,600 --> 00:44:12,760
It was two developers on stage.

700
00:44:12,760 --> 00:44:18,920
They took a photo of themselves kind of high-fiving or whatever it was and then said, what's this

701
00:44:18,920 --> 00:44:19,920
mood?

702
00:44:19,920 --> 00:44:23,560
And it came up with a vibe and said, based on this photo.

703
00:44:23,560 --> 00:44:28,400
So it used GPT-4 with vision to interpret that this is the kind of playlist that I would

704
00:44:28,400 --> 00:44:29,400
put together.

705
00:44:29,400 --> 00:44:31,200
And then they said, great, put that together.

706
00:44:31,200 --> 00:44:36,040
And it goes on Spotify and kind of pulls together this playlist.

707
00:44:36,040 --> 00:44:40,280
They actually give it the name of an artist to put into the playlist as well.

708
00:44:40,280 --> 00:44:41,960
And it does that.

709
00:44:41,960 --> 00:44:44,480
And then they say, now play it.

710
00:44:44,480 --> 00:44:46,520
And it plays it in their Spotify account.

711
00:44:46,520 --> 00:44:51,280
It creates the playlist in their Spotify account and starts playing it.

712
00:44:51,280 --> 00:44:55,680
It also creates a playlist album cover using DALL-E.

713
00:44:55,680 --> 00:44:57,640
So that goes into it as well.

714
00:44:57,640 --> 00:45:06,200
And then they'd connected it to their Philips Hue lighting and said, set the scene.

715
00:45:06,200 --> 00:45:11,040
And they triggered the lighting for the playlist in the living room as well.

716
00:45:11,040 --> 00:45:17,960
So all through just chatting with ChatGPT, they created a custom playlist, got it playing

717
00:45:17,960 --> 00:45:25,480
on their speakers and then had a custom Hue lighting mood setting as well.

718
00:45:25,480 --> 00:45:28,000
Yeah, that's quite cool.

719
00:45:28,000 --> 00:45:33,680
That's worth stressing that it's quite easy to build a GPT because ChatGPT guides you

720
00:45:33,680 --> 00:45:34,680
through the process.

721
00:45:34,680 --> 00:45:37,680
So you kind of tell it in natural language what you want.

722
00:45:37,680 --> 00:45:41,920
And then it tries to figure out how to make that work for you, which is brilliant, but

723
00:45:41,920 --> 00:45:46,200
it doesn't help at all for custom actions, which is still really complicated API type

724
00:45:46,200 --> 00:45:50,640
calls, which I'm sure it could help with, but it's certainly not easy to do out the

725
00:45:50,640 --> 00:45:52,520
box.

726
00:45:52,520 --> 00:45:57,720
The other thing reflecting on that Spotify example is, it's kind of cool, but what problem

727
00:45:57,720 --> 00:45:59,200
does that solve?

728
00:45:59,200 --> 00:46:03,440
I wasn't wandering around going, if my Spotify playlist would just see that everyone in the

729
00:46:03,440 --> 00:46:07,400
room is a bit depressed and play the most depressing songs it can find and then put

730
00:46:07,400 --> 00:46:09,280
a really depressing album cover on it.

731
00:46:09,280 --> 00:46:11,000
My life would be complete.

732
00:46:11,000 --> 00:46:12,520
It's like, no.

733
00:46:12,520 --> 00:46:17,000
What are the problems that we want to solve with this stuff?

734
00:46:17,000 --> 00:46:21,040
That was one of the...

735
00:46:21,040 --> 00:46:23,720
In their defense, they said, look, this is quite kind of frivolous.

736
00:46:23,720 --> 00:46:27,120
It's not, you know, this isn't a serious thing, but we want to show you the potential of how

737
00:46:27,120 --> 00:46:32,040
you can link things together and what have you.

738
00:46:32,040 --> 00:46:37,880
I think that is one of the big challenges here is that actually finding what is the

739
00:46:37,880 --> 00:46:39,400
problem that they're solving.

740
00:46:39,400 --> 00:46:44,240
There was another session, if you watch the breakout sessions from Developer Day, they

741
00:46:44,240 --> 00:46:48,320
talk about how they think about product and research.

742
00:46:48,320 --> 00:46:55,920
And on the product side, they said that the product manager from ChatGPT would say, normally

743
00:46:55,920 --> 00:46:58,640
you would talk about what problem we're trying to solve.

744
00:46:58,640 --> 00:47:01,920
We don't really do that with ChatGPT.

745
00:47:01,920 --> 00:47:03,160
We don't start with a problem.

746
00:47:03,160 --> 00:47:09,360
We just kind of start with what can the model do and then just try and make that accessible

747
00:47:09,360 --> 00:47:15,520
in a way that is useful to people because people are still figuring this out.

748
00:47:15,520 --> 00:47:23,000
So I think there's a tacit acknowledgement from the ChatGPT team where they're saying, we're

749
00:47:23,000 --> 00:47:24,320
not really solving for a problem.

750
00:47:24,320 --> 00:47:29,080
We're putting together a bunch of building blocks and throwing it over to you.

751
00:47:29,080 --> 00:47:36,120
I think that's fine because in essence, they're the platform and our job as users or would-be,

752
00:47:36,120 --> 00:47:40,480
I'm going to put in inverted commas, developers, like maybe proper developers, but also because

753
00:47:40,480 --> 00:47:47,680
you don't need to be a developer necessarily to build a GPT, any old folk to go, I wish

754
00:47:47,680 --> 00:47:50,480
I need this problem solved.

755
00:47:50,480 --> 00:47:54,320
I'm going to figure out a way ChatGPT can do it for me.

756
00:47:54,320 --> 00:47:59,320
And I think to bring us back full circle, I think that's why the most obvious things

757
00:47:59,320 --> 00:48:05,440
I'm seeing at the moment is me and my team looking at a thing we'd want to be able to

758
00:48:05,440 --> 00:48:13,480
do on a reasonably frequent basis and then going, can we create a GPT for that, that

759
00:48:13,480 --> 00:48:20,800
halves the amount of time it takes or gets us halfway into the process instantly because

760
00:48:20,800 --> 00:48:23,960
it can do X, Y, Z.

761
00:48:23,960 --> 00:48:29,360
For example, I could imagine creating a BioStrata blog GPT that knows the style of our blog

762
00:48:29,360 --> 00:48:35,600
posts because it's been fed 20 of them so it knows how to write in our style with maybe

763
00:48:35,600 --> 00:48:40,240
a set of instructions that are probably true for all blog posts, like the type of length

764
00:48:40,240 --> 00:48:45,880
we want them to be and maybe even a process where we always want an outline first that

765
00:48:45,880 --> 00:48:48,880
we review and approve and then it writes the blog.

766
00:48:48,880 --> 00:48:53,720
That would save quite a lot of time internally, but it'd be very specific to BioStrata.

767
00:48:53,720 --> 00:48:56,720
No one else wants their blog to be in BioStrata style.

768
00:48:56,720 --> 00:49:01,160
But that would be really helpful and I could absolutely imagine that.

769
00:49:01,160 --> 00:49:09,840
We should definitely set up a transcript to show notes GPT or even a podcast processing

770
00:49:09,840 --> 00:49:15,040
GPT for our podcast just to make it easier to go, to not have to prompt, to just copy

771
00:49:15,040 --> 00:49:20,920
paste the transcript and it already knows it's going to give three outputs, show notes,

772
00:49:20,920 --> 00:49:25,280
social media posts and I don't know, whatever, like the email that we want to send out.

773
00:49:25,280 --> 00:49:29,000
And then you could probably get to the point through function calling or Zapier, you might

774
00:49:29,000 --> 00:49:33,320
be able to connect it to enough tools that actually pushes some of that material straight

775
00:49:33,320 --> 00:49:35,240
away where it needs to go.

776
00:49:35,240 --> 00:49:37,640
I think then it starts to get quite cool.

777
00:49:37,640 --> 00:49:42,520
I guess if you can think of applications that you have that other people might need, but

778
00:49:42,520 --> 00:49:45,720
that would be quite hard for them to build themselves, you might be able to get yourself

779
00:49:45,720 --> 00:49:50,160
up on the top of that GPT leaderboard when it launches.

780
00:49:50,160 --> 00:50:00,440
Well, that leads to an interesting segue onto the next story, which is a bunch of announcements

781
00:50:00,440 --> 00:50:06,120
from Microsoft this week at their Ignite conference.

782
00:50:06,120 --> 00:50:13,720
So they announced a bunch of updates specifically to Microsoft Copilot and one of the things

783
00:50:13,720 --> 00:50:20,920
that I think is really worth zoning in on with this is a new tool called Copilot Studio.

784
00:50:20,920 --> 00:50:29,200
Now we can talk about the wider announcement in a moment, but Microsoft Copilot Studio

785
00:50:29,200 --> 00:50:38,080
is a Zapier-like interface, a drag-and-drop interface, a no-code customization solution

786
00:50:38,080 --> 00:50:48,800
basically that allows people to build their own Copilot integrations across the enterprise.

787
00:50:48,800 --> 00:50:54,480
And it's a really interesting product launch from Microsoft because all of this is powered

788
00:50:54,480 --> 00:50:59,280
by GPT-4 under the hood.

789
00:50:59,280 --> 00:51:03,680
And if I just give you a quick summary of how it actually works, you start off with

790
00:51:03,680 --> 00:51:11,560
an inference, a user input into Copilot, your prompt basically, and then there is this orchestration

791
00:51:11,560 --> 00:51:18,320
layer that Microsoft has built, which tries to interpret what you've asked and tries to

792
00:51:18,320 --> 00:51:23,280
figure out what it needs to do, what action it needs to trigger basically, where does

793
00:51:23,280 --> 00:51:30,080
it need to route the query, whether it's the HR system, the finance system, the CRM, whatever

794
00:51:30,080 --> 00:51:35,920
it may be, then the query is routed to that particular place, HR, CRM, whatever it

795
00:51:35,920 --> 00:51:44,780
is, and collects the relevant data, it's compiled and then pushed back into GPT-4, which does

796
00:51:44,780 --> 00:51:50,440
what GPT-4 does and gives you a response in natural language.

797
00:51:50,440 --> 00:51:59,420
It integrates with over 1100 products, including Adobe, Zendesk, SAP and a bunch of other tools

798
00:51:59,420 --> 00:52:02,840
as well, real enterprise level tech.

799
00:52:02,840 --> 00:52:11,920
However, when watching the product announcement for this, I couldn't help but think that there

800
00:52:11,920 --> 00:52:18,620
is a whole new category of job which is going to be required, which takes me back to why

801
00:52:18,620 --> 00:52:23,040
I think this is an interesting segue from the previous story.

802
00:52:23,040 --> 00:52:29,500
And that is what you were describing with GPTs and being able to create all of these

803
00:52:29,500 --> 00:52:33,140
interesting tools with function calling and what have you, requires a certain element

804
00:52:33,140 --> 00:52:35,620
of AI architecture.

805
00:52:35,620 --> 00:52:43,280
You need to be able to think about system builds and how you connect knowledge, instructions

806
00:52:43,280 --> 00:52:48,720
and actions to enable you to create a good GPT.

807
00:52:48,720 --> 00:52:53,160
And watching this Copilot presentation, I was thinking the same thing.

808
00:52:53,160 --> 00:53:00,360
We are on the verge of a whole new category of job role coming out, which I think is AI

809
00:53:00,360 --> 00:53:10,180
agent or AI assistant architect, where effectively you are designing, managing and building these

810
00:53:10,180 --> 00:53:14,360
AI powered assistants and connecting them all together.

811
00:53:14,360 --> 00:53:23,300
Because doing that role needs such a strong knowledge of, well, there's a technical side.

812
00:53:23,300 --> 00:53:27,320
You need to understand functions and APIs and that sort of thing.

813
00:53:27,320 --> 00:53:31,160
You've got to be something of a prompt engineer as well.

814
00:53:31,160 --> 00:53:36,040
And then you've got to be a good systems designer, understanding user interface.

815
00:53:36,040 --> 00:53:42,500
There's a whole lot that goes into creating something that's good and functional.

816
00:53:42,500 --> 00:53:49,840
And that's exactly, even though Copilot Studio is a no-code, ostensibly, it's a no-code solution,

817
00:53:49,840 --> 00:53:50,840
right?

818
00:53:50,840 --> 00:53:57,200
You're not going to be able to create anything good with it without being incredibly technical.

819
00:53:57,200 --> 00:54:02,640
As it happens, I think if you're a Microsoft reseller or an IT managed service provider,

820
00:54:02,640 --> 00:54:08,300
if you're not looking at this as the next revenue stream, as a big part of your ongoing

821
00:54:08,300 --> 00:54:14,160
consultancy going forward, you should be because companies are going to need this from you

822
00:54:14,160 --> 00:54:15,160
without doubt.

823
00:54:15,160 --> 00:54:21,080
But yeah, I was thinking, I couldn't help but watch this product launch, think about

824
00:54:21,080 --> 00:54:24,400
GPTs and think this is a whole new category of job role.

825
00:54:24,400 --> 00:54:25,400
I agree.

826
00:54:25,400 --> 00:54:31,920
I know the hope is that prompt engineering and systems thinking will be less required

827
00:54:31,920 --> 00:54:38,000
because the language models, the bots we're interfacing with, will do that thinking for

828
00:54:38,000 --> 00:54:41,080
us and maybe ask us the right questions and then we'll answer them.

829
00:54:41,080 --> 00:54:43,000
And then it will think it through for us.

830
00:54:43,000 --> 00:54:44,880
But I think we're a way from that.

831
00:54:44,880 --> 00:54:49,680
And I really agree with your assessment because we were playing with building tools, right?

832
00:54:49,680 --> 00:54:51,360
You built the text to speech.

833
00:54:51,360 --> 00:54:58,600
And so I went and tried to see how long would it take me to build a web-based tool for transcription

834
00:54:58,600 --> 00:54:59,600
just using Whisper.

835
00:54:59,600 --> 00:55:01,400
And it took me about two and a half hours.

836
00:55:01,400 --> 00:55:06,360
But when I was doing it, I really reflected on, I use Bubble, and Bubble is a no-code stroke

837
00:55:06,360 --> 00:55:10,160
low-code tool, and for the tool I was building it was a no-code tool.

838
00:55:10,160 --> 00:55:13,640
But the way it works is still how software works, right?

839
00:55:13,640 --> 00:55:17,720
Databases for information, pulling things in and out, taking certain actions based on

840
00:55:17,720 --> 00:55:18,720
clicks.

841
00:55:18,720 --> 00:55:24,120
So you really have to be able to think in terms of system processes, moving data around

842
00:55:24,120 --> 00:55:26,040
to different places.

843
00:55:26,040 --> 00:55:28,640
And I think if your brain works that way, it's probably quite intuitive.

844
00:55:28,640 --> 00:55:32,760
But if it doesn't, it's really, really hard.

845
00:55:32,760 --> 00:55:37,680
Like I spent, I could have built that tool in half an hour if I really knew what I was

846
00:55:37,680 --> 00:55:38,680
doing.

847
00:55:38,680 --> 00:55:43,080
I spent 45 minutes just trying to get it to show the transcript because it wasn't obvious

848
00:55:43,080 --> 00:55:48,560
to me how to pull this transcript out of the database I'd set up and have it display in

849
00:55:48,560 --> 00:55:52,920
a certain text box that I'd built, which should have been so easy.

850
00:55:52,920 --> 00:55:59,120
But you just need to know how computers, how software works, how, in inverted commas,

851
00:55:59,120 --> 00:56:01,840
how computers think moving information around.

852
00:56:01,840 --> 00:56:03,520
So I think you're absolutely right.

853
00:56:03,520 --> 00:56:07,720
I think we'll see the AI architecture version of that for sure.

854
00:56:07,720 --> 00:56:09,440
Do you know what my favorite thing is?

855
00:56:09,440 --> 00:56:11,800
This shows the difference between you and me.

856
00:56:11,800 --> 00:56:22,040
My favorite thing about the Microsoft announcement was they've got this HeyGen-like avatar tool

857
00:56:22,040 --> 00:56:23,360
now.

858
00:56:23,360 --> 00:56:31,800
So basically it's a speaking moving version created based on, I think just based on a

859
00:56:31,800 --> 00:56:32,800
single image.

860
00:56:32,800 --> 00:56:34,120
I'd have to dive a bit deeper into that.

861
00:56:34,120 --> 00:56:39,120
And it's just, you know, it's not available yet, but basically you can create these avatars

862
00:56:39,120 --> 00:56:44,720
that speak and move using Azure's AI Speech text-to-speech avatar feature, which is powered by

863
00:56:44,720 --> 00:56:48,000
OpenAI, as Martin was saying.

864
00:56:48,000 --> 00:56:52,280
And you can imagine creating videos for your website where you give it an image, you write

865
00:56:52,280 --> 00:56:56,480
a script, and then it's you speaking there in the video, but it's not really you.

866
00:56:56,480 --> 00:57:01,280
You could imagine doing prospecting outreach where you send individual videos to each prospect,

867
00:57:01,280 --> 00:57:03,080
they're not really you either.

868
00:57:03,080 --> 00:57:06,040
They're all, like, basically deepfakes of you.

869
00:57:06,040 --> 00:57:11,800
And we saw an interesting version of HeyGen that offers this type of service as well.

870
00:57:11,800 --> 00:57:12,960
And HeyGen's are better.

871
00:57:12,960 --> 00:57:18,040
So if you want to go search this stuff, H-E-Y-G-E-N avatars, go look those up.

872
00:57:18,040 --> 00:57:20,000
Obviously look up the Microsoft version as well.

873
00:57:20,000 --> 00:57:24,880
But what I took away from it, Martin, is now there's not just one game in town.

874
00:57:24,880 --> 00:57:28,760
And whenever there's not one game in town, the speed that everybody feels they need to

875
00:57:28,760 --> 00:57:32,640
move at to improve their products increases exponentially.

876
00:57:32,640 --> 00:57:40,200
And so again, I think we're getting so close to a moment when those deepfake-driven avatars

877
00:57:40,200 --> 00:57:45,080
are being used in so many customer service, marketing and sales use cases.

878
00:57:45,080 --> 00:57:47,800
And they'll look pretty good and be somewhat reliable.

879
00:57:47,800 --> 00:57:50,200
And then soon they'll be indistinguishable.

880
00:57:50,200 --> 00:57:51,200
Yeah.

881
00:57:51,200 --> 00:57:52,200
And what does that mean?

882
00:57:52,200 --> 00:57:56,160
There was a story this week actually that said AI researchers are now at the point where

883
00:57:56,160 --> 00:58:01,240
they're struggling to differentiate real from fake.

884
00:58:01,240 --> 00:58:06,880
And that should give us all pause for thought because what does that mean?

885
00:58:06,880 --> 00:58:12,000
Like as humans, do we end up in the point where we're like, wow, so the only thing I

886
00:58:12,000 --> 00:58:16,160
can really trust is my own eyes.

887
00:58:16,160 --> 00:58:20,080
And there are loads of things about all the cognitive biases and stuff that we have that

888
00:58:20,080 --> 00:58:22,200
mean you probably can't trust your own eyes either.

889
00:58:22,200 --> 00:58:25,920
But at least when you're consuming media online, what videos will be real?

890
00:58:25,920 --> 00:58:27,320
What images will be real?

891
00:58:27,320 --> 00:58:29,920
What text would have been created by humans and whatnot?

892
00:58:29,920 --> 00:58:32,720
Like that is going to be a different world to be in, isn't it?

893
00:58:32,720 --> 00:58:33,720
It is.

894
00:58:33,720 --> 00:58:38,760
And I feel like I might be skipping ahead in some of the stories, but I think this is a

895
00:58:38,760 --> 00:58:44,920
relevant point to jump in to talk about YouTube's responsible AI evolution.

896
00:58:44,920 --> 00:58:46,840
Let's do it.

897
00:58:46,840 --> 00:58:48,160
We got about 10 minutes more.

898
00:58:48,160 --> 00:58:50,360
We got to respect our audience's time, Martin.

899
00:58:50,360 --> 00:58:55,040
So we're going to do as many stories as we can in the next 10 minutes and we're going

900
00:58:55,040 --> 00:58:56,320
to start with YouTube.

901
00:58:56,320 --> 00:58:57,320
Right.

902
00:58:57,320 --> 00:59:01,600
So YouTube is taking a measured approach against generative AI.

903
00:59:01,600 --> 00:59:06,600
Obviously lots of people are now submitting content to YouTube that has some generative

904
00:59:06,600 --> 00:59:12,280
AI in it, whether it's audio, images, video, what have you.

905
00:59:12,280 --> 00:59:20,200
So they're introducing plans to enforce mandatory disclosures basically from creators that are

906
00:59:20,200 --> 00:59:27,000
using AI, particularly where AI has been used to generate realistic content.

907
00:59:27,000 --> 00:59:32,280
So this will involve a labeling system and it will involve alerting viewers to synthetic

908
00:59:32,280 --> 00:59:38,960
content, particularly around sensitive topics such as health and elections.

909
00:59:38,960 --> 00:59:42,580
So just going to rattle through a summary of the announcements really.

910
00:59:42,580 --> 00:59:46,560
So they've got disclosure requirements, which we've already mentioned.

911
00:59:46,560 --> 00:59:53,380
YouTube will add labels to the content, letting people know that it's altered or synthetic.

912
00:59:53,380 --> 00:59:58,640
There is also a removal request process where individuals can request the removal of AI

913
00:59:58,640 --> 01:00:03,600
generated content that uses their likeness without consent.

914
01:00:03,600 --> 01:00:08,440
There is some music content regulation where music partners can request the removal of

915
01:00:08,440 --> 01:00:13,000
AI generated music that mimics an artist's voice.

916
01:00:13,000 --> 01:00:19,000
YouTube will continue to enhance AI driven content moderation to detect and address policy

917
01:00:19,000 --> 01:00:24,280
violations and they're introducing some adversarial testing.

918
01:00:24,280 --> 01:00:32,000
So they're going to conduct adversarial testing to anticipate and prevent misuse of AI tools.

919
01:00:32,000 --> 01:00:38,680
So this is part of their ongoing investment into AI moderation and they're introducing

920
01:00:38,680 --> 01:00:44,600
some, they're basically playing AIs off against each other.

921
01:00:44,600 --> 01:00:50,120
They're rolling out, if you are a creator, they're going to be rolling out some education

922
01:00:50,120 --> 01:00:56,160
to educate the creators about these requirements, like what are these disclosures?

923
01:00:56,160 --> 01:01:00,360
And I think this is going to have an impact on marketers, right?

924
01:01:00,360 --> 01:01:06,880
Because lots of marketers will be using AI for voiceovers, for video, for images.

925
01:01:06,880 --> 01:01:12,400
And I wonder at what level the disclosures will kind of settle, where we'll have to

926
01:01:12,400 --> 01:01:15,380
say what is and isn't disclosed.

927
01:01:15,380 --> 01:01:23,920
If I use an AI voiceover on a corporate video, maybe like a, just a product video and I've

928
01:01:23,920 --> 01:01:30,280
used an AI voiceover, that doesn't seem particularly risky or harmful to people if I use

929
01:01:30,280 --> 01:01:31,880
an ElevenLabs voice.

930
01:01:31,880 --> 01:01:34,320
Do I need to disclose that?

931
01:01:34,320 --> 01:01:35,320
Seems unlikely.

932
01:01:35,320 --> 01:01:41,160
So, but obviously if I'm doing something with, you know, healthcare or elections and I'm

933
01:01:41,160 --> 01:01:46,960
using AI generated images and voices, then it would make sense as a much greater risk

934
01:01:46,960 --> 01:01:47,960
to society.

935
01:01:47,960 --> 01:01:53,880
But yeah, that's a new disclosure requirement that's coming out from YouTube.

936
01:01:53,880 --> 01:01:58,760
Yeah, I know that's going to be extremely complex.

937
01:01:58,760 --> 01:02:01,480
I'd be interested to see how that gets administered, like you said.

938
01:02:01,480 --> 01:02:04,840
Right, we're going to go through, we're rapid firing it, Martin.

939
01:02:04,840 --> 01:02:07,560
I've got a couple of stories I'm going to do and then if you can crack through a couple

940
01:02:07,560 --> 01:02:08,560
of stories.

941
01:02:08,560 --> 01:02:13,720
So first one for me is Google postpones the launch of Gemini AI.

942
01:02:13,720 --> 01:02:17,360
So for those of you who listen to the podcast, you'll know that we're excited about Google's

943
01:02:17,360 --> 01:02:23,240
Gemini because it's trying to be the main competition against GPT-4 from OpenAI.

944
01:02:23,240 --> 01:02:27,720
And I think the thing we're most excited about is so many people launch models, but

945
01:02:27,720 --> 01:02:29,640
none of them get near GPT-4.

946
01:02:29,640 --> 01:02:33,240
And so the, Martin has a few use cases he loves for Claude.

947
01:02:33,240 --> 01:02:38,320
I love Claude as well, but in general, the workhorse is GPT-4 and ChatGPT.

948
01:02:38,320 --> 01:02:44,240
What we need is another GPT-4 level model to continue to drive development and innovation

949
01:02:44,240 --> 01:02:45,980
in this area.

950
01:02:45,980 --> 01:02:50,300
So we thought Gemini would come out the end of this year.

951
01:02:50,300 --> 01:02:56,320
It's now been delayed and the reports are this is because it's not as good as GPT-4

952
01:02:56,320 --> 01:03:01,760
yet, which would be a significant issue because you'd have to ask why launch it if it's not.

953
01:03:01,760 --> 01:03:05,160
And I think that's the question they asked internally and that's why they didn't launch

954
01:03:05,160 --> 01:03:06,160
it.

955
01:03:06,160 --> 01:03:09,200
So then the question is really how long are we going to have to wait?

956
01:03:09,200 --> 01:03:11,480
Because what's it going to take to make those improvements?

957
01:03:11,480 --> 01:03:12,480
No one knows.

958
01:03:12,480 --> 01:03:16,040
So if you're waiting on Gemini like we are, and you were excited about it, be a bit less

959
01:03:16,040 --> 01:03:18,700
excited because it's going to take a little while to come.

960
01:03:18,700 --> 01:03:23,240
The other story I was going to talk through was just quickly Google's Search Generative

961
01:03:23,240 --> 01:03:26,720
Experience, which we've talked about a few times, which has been available in the US,

962
01:03:26,720 --> 01:03:30,560
is now being rolled out to 120 other countries.

963
01:03:30,560 --> 01:03:32,660
Why is this significant for marketers?

964
01:03:32,660 --> 01:03:37,600
Because we've been talking a little bit about what does the SGE, Search Generative Experience,

965
01:03:37,600 --> 01:03:40,340
mean for marketers doing SEO?

966
01:03:40,340 --> 01:03:45,660
And if Google is answering questions and providing insights and making product recommendations

967
01:03:45,660 --> 01:03:50,880
right at the top of the screen through the SGE, rather than pushing people towards organic

968
01:03:50,880 --> 01:03:53,960
links, how's that going to change how we do SEO?

969
01:03:53,960 --> 01:03:59,160
The fact that it's now been rolled out to 120 other countries tells us this is coming

970
01:03:59,160 --> 01:04:02,640
mainstream and we're going to have to deal with this sooner or later and we're going

971
01:04:02,640 --> 01:04:06,680
to have to really see how this impacts the performance of our websites and our SEO and

972
01:04:06,680 --> 01:04:09,080
our content marketing efforts.

973
01:04:09,080 --> 01:04:10,080
Martin?

974
01:04:10,080 --> 01:04:16,320
Sticking with Google and their announcements,

975
01:04:16,320 --> 01:04:21,920
they have through YouTube actually announced Melodic AI and this was a blog post that the

976
01:04:21,920 --> 01:04:27,560
YouTube team have published, which is a collaboration with YouTube and Google DeepMind and they've

977
01:04:27,560 --> 01:04:33,040
unveiled an array of AI driven music experiments and they are pretty cool.

978
01:04:33,040 --> 01:04:38,320
I would recommend people go onto the blog and listen to some of the tracks.

979
01:04:38,320 --> 01:04:45,320
They've created an AI where you can basically hum a tune and then with a written prompt

980
01:04:45,320 --> 01:04:46,960
describe what you want that to do.

981
01:04:46,960 --> 01:04:53,520
So you can turn it into a saxophone solo and it will turn your hummed tune into a saxophone

982
01:04:53,520 --> 01:04:55,440
solo and you can stick that into a track.

983
01:04:55,440 --> 01:04:59,240
So you can imagine what people are going to be able to do if you're a musician or you're

984
01:04:59,240 --> 01:05:05,120
a producer, you're going to be able to create tracks by tapping on a table and turning it into

985
01:05:05,120 --> 01:05:08,680
proper drum solos and what have you.

986
01:05:08,680 --> 01:05:14,640
There's other ones as well where you can just type in a scene, maybe you've got a bit of

987
01:05:14,640 --> 01:05:19,720
footage and you want to type in a description of the type of music you want and it will

988
01:05:19,720 --> 01:05:21,360
create that track.

989
01:05:21,360 --> 01:05:23,680
That's called Dream Track for Shorts.

990
01:05:23,680 --> 01:05:26,080
So that's going to be released for YouTube Shorts.

991
01:05:26,080 --> 01:05:32,680
Now they've launched this in collaboration with artists such as Charlie Puth and T-Pain.

992
01:05:32,680 --> 01:05:40,380
I can't say I know who Charlie is, but yeah, basically a bunch of AI powered musical tools

993
01:05:40,380 --> 01:05:45,920
that producers and creators are going to have at their disposal and they look really cool.

994
01:05:45,920 --> 01:05:54,360
And while we're on the subject of AI audio, a big story this week came out of Stability

995
01:05:54,360 --> 01:06:03,280
AI, the head of audio at Stability AI, the company that produced Stable Diffusion and

996
01:06:03,280 --> 01:06:07,080
Clipdrop and all of these tools that we love.

997
01:06:07,080 --> 01:06:13,640
His name is Ed Newton-Rex, head of audio, and he stepped down from his role this week amid

998
01:06:13,640 --> 01:06:23,320
concerns over Stability's use of copyrighted content to train models without consent.

999
01:06:23,320 --> 01:06:32,960
So he described the practice as being exploitative and he wrote quite a long tweet basically

1000
01:06:32,960 --> 01:06:40,640
articulating his position and describing the tension amongst people within the industry.

1001
01:06:40,640 --> 01:06:44,040
This is someone who loves audio.

1002
01:06:44,040 --> 01:06:48,000
This is why he's in the role of head of audio.

1003
01:06:48,000 --> 01:06:53,080
He's someone that's passionate about this as a medium, but he can see that creators

1004
01:06:53,080 --> 01:07:00,640
and artists are being exploited and will be left behind.

1005
01:07:00,640 --> 01:07:04,360
He doesn't agree with the definition of fair use.

1006
01:07:04,360 --> 01:07:08,480
He thinks that it's not the way that the industry should be going.

1007
01:07:08,480 --> 01:07:10,840
So yeah, that was a big shake-up.

1008
01:07:10,840 --> 01:07:15,560
We can train our models based on your stuff without paying you because we consider that

1009
01:07:15,560 --> 01:07:16,760
fair use.

1010
01:07:16,760 --> 01:07:22,920
And I think some of his posts implied that that mindset is the prevailing mindset in

1011
01:07:22,920 --> 01:07:25,040
all of these generative AI companies.

1012
01:07:25,040 --> 01:07:31,360
And I think I would have umbrage as well with that.

1013
01:07:31,360 --> 01:07:34,880
So another cool thing was Runway announcing their Motion Brush.

1014
01:07:34,880 --> 01:07:39,160
So people on the podcast, the listeners, you'll know we love Runway.

1015
01:07:39,160 --> 01:07:40,160
It's pretty cool.

1016
01:07:40,160 --> 01:07:42,400
It does loads of text-to-video generation stuff.

1017
01:07:42,400 --> 01:07:47,280
And the latest iteration of the tool lets you input a static image and then, using a

1018
01:07:47,280 --> 01:07:52,760
brush-like interface, you can color in an aspect of the image that you want to move

1019
01:07:52,760 --> 01:07:55,920
and it will only move that part of the image.

1020
01:07:55,920 --> 01:07:58,200
So it's not available yet.

1021
01:07:58,200 --> 01:08:00,640
And the demo videos for all this stuff always make it look cool.

1022
01:08:00,640 --> 01:08:03,520
And then when Martin and I try and actually get anything usable out of it, it's almost

1023
01:08:03,520 --> 01:08:05,320
impossible for us.

1024
01:08:05,320 --> 01:08:08,480
I guess maybe more a limitation of us or maybe the amount of time and effort you need to

1025
01:08:08,480 --> 01:08:11,000
put in to actually get something awesome.

1026
01:08:11,000 --> 01:08:15,560
But if it's usable, I think it could actually have some really interesting applications

1027
01:08:15,560 --> 01:08:20,040
beyond what we can do right now, like animating an area of a chart.

1028
01:08:20,040 --> 01:08:24,840
Like if you're doing a report to your team, that could be quite interesting to make things

1029
01:08:24,840 --> 01:08:26,520
a bit more engaging.

1030
01:08:26,520 --> 01:08:27,520
You could...

1031
01:08:27,520 --> 01:08:32,040
There's suggestions you might be able to take things like photography, like an old family

1032
01:08:32,040 --> 01:08:37,360
photo and have the people wave, which would be kind of interesting and see how well that

1033
01:08:37,360 --> 01:08:39,120
works.

1034
01:08:39,120 --> 01:08:44,520
So I think it'd be interesting to see how people play with this versus using static

1035
01:08:44,520 --> 01:08:48,000
images or just straight-on text-to-video generation.

1036
01:08:48,000 --> 01:08:52,600
And whether that then cascades for us as marketers into applications in terms of what might have

1037
01:08:52,600 --> 01:08:57,720
been a static social media image banner now becomes a moving one because it's so easy

1038
01:08:57,720 --> 01:09:02,320
to animate certain aspects with quite a lot of fine control in terms of what those aspects

1039
01:09:02,320 --> 01:09:03,520
are.

1040
01:09:03,520 --> 01:09:07,680
And a lot more of those types of more video-y type things on websites versus just static

1041
01:09:07,680 --> 01:09:08,680
images.

1042
01:09:08,680 --> 01:09:10,240
So it could be quite cool.

1043
01:09:10,240 --> 01:09:11,240
Yeah.

1044
01:09:11,240 --> 01:09:14,000
Looking forward to actually getting hands on with that and seeing if I can get something

1045
01:09:14,000 --> 01:09:17,920
useful, which I almost certainly won't be able to.

1046
01:09:17,920 --> 01:09:21,960
Next up, OpenAI seeks a collaborative future with data partnerships.

1047
01:09:21,960 --> 01:09:27,800
So this is OpenAI basically saying, we're training up new models and we need more data.

1048
01:09:27,800 --> 01:09:32,720
We need access to more proprietary data or hard to access data.

1049
01:09:32,720 --> 01:09:37,480
So if you have that, you can go to them and say, hey, here's a bunch of data that you

1050
01:09:37,480 --> 01:09:38,480
can access.

1051
01:09:38,480 --> 01:09:44,360
So this is particularly interesting, I think, if you're in maybe an institution like a museum

1052
01:09:44,360 --> 01:09:49,360
or a library where you say, actually, we think we've got a collection here that we could

1053
01:09:49,360 --> 01:09:57,800
give over and help train the model and expand the domain knowledge of, let's say, GPT-5,

1054
01:09:57,800 --> 01:10:00,400
which they have spoken about.

1055
01:10:00,400 --> 01:10:03,160
They are now training.

1056
01:10:03,160 --> 01:10:07,040
So if you're at a museum and you've got a specialist collection, you could hand that

1057
01:10:07,040 --> 01:10:08,040
over.

1058
01:10:08,040 --> 01:10:14,000
Now they said that in terms of being able to handle the information that is provided,

1059
01:10:14,000 --> 01:10:16,200
they can basically deal with that.

1060
01:10:16,200 --> 01:10:22,280
They've got really state-of-the-art technology to do image and handwriting recognition from

1061
01:10:22,280 --> 01:10:27,040
documents and GPT-4 can interpret photos of things.

1062
01:10:27,040 --> 01:10:32,200
So they'll deal with the collection, but if you've got a collection, you can give it to

1063
01:10:32,200 --> 01:10:33,200
them.

1064
01:10:33,200 --> 01:10:34,560
There are two ways that you can provide.

1065
01:10:34,560 --> 01:10:39,520
You can contribute to an open source data set for public use and training open source

1066
01:10:39,520 --> 01:10:45,600
models, or you can provide data sets to train proprietary models with strict data sensitivity

1067
01:10:45,600 --> 01:10:48,920
and access controls, things like GPT-5, for instance.

1068
01:10:48,920 --> 01:10:53,000
So if you're interested in that, and that sounds like something that your organisation

1069
01:10:53,000 --> 01:10:58,840
could contribute to, get in touch with OpenAI about their Data Partnerships.

1070
01:10:58,840 --> 01:10:59,840
Cracking stuff.

1071
01:10:59,840 --> 01:11:05,120
Well, Martin, we're now well over the hour mark, so I think we'll let our friendly

1072
01:11:05,120 --> 01:11:07,200
listeners go about their day.

1073
01:11:07,200 --> 01:11:08,200
Thanks for listening.

1074
01:11:08,200 --> 01:11:09,200
We hope you enjoyed this.

1075
01:11:09,200 --> 01:11:12,400
Subscribe if you haven't already and tell the other marketers you know who you think

1076
01:11:12,400 --> 01:11:13,400
might benefit from this.

1077
01:11:13,400 --> 01:11:14,400
Hey, go check that out.

1078
01:11:14,400 --> 01:11:16,440
Maybe they should subscribe too.

1079
01:11:16,440 --> 01:11:21,120
I will look forward to catching up with you on our next episode, Martin.

1080
01:11:21,120 --> 01:11:22,120
Thanks very much.

1081
01:11:22,120 --> 01:11:23,120
Cheers.

1082
01:11:23,120 --> 01:11:24,120
See you later.

1083
01:11:24,120 --> 01:11:25,120
Bye.

1084
01:11:25,120 --> 01:11:29,040
Thank you for listening to Artificially Intelligent Marketing.

1085
01:11:29,040 --> 01:11:35,120
To stay on top of the latest trends, tips and tools in the world of marketing AI, be

1086
01:11:35,120 --> 01:11:36,860
sure to subscribe.

1087
01:11:36,860 --> 01:11:40,440
We look forward to seeing you again next week.

