WEBVTT

00:00:04.169 --> 00:00:06.830
Welcome to Artificially Intelligent Marketing,

00:00:06.990 --> 00:00:09.949
a weekly podcast where we stay on top of the

00:00:09.949 --> 00:00:13.230
latest trends, tips, and tools in the world of

00:00:13.230 --> 00:00:16.070
marketing AI, helping you get the best results

00:00:16.070 --> 00:00:19.210
from your marketing efforts. Now let's join our

00:00:19.210 --> 00:00:23.550
hosts, Paul Avery and Martin Broadhurst. Hello,

00:00:23.609 --> 00:00:26.399
everyone. Welcome to Artificially Intelligent

00:00:26.399 --> 00:00:28.879
Marketing. Paul Avery here, as usual, joined

00:00:28.879 --> 00:00:31.260
by the fantabulous Martin Broadhurst. Martin,

00:00:31.339 --> 00:00:34.600
how are you, sir? I'm great. Unlike the weather

00:00:34.600 --> 00:00:38.979
outside, which is gloomy, miserable, very, very

00:00:38.979 --> 00:00:44.119
December. Generally speaking, I'm good. The weather

00:00:44.119 --> 00:00:48.100
is frightful, but the podcast is extremely delightful.

00:00:48.259 --> 00:00:51.460
So everyone's a winner. We've got some really

00:00:51.460 --> 00:00:54.140
interesting stuff to talk about this week. Last

00:00:54.140 --> 00:00:56.659
couple of weeks have been crazy. So what have

00:00:56.659 --> 00:00:58.759
we got on the docket today? A few new models

00:00:58.759 --> 00:01:01.979
have been released. You might have slept on this,

00:01:02.000 --> 00:01:04.900
Paul. I know you've been very busy, but Google

00:01:04.900 --> 00:01:10.200
have released Gemini 3 Pro and the exciting Nano

00:01:10.200 --> 00:01:13.859
Banana Pro. Anthropic responded with their own

00:01:13.859 --> 00:01:18.640
release of Opus 4.5. Away from model releases,

00:01:18.719 --> 00:01:21.870
we've had Gartner releasing their 2026 strategic

00:01:21.870 --> 00:01:26.329
predictions, which you'll be unsurprised to hear

00:01:26.329 --> 00:01:32.109
contains some AI mentions. I know I was shocked.

00:01:33.349 --> 00:01:37.609
Google have started rolling out ads in AI Overviews,

00:01:37.609 --> 00:01:41.349
and there's some rumors flying around that ChatGPT

00:01:41.349 --> 00:01:43.670
might be looking at doing something similar

00:01:43.670 --> 00:01:49.840
in their system. And Adobe has purchased Semrush.

00:01:49.840 --> 00:01:52.180
Lots going on, then. Lots going on. Let's get into

00:01:52.180 --> 00:01:54.299
that first one. I mean, it's been the

00:01:54.299 --> 00:01:56.920
model week, hasn't it? Gemini 3 came out and we

00:01:56.920 --> 00:01:58.540
started playing with it instantly and we were

00:01:58.540 --> 00:02:01.680
like, oh, this is decent. All of the evals look

00:02:01.680 --> 00:02:04.840
very good, but as we're finding, mostly in our

00:02:04.840 --> 00:02:06.900
own work, you have to test your own use cases.

00:02:06.900 --> 00:02:09.710
You can't massively trust too many evals at this

00:02:09.710 --> 00:02:12.389
point, at least in my experience. And then, yeah,

00:02:12.389 --> 00:02:15.409
Opus 4.5 comes straight out and there was, who

00:02:15.409 --> 00:02:18.889
is winning? And then GPT-5.1 as well.

00:02:18.889 --> 00:02:21.889
So lots going on. Take us through what Google's

00:02:21.889 --> 00:02:24.750
been up to. So Google was sitting at the top of

00:02:24.750 --> 00:02:29.129
the leaderboards with Gemini 2.5 for a long

00:02:29.129 --> 00:02:33.250
time, and then there were some rumors flying around

00:02:33.250 --> 00:02:35.990
that 3 would be coming out. Certainly the

00:02:35.990 --> 00:02:39.569
X.com discussions and Reddit boards were

00:02:39.569 --> 00:02:43.289
awash with speculation. And then, yeah, they launched

00:02:43.289 --> 00:02:48.110
a new model, Gemini 3 Pro, which we think will

00:02:48.110 --> 00:02:51.509
be followed by Gemini 3 Flash, the lightweight

00:02:51.509 --> 00:02:56.270
version, the kind of non-reasoning model, soon.

00:02:56.270 --> 00:02:59.650
But yeah, it came out and smashed the leaderboards.

00:02:59.650 --> 00:03:04.990
It was a real leap as well. I think sometimes

00:03:04.990 --> 00:03:10.159
there's been so little between the state-of-the-art

00:03:10.159 --> 00:03:12.780
models, the frontier models, it's

00:03:12.780 --> 00:03:16.819
been a few points here or there, that when this

00:03:16.819 --> 00:03:19.280
came out, on some of the leaderboards it was

00:03:19.280 --> 00:03:24.319
a significant step forward. On leaderboards

00:03:24.319 --> 00:03:28.280
that were quite saturated, or on ones that

00:03:28.280 --> 00:03:31.379
have a long way to go, they were really pushing

00:03:31.379 --> 00:03:34.469
things forward. On LMArena, they were smashing the

00:03:34.469 --> 00:03:37.069
competition. So yeah, it's certainly been well

00:03:37.069 --> 00:03:41.210
received. My initial play with the models has

00:03:41.210 --> 00:03:45.110
been very good, both in the API

00:03:45.110 --> 00:03:51.610
and the Gemini chat interface. I've really

00:03:51.610 --> 00:03:55.590
enjoyed putting some challenging scenarios into

00:03:55.590 --> 00:03:59.530
it. There's been things like trying to design

00:03:59.530 --> 00:04:03.800
a complex Power Automate flow, which is the kind

00:04:03.800 --> 00:04:06.159
of workflow that the models generally aren't

00:04:06.159 --> 00:04:08.199
very good at working with. The only time I've

00:04:08.199 --> 00:04:12.400
had any decent responses is from ChatGPT 5 Pro.

00:04:12.400 --> 00:04:17.639
Gemini 2.5 wasn't really very good at understanding

00:04:17.639 --> 00:04:20.500
some of the mechanics of Power Automate, whereas

00:04:20.500 --> 00:04:25.459
I found with Gemini 3 Pro, it did an excellent

00:04:25.459 --> 00:04:28.579
job of getting into the detail of how to automate

00:04:28.579 --> 00:04:30.959
a particular workflow. Yeah, I mean, if we take

00:04:30.959 --> 00:04:33.939
a look at some of the evals, on Humanity's Last

00:04:33.939 --> 00:04:37.800
Exam, Gemini 3 Pro with code execution got 45.8%

00:04:37.800 --> 00:04:41.759
and the previous high was 26.5, which was

00:04:41.759 --> 00:04:45.100
GPT-5.1. So that's a pretty big leap. Another

00:04:45.100 --> 00:04:48.519
one that jumped out to me was an eval called

00:04:48.519 --> 00:04:51.579
ScreenSpot Pro, which is basically AI's ability

00:04:51.579 --> 00:04:55.439
to understand a computer screen. The previous

00:04:55.439 --> 00:04:58.779
high on that was Claude Sonnet at 36%. Gemini

00:04:58.779 --> 00:05:02.339
3 does 72% on that eval. That's important, right,

00:05:02.420 --> 00:05:05.819
for computer use by agents. So that's kind of

00:05:05.819 --> 00:05:10.959
interesting. The Vending Bench 2 eval, it has

00:05:10.959 --> 00:05:14.100
the highest score so far. So for listeners that

00:05:14.100 --> 00:05:16.860
are not familiar with this, basically the AI

00:05:16.860 --> 00:05:20.439
is briefed to run a vending machine business

00:05:20.439 --> 00:05:23.560
and it has to like buy supplies, control pricing,

00:05:23.819 --> 00:05:26.439
deal with customer complaints and stuff like

00:05:26.439 --> 00:05:29.399
this and basically try and make money. So it

00:05:29.399 --> 00:05:32.199
nearly, probably not doubled, but maybe 50%

00:05:32.199 --> 00:05:35.740
greater than the previous benchmark score. And

00:05:35.740 --> 00:05:37.220
that one's really interesting because it's much

00:05:37.220 --> 00:05:39.899
closer to the world of real work, right,

00:05:39.980 --> 00:05:43.139
rather than some of the other evals which maybe

00:05:43.139 --> 00:05:47.019
get a bit esoteric. The criticism of all the

00:05:47.019 --> 00:05:49.660
evals is that once you know how the evals work,

00:05:49.800 --> 00:05:52.839
you can overfit the model in post-training.

00:05:52.980 --> 00:05:54.560
So again, you have to take a lot of this stuff

00:05:54.560 --> 00:05:56.699
with a pinch of salt perhaps, but it is interesting

00:05:56.699 --> 00:05:58.920
to see them massively increase. And I was listening

00:05:58.920 --> 00:06:02.220
to the Moonshots podcast with Peter Diamandis,

00:06:02.220 --> 00:06:05.199
Martin, and the team on the podcast were talking

00:06:05.199 --> 00:06:08.139
about how we are drifting to a world where the

00:06:08.139 --> 00:06:10.620
evals that are going to be the most meaningful

00:06:10.620 --> 00:06:15.410
are really business-outcome oriented, measured

00:06:15.410 --> 00:06:19.589
more in dollars, like Vending Bench, than in particular

00:06:19.589 --> 00:06:23.209
scores or percentages, like completion percentages.

00:06:23.209 --> 00:06:27.430
So, definitely better. My own personal experience

00:06:27.430 --> 00:06:30.470
has been that it seems more thoughtful and knowledgeable

00:06:30.470 --> 00:06:34.490
to interact with. It's much better at self-critique.

00:06:34.490 --> 00:06:37.949
Its writing quality is a slight improvement, although

00:06:37.949 --> 00:06:42.069
I'm still not absolutely convinced that the models

00:06:42.069 --> 00:06:45.079
are getting much better at writing. We'll

00:06:45.079 --> 00:06:47.939
talk about Opus in a minute, but that's probably

00:06:47.939 --> 00:06:50.220
the best one I've seen so far. And then GPT-5.1

00:06:50.220 --> 00:06:53.100
is better than GPT-5. So there's a bunch

00:06:53.100 --> 00:06:56.199
of nuances in there. But is there anything you've

00:06:56.199 --> 00:06:57.800
been doing with Gemini 3 particularly that

00:06:57.800 --> 00:07:00.240
you're getting good results on? Yeah, there was

00:07:00.240 --> 00:07:03.399
one particular project that I'd been struggling

00:07:03.399 --> 00:07:07.660
with, needed a front end that tapped into a few

00:07:07.660 --> 00:07:12.439
government APIs for property data, things like

00:07:12.439 --> 00:07:17.980
planning information, zoning, things like that.

00:07:18.420 --> 00:07:23.779
And I tried a few different tools. They took

00:07:23.779 --> 00:07:27.660
me 90 % of the way there, but there were some

00:07:27.660 --> 00:07:30.399
nuances in the way that things would render on

00:07:30.399 --> 00:07:34.259
the map. But Google Gemini one-shotted it. It

00:07:34.259 --> 00:07:36.220
was really powerful. It used the same prompt

00:07:36.220 --> 00:07:38.970
that I'd used elsewhere, and quite literally

00:07:38.970 --> 00:07:41.189
it one-shotted the output. So yeah, I was very

00:07:41.189 --> 00:07:44.930
impressed to see that improvement in the

00:07:44.930 --> 00:07:47.449
model. It's been very good at spinning up front ends,

00:07:47.449 --> 00:07:50.589
hasn't it? We've talked previously about marketers

00:07:50.589 --> 00:07:54.009
moving beyond content as lead magnets and thinking

00:07:54.009 --> 00:07:56.970
about tools that you might build for

00:07:56.970 --> 00:07:58.750
your target audience that you would not have

00:07:58.750 --> 00:08:00.910
spent money on a developer for. But if an AI can

00:08:00.910 --> 00:08:04.689
spin it up... So as these things mature and improve,

00:08:04.689 --> 00:08:06.889
Gemini 3 is definitely one to try if you've

00:08:06.889 --> 00:08:08.970
been thinking about building tools and the other

00:08:08.970 --> 00:08:11.310
AIs have not been quite good enough, or you haven't

00:08:11.310 --> 00:08:12.910
had confidence because you thought you'd have

00:08:12.910 --> 00:08:15.050
to get in and, like, wrangle the code yourself.

00:08:15.050 --> 00:08:17.889
It's definitely worth playing with Gemini 3 from

00:08:17.889 --> 00:08:20.230
that perspective. The other thing I've been testing

00:08:20.230 --> 00:08:22.149
this weekend is I've been running quite a lot

00:08:22.149 --> 00:08:26.430
of deep research reports, and it's been interesting.

00:08:26.430 --> 00:08:28.709
In the past I've kind of got similar things, like

00:08:28.709 --> 00:08:31.490
if I've run, like, GPT-4o against Gemini

00:08:31.490 --> 00:08:34.870
2.5, I'd get a report that was by and large talking

00:08:34.870 --> 00:08:38.049
about the same stuff. But they started to diverge

00:08:38.049 --> 00:08:42.389
a bit, which is interesting. I ran Gemini 3 Pro

00:08:42.389 --> 00:08:45.830
against GPT-5.1 Deep Thinking this weekend.

00:08:46.309 --> 00:08:50.809
And somewhat surprisingly to me, 5.1 was better

00:08:50.809 --> 00:08:54.450
than Gemini Pro in terms of clarity of expression,

00:08:54.769 --> 00:08:57.909
depth of ideas, number of sources cited. Like

00:08:57.909 --> 00:09:00.230
it really did seem to do a much better job. So

00:09:00.230 --> 00:09:03.129
yeah, it's... I really thought when Gemini 3

00:09:03.129 --> 00:09:05.610
came out, it would just become my obvious go-to.

00:09:05.610 --> 00:09:09.009
But I'm not so sure that's going to turn

00:09:09.009 --> 00:09:11.549
out to be the case, if I'm honest. I think I'm

00:09:11.549 --> 00:09:13.669
still going to use GPT-5.1 and other models

00:09:13.669 --> 00:09:17.190
for stuff. That chimes with my general experience

00:09:17.190 --> 00:09:22.730
with the Gemini chat experience and ChatGPT.

00:09:23.629 --> 00:09:26.649
I still find, despite model improvements across

00:09:26.649 --> 00:09:29.980
the board, my daily driver, the app that is

00:09:29.980 --> 00:09:34.019
open on my desktop at all times is ChatGPT. They

00:09:34.019 --> 00:09:38.460
have just got the user experience down in a way

00:09:38.460 --> 00:09:43.159
that Google Gemini just doesn't quite have refined

00:09:43.159 --> 00:09:47.860
yet. But interestingly, Logan Kilpatrick from

00:09:47.860 --> 00:09:51.559
the team has said that they are doing a big 2.0

00:09:51.559 --> 00:09:55.960
relaunch of the UX coming soon. Yeah, it'll

00:09:55.960 --> 00:09:58.379
be interesting to see what that entails. I mean,

00:09:58.379 --> 00:10:00.860
one of the cool things about Gemini is they're

00:10:00.860 --> 00:10:05.799
starting to play with spinning up, like, dashboards

00:10:05.799 --> 00:10:08.919
and interactive experiences as part of the answer.

00:10:08.919 --> 00:10:11.779
We've talked previously, and I think it's pretty

00:10:11.779 --> 00:10:14.000
well discussed in the industry at this point,

00:10:14.000 --> 00:10:18.450
that UX on demand is probably going to be how

00:10:18.450 --> 00:10:20.149
AIs work. Like sometimes it will give you an

00:10:20.149 --> 00:10:21.690
answer in text, if that's the most appropriate,

00:10:21.789 --> 00:10:23.389
but maybe it will create images. Interesting

00:10:23.389 --> 00:10:25.450
thing is Gemini 3 is much better. It goes out

00:10:25.450 --> 00:10:27.769
on the web and it tries to find images that will

00:10:27.769 --> 00:10:30.389
support its point and it intersperses them in

00:10:30.389 --> 00:10:33.009
the answer. And for me, on a couple of occasions,

00:10:33.110 --> 00:10:36.809
it even created images, which was super interesting.

00:10:36.929 --> 00:10:38.610
And they actually worked. And I'll give you an

00:10:38.610 --> 00:10:40.789
example. I think the advice turned out to suck,

00:10:40.889 --> 00:10:44.570
but I was... it's time for us to buy a new mattress.

00:10:44.750 --> 00:10:46.830
So I've been using Gemini 3 to help me find this

00:10:46.830 --> 00:10:48.950
new mattress. So I bought a new mattress, but

00:10:48.950 --> 00:10:51.110
the delivery time is six weeks and I'm getting

00:10:51.110 --> 00:10:53.409
a bad back from the current mattress. So Gemini

00:10:53.409 --> 00:10:55.830
3 gave me a bunch of recommendations about what

00:10:55.830 --> 00:10:58.610
I should do. One of them was to put a yoga mat

00:10:58.610 --> 00:11:01.049
under the dip in the mattress, which I think

00:11:01.049 --> 00:11:04.490
is actually terrible advice, but there's a number

00:11:04.490 --> 00:11:06.149
of different ways you could orientate that. And

00:11:06.149 --> 00:11:07.649
it wasn't quite clear to me. So I was like, can

00:11:07.649 --> 00:11:10.889
you draw an image that demonstrates this? It came

00:11:10.889 --> 00:11:13.269
up with a beautiful image that absolutely

00:11:13.269 --> 00:11:15.549
made it clear what its advice was. And that's

00:11:15.549 --> 00:11:18.610
a new capability, as far as I'm aware. But this

00:11:18.610 --> 00:11:20.409
probably ties into a potentially even

00:11:20.409 --> 00:11:23.009
more exciting part of the Gemini release, which is

00:11:23.009 --> 00:11:26.950
the improvements to the Nano Banana image gen model.

00:11:26.950 --> 00:11:29.950
Have you had a chance to play with that? I

00:11:29.950 --> 00:11:35.149
have. So this is quite a step on from Nano Banana.

00:11:35.149 --> 00:11:38.470
Again, for anyone that isn't aware

00:11:38.470 --> 00:11:41.409
of the differences, Google has Imagen,

00:11:41.409 --> 00:11:47.470
which is a diffusion-based image generation model.

00:11:47.470 --> 00:11:51.889
The clarity of those images is very good, but

00:11:51.889 --> 00:11:56.350
it isn't baked into, isn't native within, the large

00:11:56.350 --> 00:11:59.669
language model, Gemini, itself. And that meant that

00:11:59.669 --> 00:12:03.529
getting edits of a design was quite tricky. So

00:12:03.529 --> 00:12:05.990
every time you created an image with it, it was

00:12:05.990 --> 00:12:10.419
creating a brand new image. Whereas with Nano

00:12:10.419 --> 00:12:14.259
Banana, when they launched that within 2.5 in

00:12:14.259 --> 00:12:16.779
the previous release, suddenly we could upload

00:12:16.779 --> 00:12:20.639
images and get very detailed, very accurate edits

00:12:20.639 --> 00:12:24.460
made to images. Well, now they're talking about

00:12:24.460 --> 00:12:31.879
Nano Banana Pro being a foundational image model

00:12:31.879 --> 00:12:36.000
which has almost a world model baked inside it.

00:12:36.000 --> 00:12:40.080
So it understands so much more context about

00:12:40.080 --> 00:12:43.059
the image. So you can ask it to do tasks, not just,

00:12:43.059 --> 00:12:47.419
like, you know, add a hat to this dog or something

00:12:47.419 --> 00:12:49.659
silly like that, but you can actually ask it to

00:12:49.659 --> 00:12:54.600
isolate elements of the image, or to mask certain

00:12:54.600 --> 00:12:58.080
parts of the image, to turn the image into a line

00:12:58.080 --> 00:13:00.580
drawing. You can even use it

00:13:00.580 --> 00:13:05.600
within video, where you can put an image of, say,

00:13:05.600 --> 00:13:11.350
a mouse stuck in a maze and ask it to

00:13:11.350 --> 00:13:14.970
animate the mouse running through the maze and

00:13:14.970 --> 00:13:18.809
it will accurately one-shot it. And

00:13:18.809 --> 00:13:22.389
Nano Banana Pro, when linked

00:13:22.389 --> 00:13:25.429
with something like Veo 3, opens up a whole world

00:13:25.429 --> 00:13:31.809
of amazing possibilities with visual work. It's

00:13:31.809 --> 00:13:34.690
not just creating things, but explaining, solving,

00:13:36.399 --> 00:13:39.360
editing, reasoning. There's so many possibilities

00:13:39.360 --> 00:13:41.399
that are unlocked with it. I think the system

00:13:41.399 --> 00:13:43.419
card for this is quite illustrative, and it's

00:13:43.419 --> 00:13:46.259
got some bits in it that feel like a tipping

00:13:46.259 --> 00:13:48.860
point. So the first thing: you can now provide,

00:13:48.860 --> 00:13:52.320
I think it's up to five, characters as prompt

00:13:52.320 --> 00:13:54.539
images. So I could put, you know, pictures of you

00:13:54.539 --> 00:13:57.769
and me and a couple of other agency gang from the

00:13:57.769 --> 00:14:00.889
HubSpot Inbound days and ask Nano Banana to

00:14:00.889 --> 00:14:03.149
start creating a bunch of unlikely scenarios

00:14:03.149 --> 00:14:05.129
for us to be in. And rather than maintaining

00:14:05.129 --> 00:14:07.169
character consistency for just one person, you

00:14:07.169 --> 00:14:08.669
could do it for more, which is really interesting.

00:14:09.309 --> 00:14:12.549
It can also do the same with up to 14 objects,

00:14:12.669 --> 00:14:14.769
which again seems a very bizarre number. Like

00:14:14.769 --> 00:14:18.309
15, oh, it's going to break. 13, safe. 14, on

00:14:18.309 --> 00:14:22.190
the edge. But again, if I wanted to control that

00:14:22.190 --> 00:14:24.450
scene even more, I could actually add a load

00:14:24.450 --> 00:14:26.309
of objects to the scene and then it would put

00:14:26.309 --> 00:14:27.990
them all in the right places, which is pretty

00:14:27.990 --> 00:14:31.389
astounding. And I think what we may see, I suspect

00:14:31.389 --> 00:14:34.450
this won't be the model, but the next one will

00:14:34.450 --> 00:14:39.470
be, is a genuine image creation capability where

00:14:39.470 --> 00:14:40.990
if you're a marketer and you need to do a bunch

00:14:40.990 --> 00:14:43.830
of product images in different settings, or you're

00:14:43.830 --> 00:14:46.429
even trying to come up with like brief videos

00:14:46.429 --> 00:14:48.529
for those, because obviously you can use the

00:14:48.529 --> 00:14:50.529
image generation capabilities and then you can

00:14:50.529 --> 00:14:53.620
ask Veo to turn those into video for you,

00:14:53.720 --> 00:14:57.440
is we can, as marketers, actually start to now

00:14:57.440 --> 00:15:00.440
create product imagery, right? Give it a product

00:15:00.440 --> 00:15:02.980
shot and then use it as you see fit. Put it on

00:15:02.980 --> 00:15:06.139
the moon, give it wings flying in space, like,

00:15:06.220 --> 00:15:08.899
you know, whatever you can imagine, really. And

00:15:08.899 --> 00:15:11.000
up until now, that's been a bit hit and miss.

00:15:11.159 --> 00:15:13.179
But this model feels like the one that's more

00:15:13.179 --> 00:15:16.500
consistently capable. So are we drifting, therefore,

00:15:16.600 --> 00:15:20.710
to a post-Canva/Adobe world? Probably not for

00:15:20.710 --> 00:15:22.750
specialists. People who are really good at creating

00:15:22.750 --> 00:15:24.330
images are probably still going to create things

00:15:24.330 --> 00:15:26.909
much better than you can just do in Nano Banana.

00:15:26.909 --> 00:15:29.370
But yeah, if you're just creating quick product

00:15:29.370 --> 00:15:31.789
imagery, I wonder, you know. And you could probably

00:15:31.789 --> 00:15:35.149
semi-automate this with MCP or so. There's probably

00:15:35.149 --> 00:15:37.429
some quite interesting things you could do that

00:15:37.429 --> 00:15:38.950
you just couldn't do with image models until

00:15:38.950 --> 00:15:41.429
now. There's definitely been a shift in people's

00:15:41.429 --> 00:15:45.850
willingness to accept and to use publicly AI

00:15:45.850 --> 00:15:50.379
generated images. I'm seeing it in posters, in

00:15:50.379 --> 00:15:53.840
shop windows. I'm seeing it for flyers, for events.

00:15:54.240 --> 00:15:57.019
I was on an e-commerce website this weekend

00:15:57.019 --> 00:16:00.179
and there was just something about the text that

00:16:00.179 --> 00:16:04.539
made it very clear to me it was a GPT-4 or GPT-5

00:16:04.539 --> 00:16:06.980
generated image. It had that kind of softness

00:16:06.980 --> 00:16:10.440
to it that was very recognizable. So we're certainly

00:16:10.440 --> 00:16:13.480
seeing this more and more in the public, even

00:16:13.480 --> 00:16:15.899
though in some domains people are still very

00:16:15.899 --> 00:16:19.039
anti-AI. I don't know if you saw this recently,

00:16:19.100 --> 00:16:24.779
but there's some pushback against Steam in the

00:16:24.779 --> 00:16:30.559
video game industry. Steam tags games if they

00:16:30.559 --> 00:16:34.240
use AI-generated content. And some developers

00:16:34.240 --> 00:16:36.799
are pushing back against that tag and that label.

00:16:37.059 --> 00:16:38.940
Yeah, I think the big issue there, I read that

00:16:38.940 --> 00:16:40.519
story, it was an interesting one. I think the

00:16:40.519 --> 00:16:44.190
big issue is, everybody in the games development

00:16:44.190 --> 00:16:47.509
industry is probably using Gen AI in one way

00:16:47.509 --> 00:16:50.750
or another. So is it even a meaningful tag? I

00:16:50.750 --> 00:16:52.649
think that's probably quite a big issue. I thought

00:16:52.649 --> 00:16:55.490
you were going to say the Coca-Cola TV ad, which

00:16:55.490 --> 00:16:59.330
has been AI generated and has triggered a bit

00:16:59.330 --> 00:17:01.070
of a backlash. It's better than last year's ad.

00:17:01.149 --> 00:17:02.830
It still looks a bit weird in places. I saw

00:17:02.830 --> 00:17:05.630
it on TV over the weekend. I saw somebody do

00:17:05.630 --> 00:17:07.769
a breakdown of the different lorries that are

00:17:07.769 --> 00:17:11.170
in it, the different trucks, and they took the bodies

00:17:11.170 --> 00:17:13.289
of them and turned them into a diagram. And apparently

00:17:13.289 --> 00:17:16.990
there's 14 different truck bodies used in that.

00:17:17.230 --> 00:17:20.089
Yeah, I looked at that and I thought, old tech.

00:17:20.390 --> 00:17:22.009
Yeah, honestly, I don't know how long they've

00:17:22.009 --> 00:17:24.049
been working on that as much as they've got AI.

00:17:24.230 --> 00:17:26.069
I think I read somewhere that there was still

00:17:26.069 --> 00:17:28.269
like ridiculously large amounts of people who

00:17:28.269 --> 00:17:31.049
worked on that project, like drifting towards

00:17:31.049 --> 00:17:33.450
100, which for me, I'm like, why are you using

00:17:33.450 --> 00:17:38.690
Gen AI then? But the reality is, I think I could

00:17:38.690 --> 00:17:42.460
spin up something better in an hour with Nano

00:17:42.460 --> 00:17:47.680
Banana 2 and Veo 3.1. So that just makes

00:17:47.680 --> 00:17:49.440
me think it's old tech, and next year's will probably

00:17:49.440 --> 00:17:51.819
be quite a bit better than that. There may even

00:17:51.819 --> 00:17:55.279
be, like, rules on what models. Maybe they have

00:17:55.279 --> 00:17:57.980
to use open source models. You know, Coca-Cola

00:17:57.980 --> 00:18:00.579
is a big corporate and what have you. So yeah,

00:18:00.579 --> 00:18:03.359
I'm not so sure. But yeah, I think you're

00:18:03.359 --> 00:18:05.599
right, I think attitudes are changing. I mean, my

00:18:05.599 --> 00:18:08.339
wife was showing me stuff on Instagram this weekend

00:18:08.880 --> 00:18:10.819
And we were trying to guess how many of the videos

00:18:10.819 --> 00:18:15.660
were AI generated. And I reckon four out of five

00:18:15.660 --> 00:18:18.640
that we were looking at were AI generated. It

00:18:18.640 --> 00:18:20.579
was cats doing stuff that there's no way you

00:18:20.579 --> 00:18:24.640
get a cat to do. You know you've got a technology

00:18:24.640 --> 00:18:28.200
acceleration when one of the first use cases

00:18:28.200 --> 00:18:29.819
is always a cat. I mean, that's how the internet

00:18:29.819 --> 00:18:31.619
began. Look at these funny pictures of cats.

00:18:31.740 --> 00:18:33.680
Now we've got funny pictures of cats, but they're

00:18:33.680 --> 00:18:37.259
AI generated. So that in itself is a bit worrying

00:18:37.259 --> 00:18:40.450
because there's just so much of a deluge of absolute

00:18:40.450 --> 00:18:42.930
crap. But, like, some of them are kind of funny.

00:18:42.930 --> 00:18:47.190
I saw one where the cats are doing, like, work

00:18:47.190 --> 00:18:49.890
around the house, like trying to fix, like, electric

00:18:49.890 --> 00:18:53.410
switches and fix cars and stuff like that. It

00:18:53.410 --> 00:18:56.490
was kind of funny. But yeah, I think the

00:18:56.490 --> 00:18:58.750
explosion we've been talking about is well underway.

00:18:58.750 --> 00:19:02.490
Let's shift gears. Let's talk Claude, because Gemini

00:19:02.490 --> 00:19:06.200
3 was like basking in its glory for all of about

00:19:06.200 --> 00:19:08.960
a week, probably less, because Claude had been

00:19:08.960 --> 00:19:11.740
cooking Opus 4.5. So quick reminder for the

00:19:11.740 --> 00:19:14.299
audience, because none of these AI companies

00:19:14.299 --> 00:19:16.660
make it easy to track what these different things

00:19:16.660 --> 00:19:20.579
mean. So Anthropic's Claude is obviously not as

00:19:20.579 --> 00:19:23.059
popular a model as ChatGPT and Gemini, but still

00:19:23.059 --> 00:19:25.859
popular, especially among coders. And they have

00:19:25.859 --> 00:19:29.720
three flavors. They've got Haiku, Sonnet, and

00:19:29.720 --> 00:19:34.000
Opus, which approximately is also the scale

00:19:34.000 --> 00:19:36.279
and size and capability of the model. So Haiku

00:19:36.279 --> 00:19:39.220
is the least capable, very cheap to run. Sonnet's

00:19:39.220 --> 00:19:41.740
in the middle. Opus is a big, gargantuan model

00:19:41.740 --> 00:19:44.019
that's very expensive to run. And the thing that

00:19:44.019 --> 00:19:47.140
was super interesting this week is Opus 4.5

00:19:47.140 --> 00:19:50.099
came out. Sonnet 4.5 is already out, so

00:19:50.099 --> 00:19:53.279
it's the big brother version. And it's actually

00:19:53.279 --> 00:19:56.380
still quite cheap to run, also. Like,

00:19:56.380 --> 00:19:58.400
in general, the Opus models are very, very much

00:19:58.400 --> 00:20:00.259
more expensive to run than the Sonnet models.

00:20:00.859 --> 00:20:04.160
But this one, not so much. And some people have

00:20:04.160 --> 00:20:06.339
been loving it online. What have you seen as

00:20:06.339 --> 00:20:09.140
it relates to Opus 4.5, Martin? Yeah, so it

00:20:09.140 --> 00:20:12.880
came out swinging on the benchmarks again on

00:20:12.880 --> 00:20:16.200
agentic coding. So on the SWE-bench benchmark, it scored

00:20:16.200 --> 00:20:21.940
80.9 versus Gemini Pro's 76.2. On agentic

00:20:21.940 --> 00:20:25.940
terminal coding, it scored 59.3 versus Gemini

00:20:25.940 --> 00:20:32.940
Pro's 54.2. They had graduate-level reasoning.

00:20:33.059 --> 00:20:35.900
This was an interesting one. So the GPQA Diamond

00:20:35.900 --> 00:20:42.640
test, which is PhD level research questions,

00:20:42.980 --> 00:20:49.240
scored 87 versus Gemini Pro's 91.9. So not quite

00:20:49.240 --> 00:20:52.599
state of the art there. But generally speaking,

00:20:52.740 --> 00:20:55.400
we are talking about a very, very capable model

00:20:55.400 --> 00:21:01.470
across the board. My personal experience with

00:21:01.470 --> 00:21:04.109
it is I'm just pleased that they've increased

00:21:04.109 --> 00:21:07.869
the limits. One of the big criticisms with Anthropic's

00:21:07.869 --> 00:21:12.009
policy for a subscriber, whether you are on Pro

00:21:12.009 --> 00:21:15.230
or the plus plan or whatever it is for $20 a

00:21:15.230 --> 00:21:17.789
month, or you are on one of their Max plans for

00:21:17.789 --> 00:21:21.769
$100 or $200 per month, the limits on Opus were

00:21:21.769 --> 00:21:24.109
laughably small. It had become a bit of a meme

00:21:24.109 --> 00:21:26.210
that you would put in one question and that would

00:21:26.210 --> 00:21:29.339
be you done for a month. But now they've opened

00:21:29.339 --> 00:21:33.259
up to Max plan users. They've given you the same

00:21:33.259 --> 00:21:37.400
limits that you had for Sonnet 4.5. And this

00:21:37.400 --> 00:21:40.259
model is really, really smart and really good.

00:21:40.900 --> 00:21:43.259
So I've been playing around with it for some

00:21:43.259 --> 00:21:47.900
strategic development and it adds a bit more

00:21:47.900 --> 00:21:53.279
finesse to Sonnet. For coding, I've only lightly

00:21:53.279 --> 00:21:56.099
tested it at this moment using Claude Code for

00:21:56.099 --> 00:22:00.069
a couple of projects, but it looks really good, really

00:22:00.069 --> 00:22:02.789
very capable. Yeah, I think it's quite cool. I think

00:22:02.789 --> 00:22:05.789
the thing that struck me here, and there's

00:22:05.789 --> 00:22:07.769
a little bit of trying to infer what the major

00:22:07.769 --> 00:22:11.130
labs are doing, my understanding, stop me if I'm

00:22:11.130 --> 00:22:14.950
wrong, is when we see these, like, number increments,

00:22:14.950 --> 00:22:20.410
GPT-5 becomes GPT-5.1, is if we go from, you know,

00:22:20.410 --> 00:22:23.150
GPT-5 to GPT-6, it's because there's been another

00:22:23.150 --> 00:22:26.509
training run and a fundamentally new model has

00:22:26.509 --> 00:22:28.670
been created. And then it goes through all the

00:22:28.670 --> 00:22:32.130
fine-tuning, reinforcement-based fine-tuning

00:22:32.130 --> 00:22:33.609
and all that other stuff. And then we have our

00:22:33.609 --> 00:22:35.450
new model. Whereas if it goes up in incrementation,

00:22:35.849 --> 00:22:38.450
5 to 5.1, it's because they've maybe further

00:22:38.450 --> 00:22:41.049
fine-tuned the model or made some changes in

00:22:41.049 --> 00:22:43.829
the back end that change its personality or how

00:22:43.829 --> 00:22:47.750
it basically thinks and produces outputs. So

00:22:47.750 --> 00:22:52.089
you look at Gemini 3, that one assumes is a new

00:22:52.089 --> 00:22:54.940
training run, a new model that was trained in

00:22:54.940 --> 00:22:56.759
a different way, more data, whatever, and is

00:22:56.759 --> 00:22:59.319
therefore significantly better than Gemini 2,

00:22:59.559 --> 00:23:04.819
2.5, whatever. Opus 4.5 is not a new model.

00:23:05.000 --> 00:23:08.539
And yet it has these capabilities that outstrip

00:23:08.539 --> 00:23:12.339
Gemini 3 in a number of areas, which for me is

00:23:12.339 --> 00:23:15.079
interesting because it suggests that the innovation

00:23:15.079 --> 00:23:17.980
in the fine-tuning and the reinforcement-based

00:23:17.980 --> 00:23:20.380
learning and whatever they're doing in post-training

00:23:20.380 --> 00:23:23.099
to improve model outputs can actually have quite

00:23:23.099 --> 00:23:28.039
a big impact on what the models can do. So I

00:23:28.039 --> 00:23:30.960
find that quite interesting because in the beginning,

00:23:31.019 --> 00:23:32.480
we were like, need a new model release, need

00:23:32.480 --> 00:23:34.240
a new model release. But then obviously we got

00:23:34.240 --> 00:23:38.400
the o1 models from ChatGPT, from OpenAI, which

00:23:38.400 --> 00:23:40.579
could think. They took some time before they gave

00:23:40.579 --> 00:23:43.380
an answer. And that gave us a big leap. And there's

00:23:43.380 --> 00:23:45.420
something, I think there's something special

00:23:45.420 --> 00:23:48.859
in here. They've done something with Opus's ability

00:23:48.859 --> 00:23:52.779
to think and produce good outputs that's not

00:23:52.779 --> 00:23:55.539
based on just throwing the kitchen sink of data

00:23:55.539 --> 00:23:59.240
at it. It's something else. Yeah, and that really

00:23:59.240 --> 00:24:02.019
comes through in the token usage. So I don't

00:24:02.019 --> 00:24:05.200
know if you saw this from the report. They gave

00:24:05.200 --> 00:24:09.880
it some puzzles, a series of challenges, an escape

00:24:09.880 --> 00:24:12.869
room challenge. And they gave that challenge

00:24:12.869 --> 00:24:18.109
to Sonnet 4.5 as well as Opus 4.5, and they

00:24:18.109 --> 00:24:21.349
gave them the ability to use advanced tools. So

00:24:21.349 --> 00:24:24.490
Opus 4.5 is exceptionally good at function calling

00:24:24.490 --> 00:24:27.690
or being able to use tools like web search or

00:24:27.690 --> 00:24:31.450
code execution, things like that. And it was able

00:24:31.450 --> 00:24:36.410
to solve the problems using far fewer tokens.

00:24:36.410 --> 00:24:38.829
I think this is really interesting for developers

00:24:38.829 --> 00:24:40.529
or when we start to think about what this means

00:24:41.160 --> 00:24:46.000
in real world applications within business. If

00:24:46.000 --> 00:24:49.180
you've got long time horizon tasks that people

00:24:49.180 --> 00:24:54.579
need to execute, you don't want it burning through

00:24:54.579 --> 00:24:57.099
huge amounts of tokens costing you a great deal

00:24:57.099 --> 00:25:01.279
of expense. Whereas this new model is doing exactly

00:25:01.279 --> 00:25:04.200
what you'd want. It's optimizing. So the amount

00:25:04.200 --> 00:25:09.079
of tokens that it was using was substantially

00:25:09.079 --> 00:25:11.839
less. So just to bring this to life: software engineering

00:25:11.839 --> 00:25:15.299
with effort controls, Opus 4.5 was achieving

00:25:15.299 --> 00:25:21.960
around 81% accuracy with around 12,000 tokens

00:25:21.960 --> 00:25:25.420
being used, whereas Sonnet 4.5 was achieving

00:25:25.420 --> 00:25:31.220
76, 77% accuracy using 22,000 tokens. So it was

00:25:31.220 --> 00:25:33.480
using double the tokens and it was less accurate.

00:25:33.480 --> 00:25:35.940
Yeah, I think it's... I think it's cool. I think this

00:25:35.940 --> 00:25:39.160
is what makes Opus 4.5 interesting. It's in

00:25:39.160 --> 00:25:41.660
the nuances. There's another thing in the system

00:25:41.660 --> 00:25:44.940
card. One of the benchmarks involves being a

00:25:44.940 --> 00:25:48.700
customer service agent bot, like chatbot, that

00:25:48.700 --> 00:25:52.440
solves a customer's problem. And in one of the

00:25:52.440 --> 00:25:56.579
cases, Opus 4.5, it failed the test because

00:25:56.579 --> 00:26:00.119
it didn't give the outcome expected. But what

00:26:00.119 --> 00:26:02.559
it actually did is found a really creative way

00:26:02.559 --> 00:26:04.940
to deliver on the customer's requirements based

00:26:04.940 --> 00:26:07.990
on the sort of knowledge base of information.

00:26:07.990 --> 00:26:10.869
It worked out how to give the customer what the

00:26:10.869 --> 00:26:14.450
customer wanted by effectively exploiting a loophole

00:26:14.450 --> 00:26:17.609
in how the particular service... in this case it's,

00:26:17.609 --> 00:26:21.230
um, changing a flight booking. Um, so it failed

00:26:21.230 --> 00:26:23.170
the test, but because it was too creative and

00:26:23.170 --> 00:26:26.049
it did something that was not specced, that maybe

00:26:26.049 --> 00:26:29.890
a human wouldn't have even known how to do. And

00:26:29.890 --> 00:26:31.630
that, again, that's the first time this type of

00:26:31.630 --> 00:26:33.690
thing's come up. And then there was one other thing

00:26:33.690 --> 00:26:38.940
where Claude administers a sort of homework test

00:26:38.940 --> 00:26:42.839
to new developers that apply to work at Anthropic.

00:26:43.140 --> 00:26:47.400
And for the first time, Opus 4.5 gets a higher

00:26:47.400 --> 00:26:50.140
score on average than the humans who take that

00:26:50.140 --> 00:26:52.940
test. And so if you're watching for things like

00:26:52.940 --> 00:26:56.900
recursive improvement in the capability of agents,

00:26:57.160 --> 00:26:59.980
well, here's a pretty strong indicator that Opus

00:26:59.980 --> 00:27:03.759
4.5 is probably at that capability, where it's

00:27:04.109 --> 00:27:06.970
able to pass tests that a high-end developer

00:27:06.970 --> 00:27:09.210
would probably be administered. So what else

00:27:09.210 --> 00:27:11.309
can it do? And then the final thing I really

00:27:11.309 --> 00:27:14.170
liked is there's an implication that Opus is

00:27:14.170 --> 00:27:17.690
very good at managing other models. So its ability

00:27:17.690 --> 00:27:20.589
to act as an orchestrator. And this is important,

00:27:20.730 --> 00:27:23.089
right? We talked on a previous episode about

00:27:23.089 --> 00:27:25.390
different trigger points for what the future

00:27:25.390 --> 00:27:27.869
of marketing business and life looks like. And

00:27:27.869 --> 00:27:31.569
one of those trigger points is agent models that

00:27:31.569 --> 00:27:35.289
can manage other agent models. both in terms

00:27:35.289 --> 00:27:37.369
of expanding their capabilities, but also then

00:27:37.369 --> 00:27:39.589
you can create niche agents that are very good

00:27:39.589 --> 00:27:41.329
at a certain thing, like writing social media

00:27:41.329 --> 00:27:43.849
posts, and an orchestrator agent that's like

00:27:43.849 --> 00:27:45.990
a strategist that knows when to deploy those

00:27:45.990 --> 00:27:48.450
agents and can verify and validate that their

00:27:48.450 --> 00:27:51.769
work is good. And it looks like Opus 4.5 is

00:27:51.769 --> 00:27:55.789
very good at delegating to Haiku 4.5. Haiku,

00:27:55.990 --> 00:27:59.650
very cheap, very fast to run, very low latency.

00:28:00.089 --> 00:28:03.339
So again, one of those trigger points has potentially

00:28:03.339 --> 00:28:05.299
been reached that we talked about on a previous

00:28:05.299 --> 00:28:08.539
episode, whereby you might have a campaign planning

00:28:08.539 --> 00:28:13.339
agent that is Opus 4.5 and execution agents

00:28:13.339 --> 00:28:17.160
that are being driven by Haiku. Imagine what Opus

00:28:17.160 --> 00:28:20.000
5 and Haiku 5 are going to be able to do if this

00:28:20.000 --> 00:28:23.380
is what 4.5 can do. All of this is now available

00:28:23.380 --> 00:28:27.700
across the major cloud platforms as well. So this

00:28:27.700 --> 00:28:30.619
was a big announcement, that, uh, Anthropic have

00:28:30.619 --> 00:28:35.690
signed a deal with Microsoft Azure. So it's available

00:28:35.690 --> 00:28:38.690
on Azure, Amazon Web Services and Google Cloud

00:28:38.690 --> 00:28:43.230
Platform for the first time. Interestingly, Anthropic

00:28:43.230 --> 00:28:46.049
models are now available in Microsoft Copilot

00:28:46.049 --> 00:28:49.849
as well. So that's a big change if you're a

00:28:49.849 --> 00:28:53.490
Copilot 365 user and you've always been using

00:28:53.490 --> 00:28:57.490
the baked-in Microsoft models or you've clicked

00:28:57.490 --> 00:29:01.839
on the "Try GPT-5". Well, now Claude has also been

00:29:01.839 --> 00:29:05.019
unlocked and made available. However, if you

00:29:05.019 --> 00:29:08.579
do use Anthropic, that does actually use Anthropic

00:29:08.579 --> 00:29:11.900
servers. So I'm aware that there are some governance

00:29:11.900 --> 00:29:14.140
issues. It does take the information outside

00:29:14.140 --> 00:29:18.980
of your tenant at the moment. That's super interesting

00:29:18.980 --> 00:29:21.900
because word on the street and revenue figures

00:29:21.900 --> 00:29:25.039
would potentially back this up says it's enterprises

00:29:25.039 --> 00:29:29.029
that are favouring Claude, especially when it

00:29:29.029 --> 00:29:31.430
comes to coding use cases, which makes perfect

00:29:31.430 --> 00:29:35.250
sense. And so finding more and more ways to put

00:29:35.250 --> 00:29:37.990
their models in enterprise level software, obviously

00:29:37.990 --> 00:29:39.690
getting it into Google Workspace is going to

00:29:39.690 --> 00:29:43.230
be tricky. But Microsoft is the fair game route

00:29:43.230 --> 00:29:45.329
to market because they don't have their own models.

00:29:45.390 --> 00:29:48.549
So I think that's kind of interesting. Last bit

00:29:48.549 --> 00:29:52.589
of model news this week was DeepSeek's open source

00:29:52.589 --> 00:29:55.640
model, but there's like a fine-tuned version.

00:29:55.740 --> 00:29:58.259
Is it 3 .2 they're up to? I can't remember. Anyway,

00:29:58.359 --> 00:30:03.079
this model managed to get gold on the IMO Math

00:30:03.079 --> 00:30:07.059
Olympiad. So it was big news a couple of months

00:30:07.059 --> 00:30:11.759
ago that Google and OpenAI were able to achieve

00:30:11.759 --> 00:30:13.900
gold on these really super hard math problems

00:30:13.900 --> 00:30:17.400
that most people would find too hard to do, which

00:30:17.400 --> 00:30:20.059
is impressive. We in the market don't have access

00:30:20.059 --> 00:30:23.339
to those models yet. I'm sure they'll come to

00:30:23.339 --> 00:30:26.200
us soon, but especially now there's an open source

00:30:26.200 --> 00:30:30.039
model that can get those level of results. So

00:30:30.039 --> 00:30:33.339
A, I expect that Google and OpenAI will need

00:30:33.339 --> 00:30:35.420
to release access to their smarter models soon.

00:30:35.759 --> 00:30:39.279
And B, open source models, you can see the weights,

00:30:39.440 --> 00:30:42.480
you can further fine tune them. This is a very

00:30:42.480 --> 00:30:46.150
smart model for the open source community to be

00:30:46.150 --> 00:30:47.589
able to get their hands on. What are your thoughts

00:30:47.589 --> 00:30:49.970
on this one, Martin? Just goes to show that the

00:30:49.970 --> 00:30:53.529
frontier is moving across closed and open models.

00:30:53.549 --> 00:30:57.150
And I think the competition is hot. Does lead

00:30:57.150 --> 00:31:00.230
me to a thought that ultimately models are going

00:31:00.230 --> 00:31:03.369
to be commoditized in the not too distant future.

00:31:03.880 --> 00:31:05.339
It's going to be interesting to see how that

00:31:05.339 --> 00:31:07.279
plays out. Definitely the moats of user base

00:31:07.279 --> 00:31:09.200
are going to be critical. I mean, Sam Altman

00:31:09.200 --> 00:31:12.619
issued a war footing internal memo, didn't he?

00:31:12.660 --> 00:31:14.400
At the same time as all this happening to say,

00:31:14.500 --> 00:31:17.440
guess what? ChatGPT and OpenAI, we have some

00:31:17.440 --> 00:31:19.420
real hardcore competitors now and we are going

00:31:19.420 --> 00:31:22.200
to have to, you know, really push hard to make

00:31:22.200 --> 00:31:24.519
sure that we stay ahead of the game and stay

00:31:24.519 --> 00:31:27.180
relevant because whatever gap there was between

00:31:27.180 --> 00:31:29.579
OpenAI and the rest of the market appears to

00:31:29.579 --> 00:31:32.970
be dissolving pretty darn quickly, and they are

00:31:32.970 --> 00:31:36.710
now well and truly in the race to try and stay

00:31:36.710 --> 00:31:38.410
at the top of the leaderboards and usability.

00:31:38.410 --> 00:31:41.589
Fortunately for them, they have managed to accrue

00:31:41.589 --> 00:31:44.210
a large number of users, which is going to be

00:31:44.210 --> 00:31:47.289
critical for them to remain in. But, you know, for

00:31:47.289 --> 00:31:49.349
me now, because Gem... because we're a Google Workspace

00:31:49.349 --> 00:31:52.549
company, and having a strong Gemini model is very,

00:31:52.549 --> 00:31:55.500
very handy for me. Like, if I had to choose, honestly,

00:31:55.599 --> 00:31:57.839
I'd rather Gemini just gets better and better

00:31:57.839 --> 00:32:00.200
because it's just all my data is already in Google.

00:32:00.279 --> 00:32:04.000
It just makes my life super easy. I don't really

00:32:04.000 --> 00:32:06.640
want another platform that can access my data

00:32:06.640 --> 00:32:10.500
like OpenAI and ChatGPT. So, yeah, I think they're

00:32:10.500 --> 00:32:13.099
under pressure. They are. And that was further

00:32:13.099 --> 00:32:16.460
compounded by HSBC putting out a report this

00:32:16.460 --> 00:32:23.890
week saying that they think by 2030, OpenAI will

00:32:23.890 --> 00:32:28.690
still be cash flow negative, leaving a $207 billion

00:32:28.690 --> 00:32:33.309
funding shortfall. Yeah, growth required. Revenue

00:32:33.309 --> 00:32:37.230
growth required. But I think something that's

00:32:37.230 --> 00:32:41.009
in the definitely active conversation is, is

00:32:41.009 --> 00:32:44.069
this a bubble, right? Is all the money flowing

00:32:44.069 --> 00:32:46.670
into AI going to burst? And you certainly won't

00:32:46.670 --> 00:32:49.329
find any financial and investing advice on this

00:32:49.329 --> 00:32:52.650
podcast. But one of the key retorts to that

00:32:52.650 --> 00:32:56.009
would be supply and demand, and this feeling that,

00:32:56.009 --> 00:32:59.849
with intelligence baked into so many areas, we don't

00:32:59.849 --> 00:33:02.069
have the data centers and the compute to serve

00:33:02.069 --> 00:33:04.589
even the amount of intelligence we want now. You

00:33:04.589 --> 00:33:06.869
talked about earlier not being able to get access

00:33:06.869 --> 00:33:11.309
to enough Claude to meet your needs. Um, some of

00:33:11.309 --> 00:33:12.990
that's because compute is expensive, but some

00:33:12.990 --> 00:33:15.349
of it's because it's limited, and they're having

00:33:15.349 --> 00:33:18.640
to ring-fence compute for inference, which is

00:33:18.640 --> 00:33:20.359
when we all use the model. That's called inference,

00:33:20.359 --> 00:33:23.339
when it gives us outputs. Um, but they need to

00:33:23.339 --> 00:33:25.579
keep some model back for training... some, sorry,

00:33:25.579 --> 00:33:27.420
some compute back for training. And I think that's

00:33:27.420 --> 00:33:29.380
probably proving a difficult balance. So when

00:33:29.380 --> 00:33:34.819
the... when we start to look like we are meeting

00:33:34.819 --> 00:33:38.339
demand and starting to gain a surplus in compute,

00:33:38.339 --> 00:33:40.960
I don't know, that would probably be quite

00:33:40.960 --> 00:33:44.660
an interesting sign. Um, do you have any thoughts

00:33:44.660 --> 00:33:49.500
on that line? Leads me... well, a story that we've

00:33:49.500 --> 00:33:54.079
got later on in the pod, really, which is about

00:33:54.079 --> 00:33:58.119
how much work can be automated. In fact, I think

00:33:58.119 --> 00:34:00.240
we should maybe skip ahead to this story now.

00:34:00.319 --> 00:34:04.200
McKinsey put out a report saying that AI can

00:34:04.200 --> 00:34:09.820
now automate 57% of US work. And in the report,

00:34:09.900 --> 00:34:13.340
they don't talk about it being about the eradication

00:34:13.340 --> 00:34:17.300
of jobs. They say what's required is a reimagining

00:34:17.300 --> 00:34:21.960
of workflows and the kind of operations that

00:34:21.960 --> 00:34:27.400
we do and how we do them is required. And I think

00:34:27.400 --> 00:34:30.840
that's the key thing here. These models are currently

00:34:30.840 --> 00:34:34.400
underutilized. Most of the productivity gains

00:34:34.400 --> 00:34:38.079
at the moment are seen by people having chats

00:34:38.079 --> 00:34:40.780
with the chatbot. In fact, Anthropic did a report

00:34:40.780 --> 00:34:44.139
this week about that very thing. They found that...

00:34:44.480 --> 00:34:47.280
They tried to estimate how much productivity

00:34:47.280 --> 00:34:50.320
was being driven through people using chatbots.

00:34:50.420 --> 00:34:54.199
They found that they think 80% of workflows

00:34:54.199 --> 00:34:58.199
are being completed by the AI or the time reduction

00:34:58.199 --> 00:35:01.840
is an 80% time saving where people use a chatbot

00:35:01.840 --> 00:35:07.579
in their workflow. But that's still in the chat

00:35:07.579 --> 00:35:11.079
UI domain, right? That's still people using the

00:35:11.079 --> 00:35:13.500
assistant to go back and forth. Whereas when

00:35:13.500 --> 00:35:16.940
you build workflows where data and processes

00:35:16.940 --> 00:35:21.760
are being fully operated by an AI, we can remove

00:35:21.760 --> 00:35:24.079
the need for that. And we haven't got to that

00:35:24.079 --> 00:35:27.320
point in the economy just yet. So I think that

00:35:27.320 --> 00:35:29.599
that's where we're heading and that's where the

00:35:29.599 --> 00:35:33.159
compute demand will come from. If they're going

00:35:33.159 --> 00:35:40.179
to see the gap in cash flow, that 207 billion,

00:35:40.460 --> 00:35:43.010
if that's going to be bridged, they're going

00:35:43.010 --> 00:35:45.889
to need companies to rebuild entire workflows

00:35:45.889 --> 00:35:50.469
and replace entire work streams with AI. Yeah,

00:35:50.489 --> 00:35:53.570
I'm still finding this quite hard to predict

00:35:53.570 --> 00:35:57.690
and almost to marry up what a lot of these reports

00:35:57.690 --> 00:36:00.489
say and a lot of my own personal experience.

00:36:00.849 --> 00:36:03.050
And I think one of the dangers is actually, you

00:36:03.050 --> 00:36:05.010
and I have talked a lot about the early productivity

00:36:05.010 --> 00:36:09.019
gains that we saw, right? Dictating everything

00:36:09.019 --> 00:36:11.539
rather than typing doubles my speed instantly,

00:36:11.539 --> 00:36:15.119
and it doesn't need complex model... thinking models.

00:36:15.119 --> 00:36:17.280
In fact, the best thing that's happened to me

00:36:17.280 --> 00:36:20.099
in the last three or four months is having access

00:36:20.099 --> 00:36:23.300
to NVIDIA's transcription model, which is near

00:36:23.300 --> 00:36:25.340
real time, because now I don't have to wait for

00:36:25.340 --> 00:36:27.380
it to output and that, and I can make it quicker

00:36:27.380 --> 00:36:30.820
and blah, blah. So it's... there's this weird thing

00:36:30.820 --> 00:36:33.599
about humans, that we absorb these changes pretty

00:36:33.599 --> 00:36:37.139
quickly and then we're like, right, what next? And

00:36:37.139 --> 00:36:39.880
if I really step back from my workflow, I would

00:36:39.880 --> 00:36:42.559
have to really still reflect on how much faster

00:36:42.559 --> 00:36:45.920
I'm able to do so many of my work tasks like

00:36:45.920 --> 00:36:49.400
producing content, writing emails, just generally

00:36:49.400 --> 00:36:52.599
communicating with my team, writing strategic plans,

00:36:52.860 --> 00:36:54.780
thinking through those plans and looking for

00:36:54.780 --> 00:36:58.019
gaps in them that, to be honest, I'm not sure

00:36:58.019 --> 00:37:00.280
I would have done if I didn't have a thinking

00:37:00.280 --> 00:37:02.699
partner to use. So it's hard to say. I mean,

00:37:02.719 --> 00:37:07.329
in that report, the Anthropic one, they predicted

00:37:07.329 --> 00:37:09.789
that it was reducing task completion time by

00:37:09.789 --> 00:37:13.590
80%, which is pretty significant. But at the

00:37:13.590 --> 00:37:16.289
same time, they only predicted that they could

00:37:16.289 --> 00:37:21.449
increase productivity to 1.8% annually over

00:37:21.449 --> 00:37:24.170
the next decade, which is a long time to get

00:37:24.170 --> 00:37:26.889
up to that point, which is double the current

00:37:26.889 --> 00:37:30.869
productivity growth, right? But I'm sure it's

00:37:30.869 --> 00:37:34.010
meaningful at the level of countries and GDPs.

00:37:34.010 --> 00:37:37.949
Yeah, macro, it's substantial, right? But it's like,

00:37:37.949 --> 00:37:41.610
you can do this task 80% faster, you can do twice

00:37:41.610 --> 00:37:44.510
as much work in a day, and so I predict productivity

00:37:44.510 --> 00:37:48.789
growth to go from 1 to 1.8%. I don't know, it feels

00:37:48.789 --> 00:37:51.550
like something's missing in that. It's probably

00:37:51.550 --> 00:37:54.550
a maths and statistics question, and like I say,

00:37:54.550 --> 00:37:56.909
I can still imagine that at a country level that

00:37:56.909 --> 00:38:00.219
is significant. But, um, yeah, I'm struggling

00:38:00.219 --> 00:38:02.099
to reconcile all of that, because at the same

00:38:02.099 --> 00:38:04.360
time you've got reports of workslop. So it's

00:38:04.360 --> 00:38:07.960
like, yeah, you're 80% faster, the outputs were shit

00:38:07.960 --> 00:38:10.159
and unusable. Oh, okay, well, now that's meaningless.

00:38:10.159 --> 00:38:13.039
They did refer to that in the article as well,

00:38:13.039 --> 00:38:14.619
though. They did say that it didn't take into

00:38:14.619 --> 00:38:18.000
account time spent validating the outputs from,

00:38:18.000 --> 00:38:21.340
uh, from the models, and importantly, people even

00:38:21.340 --> 00:38:23.880
doing that. Well, what was the... there's another

00:38:23.880 --> 00:38:26.099
report that came out that we're going to talk

00:38:26.099 --> 00:38:29.699
about as well And in part of that report, let's

00:38:29.699 --> 00:38:32.519
move on to that now quickly. Before we do, the

00:38:32.519 --> 00:38:35.159
other thing to, I think, has to be taken with

00:38:35.159 --> 00:38:38.099
a pinch of salt on some of these reports is McKinsey

00:38:38.099 --> 00:38:42.780
predicting that, what was it, 57% of US work

00:38:42.780 --> 00:38:44.800
could be automated with current tools alone?

00:38:45.480 --> 00:38:50.099
Question mark. I'm not so sure about that. What

00:38:50.099 --> 00:38:53.199
are your thoughts? Are you saying that McKinsey

00:38:53.199 --> 00:39:00.210
isn't the oracle? Isn't wholly reliable? No, I

00:39:00.210 --> 00:39:03.309
think the major consultancies, they're so

00:39:03.309 --> 00:39:07.730
enterprise focused. I think that they

00:39:07.730 --> 00:39:09.670
get things wrong frequently enough, but they'll

00:39:09.670 --> 00:39:11.829
just put out another report in six months'

00:39:11.829 --> 00:39:14.409
time saying something new, and it gives people

00:39:14.409 --> 00:39:16.150
like us something to talk about on a podcast.

00:39:16.150 --> 00:39:18.829
I agree. I think the punch line to that report

00:39:18.829 --> 00:39:23.929
is, 57% of tasks can be automated, or whatever the

00:39:23.929 --> 00:39:27.320
statement was, dot dot dot, and if you just contact

00:39:27.320 --> 00:39:30.400
McKinsey, we'll help you realise those gains.

00:39:31.059 --> 00:39:34.039
Do you know what I mean? We also, Martin and

00:39:34.039 --> 00:39:36.639
Paul also offer consulting and training to help

00:39:36.639 --> 00:39:39.059
you realise the gains of AI. So I'm completely

00:39:39.059 --> 00:39:41.519
on board with the concept that we can help you

00:39:41.519 --> 00:39:44.559
automate 57% of the work that your humans do.

00:39:45.119 --> 00:39:47.280
I just doubt that that's going to be possible

00:39:47.280 --> 00:39:51.500
in the near to midterm because processes are

00:39:51.500 --> 00:39:54.139
messy. They haven't been designed for AI. Humans

00:39:54.139 --> 00:39:57.989
bring a lot of creativity, ingenuity, problem

00:39:57.989 --> 00:40:01.769
solving and experience-based understanding of

00:40:01.769 --> 00:40:04.269
context that I just don't see in any AI models

00:40:04.269 --> 00:40:07.269
yet. That is going to have to significantly improve

00:40:07.269 --> 00:40:11.110
before we can get anywhere near the number that

00:40:11.110 --> 00:40:13.269
McKinsey's saying here, as far as I'm concerned.

00:40:13.269 --> 00:40:16.309
But, um, what about this Gartner report as well?

00:40:16.309 --> 00:40:20.050
Got another report in the mix. Yeah, so this one

00:40:20.050 --> 00:40:25.889
was, uh, the strategic predictions for 2026, and

00:40:25.889 --> 00:40:28.389
AI is peppered throughout, but there were just

00:40:28.389 --> 00:40:32.570
a couple of insights that I thought were worth

00:40:32.570 --> 00:40:36.309
bringing to the discussion table, Paul. So one

00:40:36.309 --> 00:40:39.949
of their predictions is that we will see a surge

00:40:39.949 --> 00:40:44.309
of lazy thinking. So the prediction is through

00:40:44.309 --> 00:40:48.250
2026, atrophy of critical thinking skills due

00:40:48.250 --> 00:40:51.900
to generative AI use will push 50% of global

00:40:51.900 --> 00:40:56.960
organizations to require AI-free skills assessments.

00:40:57.199 --> 00:41:00.579
So basically putting people through skills assessments

00:41:00.579 --> 00:41:04.219
where they are not allowed to use Gen AI as an

00:41:04.219 --> 00:41:07.280
assistant or a tool. So the ability to think

00:41:07.280 --> 00:41:10.239
independently and creatively will become both

00:41:10.239 --> 00:41:12.840
increasingly rare and increasingly valuable.

00:41:13.539 --> 00:41:15.539
Can we double click on that one for a moment?

00:41:15.579 --> 00:41:18.239
Because I think this is a big deal. To stress,

00:41:18.300 --> 00:41:21.590
this Gartner report is a bunch of smart people

00:41:21.590 --> 00:41:23.909
sitting in a room going, if this happens, then

00:41:23.909 --> 00:41:25.889
this will happen, then this will happen. That's

00:41:25.889 --> 00:41:27.989
a prediction, rather than it being very, like, super

00:41:27.989 --> 00:41:30.630
data based and analysis based, as far as we can

00:41:30.630 --> 00:41:32.449
tell. Doesn't mean that these predictions are

00:41:32.449 --> 00:41:34.150
not interesting and valuable, but it's important

00:41:34.150 --> 00:41:37.889
context, right? Um, I think this is a thing. Like,

00:41:37.889 --> 00:41:40.949
I have to work so much harder to keep my critical

00:41:40.949 --> 00:41:44.389
thinking brain engaged now than ever before, because

00:41:44.389 --> 00:41:47.250
why not work with your thinking assistant to

00:41:47.250 --> 00:41:50.059
help outsource some of your thinking, right? That's

00:41:50.059 --> 00:41:53.420
kind of the point in many cases. And yet, in a

00:41:53.420 --> 00:41:56.500
world of AI hallucinations, critical thinking

00:41:56.500 --> 00:41:59.940
has never been more important. So... but I see it

00:41:59.940 --> 00:42:03.079
atrophying in myself if I'm not careful. And I

00:42:03.079 --> 00:42:05.780
think humans are very good at finding

00:42:05.780 --> 00:42:08.320
a route to an outcome that requires the least

00:42:08.320 --> 00:42:10.280
energy possible on their part. I don't think it's

00:42:10.280 --> 00:42:12.559
conscious, I think it's subconscious. And so I

00:42:12.559 --> 00:42:15.380
think this is a real danger. I wholeheartedly

00:42:15.380 --> 00:42:18.719
agree and this is very real and very manifest

00:42:18.719 --> 00:42:22.659
in one particular domain that I'm associated

00:42:22.659 --> 00:42:27.159
with. So peer reviews of journal papers. It's

00:42:27.159 --> 00:42:31.880
rife with people trying to prompt inject where

00:42:31.880 --> 00:42:36.000
people have put notes in their paper submissions

00:42:36.000 --> 00:42:41.960
to the AI reviewer to give them positive reviews.

00:42:42.099 --> 00:42:44.019
And clearly there are people out there doing

00:42:44.019 --> 00:42:47.469
peer reviews where they're throwing it into Claude

00:42:47.469 --> 00:42:52.030
and saying, what do you think of this? Give feedback

00:42:52.030 --> 00:42:57.590
on this paper. Now, I do peer reviews for the

00:42:57.590 --> 00:43:02.590
Applied Marketing Analytics Journal, and I have

00:43:02.590 --> 00:43:06.869
to make sure that, well, my approach to this

00:43:06.869 --> 00:43:09.349
is I'll always read the paper first. I will formulate

00:43:09.349 --> 00:43:13.469
my own ideas. I will write my first draft, but

00:43:13.469 --> 00:43:19.329
I do run it through an AI as well. But I have

00:43:19.329 --> 00:43:21.610
to make sure that I've done that after I've read

00:43:21.610 --> 00:43:26.489
the paper and done the thinking first. It's a...

00:43:26.489 --> 00:43:28.610
it's a fascinating process to go through, I think,

00:43:28.610 --> 00:43:33.329
reviewing with an AI assistant. Um, and if you

00:43:33.329 --> 00:43:35.150
do it the other way around, you're so biased to

00:43:35.150 --> 00:43:36.829
what you've just been told. Like, if you were to

00:43:36.829 --> 00:43:39.030
throw it in and it says, oh, these are the things

00:43:39.030 --> 00:43:42.010
that are good or bad. Whereas I'll bring my own

00:43:42.010 --> 00:43:44.840
thoughts to the table and then kind of run it

00:43:44.840 --> 00:43:48.139
through and do a collaborative review. And

00:43:48.139 --> 00:43:51.219
going through that process enables me to go, actually,

00:43:51.219 --> 00:43:53.320
no, that's not a valid concern. And there are a

00:43:53.320 --> 00:43:57.300
few points that I have... I try not to be over

00:43:57.300 --> 00:43:59.699
reliant on it, but there are definitely things

00:43:59.699 --> 00:44:02.079
where feedback I've taken from a prompt, from

00:44:02.079 --> 00:44:05.880
an AI, and gone, actually, that's a valid

00:44:05.880 --> 00:44:08.679
piece of criticism or feedback that I'd overlooked.

00:44:08.679 --> 00:44:14.630
Um, but there's, I would say, 50% of the feedback,

00:44:14.809 --> 00:44:16.869
if it's just left to its own devices, is irrelevant

00:44:16.869 --> 00:44:20.429
and not valid. But there must be armies of people

00:44:20.429 --> 00:44:23.530
out there that are doing these journal reviews

00:44:23.530 --> 00:44:25.389
and feedback that are literally just throwing

00:44:25.389 --> 00:44:30.909
it in, becoming incredibly lazy, outsourcing

00:44:30.909 --> 00:44:34.550
their thinking entirely. And it is a problem,

00:44:34.630 --> 00:44:37.349
as we've discussed previously with the rise of

00:44:37.349 --> 00:44:40.539
WorkSlop. Yeah, let's just break that down quickly

00:44:40.539 --> 00:44:43.559
because I think it's such a critical point. And

00:44:43.559 --> 00:44:45.980
it's almost a philosophical point, which is,

00:44:46.000 --> 00:44:49.960
is it your job to look for gaps in the AI's thinking?

00:44:50.619 --> 00:44:53.920
Or is it the AI's job to help you look for gaps

00:44:53.920 --> 00:44:57.139
in your own thinking? And ultimately, the biggest

00:44:57.139 --> 00:44:59.559
factor that I think is going to drive this is

00:44:59.559 --> 00:45:02.360
workloads and time pressure. Now, do you know

00:45:02.360 --> 00:45:05.000
anyone, Martin, whose workload is so easy and

00:45:05.000 --> 00:45:07.619
their time pressure is so easy that they're

00:45:07.619 --> 00:45:10.059
going to take the hard route and the slow route?

00:45:10.119 --> 00:45:12.039
Or are they going to take the fast route and the

00:45:12.039 --> 00:45:14.760
easy route? Most people are under crazy amounts

00:45:14.760 --> 00:45:18.179
of time pressure. I mean, for me, the overwhelming

00:45:18.179 --> 00:45:22.360
trend of the last 15 years is how much more work

00:45:22.360 --> 00:45:25.599
can a single person do if we just keep throwing

00:45:25.599 --> 00:45:27.760
stuff at them? It's very easy to forget what

00:45:27.760 --> 00:45:30.619
it was like when we didn't have all of our work

00:45:30.989 --> 00:45:33.110
on our mobile phones constantly, and you could

00:45:33.110 --> 00:45:36.389
be checking a Google Doc at 9pm at night and

00:45:36.389 --> 00:45:38.309
rattling off a Slack message. That's not how

00:45:38.309 --> 00:45:41.889
life used to be. And so I can absolutely see

00:45:41.889 --> 00:45:44.449
people defaulting to this, probably because they'll

00:45:44.449 --> 00:45:46.570
be like, I'm sure it will be okay. And in some

00:45:46.570 --> 00:45:50.630
cases, it absolutely will be. But we often get

00:45:50.630 --> 00:45:54.530
asked, especially by parents of kids that are

00:45:54.530 --> 00:45:58.809
maybe like 18 to 25, what should they do, right?

00:45:58.929 --> 00:46:02.260
What should their kids do? And that level of

00:46:02.260 --> 00:46:04.880
critical thinking skill is going to be so, so,

00:46:04.900 --> 00:46:09.940
so important. Your ability to call a model out

00:46:09.940 --> 00:46:12.840
for getting stuff wrong or for being able to

00:46:12.840 --> 00:46:16.099
provide interesting ideas that models wouldn't

00:46:16.099 --> 00:46:18.219
come up with, I think, is where some of the magic

00:46:18.219 --> 00:46:20.659
really lies. And it reminds me of my academic

00:46:20.659 --> 00:46:23.320
career, like working as an academic scientist

00:46:23.320 --> 00:46:25.679
is really thrilling in some ways and brutal in

00:46:25.679 --> 00:46:28.599
others, because basically the training is: let's

00:46:28.599 --> 00:46:31.159
all get very good at pulling each other's work

00:46:31.159 --> 00:46:33.400
apart, because we're in the pursuit of truth here

00:46:33.400 --> 00:46:36.559
and egos can't be involved. We have to destroy

00:46:36.559 --> 00:46:39.659
everybody's work looking for the holes. And of

00:46:39.659 --> 00:46:41.400
course, egos very much are involved because we're

00:46:41.400 --> 00:46:44.460
all humans. Um, and that's not an easy process

00:46:44.460 --> 00:46:46.139
for most people to go through. It certainly wasn't

00:46:46.139 --> 00:46:49.739
easy for me, because every meeting is a bunch

00:46:49.739 --> 00:46:52.679
of smart people trying to find all the gaps. But

00:46:52.679 --> 00:46:54.900
that, for me, is probably one of the most important

00:46:54.900 --> 00:46:57.920
parts of the pursuit of scientific progress.

00:46:58.139 --> 00:47:00.800
It takes a lot of training and a big mindset

00:47:00.800 --> 00:47:04.260
shift and a lot of energy to get your brain to

00:47:04.260 --> 00:47:08.480
operate like that. But I think everyone in a

00:47:08.480 --> 00:47:11.519
business context that uses AI is going to have

00:47:11.519 --> 00:47:15.039
to think like that. Until we are absolutely confident

00:47:15.039 --> 00:47:18.119
that AI tools can validate and critique their

00:47:18.119 --> 00:47:20.699
own outputs better than any human, that is our

00:47:20.699 --> 00:47:23.880
job. And it is honestly a super hard job, as

00:47:23.880 --> 00:47:25.860
you've just described in that peer review process.

00:47:26.440 --> 00:47:29.219
Yes. Well, that's not the only prediction they

00:47:29.219 --> 00:47:32.519
made in this Gartner report. There are a few

00:47:32.519 --> 00:47:36.519
other interesting things as well. And when thinking

00:47:36.519 --> 00:47:40.440
about critiquing and making decisions and using

00:47:40.440 --> 00:47:43.099
good judgment, that's something that people have

00:47:43.099 --> 00:47:46.719
to do in B2B procurement quite often. And one

00:47:46.719 --> 00:47:50.139
of the big predictions that they made is by 2028,

00:47:50.539 --> 00:47:58.099
90%, 90% of B2B buying will be AI agent intermediated,

00:47:58.099 --> 00:48:02.579
pushing over $15 trillion of B2B spend through

00:48:02.579 --> 00:48:06.199
AI agent exchanges. So they say that procurement's

00:48:06.199 --> 00:48:09.500
being reprogrammed, not by policy, but by invisible

00:48:09.500 --> 00:48:11.599
agents. Traditional search engine optimization

00:48:11.599 --> 00:48:15.860
and PPC will give way to agent engine optimization.

00:48:16.239 --> 00:48:19.059
Products will need to be machine readable and

00:48:19.059 --> 00:48:22.280
procurement will shift to efficient, autonomous,

00:48:22.889 --> 00:48:26.289
machine-to-machine transactions. Yeah, this is

00:48:26.289 --> 00:48:27.969
a big deal if you're a marketer. We've talked

00:48:27.969 --> 00:48:30.590
about this before, right? In terms of humans

00:48:30.590 --> 00:48:34.050
marketing to humans, then humans marketing to

00:48:34.050 --> 00:48:36.909
agents, and then agents marketing to agents.

00:48:37.010 --> 00:48:39.150
And what do those shifts look like? And what

00:48:39.150 --> 00:48:42.170
do you need to do? It's super interesting. I

00:48:42.170 --> 00:48:46.090
renegotiated my satellite TV package this week.

00:48:46.670 --> 00:48:50.550
And it won't surprise you to know that I asked

00:48:50.550 --> 00:48:54.309
GPT-5.1 and Gemini 3 to go and find what the

00:48:54.309 --> 00:48:58.449
current prices were, give me a strategy for renegotiating

00:48:58.449 --> 00:49:02.889
my contract, and also little tips and tricks

00:49:02.889 --> 00:49:05.550
to out-negotiate the humans that I was going

00:49:05.550 --> 00:49:08.110
to be dealing with. And I got a really good deal

00:49:08.110 --> 00:49:11.110
on my satellite TV contract. I didn't do much

00:49:11.110 --> 00:49:13.210
of the thinking. Oh, sorry, I outsourced it on

00:49:13.210 --> 00:49:15.710
this occasion. But I was going through the process

00:49:15.710 --> 00:49:19.340
thinking it would be really good if ChatGPT

00:49:19.340 --> 00:49:21.219
could have just called, because I had to spend

00:49:21.219 --> 00:49:23.579
40 minutes on the phone. So it would have been really

00:49:23.579 --> 00:49:25.900
good if ChatGPT could have just called the provider

00:49:25.900 --> 00:49:29.019
and taken care of that for me, which, from a capabilities

00:49:29.019 --> 00:49:33.139
perspective, it has those capabilities now. And

00:49:33.139 --> 00:49:36.179
the reward function is: get the lowest price possible.

00:49:36.179 --> 00:49:39.960
Like, it's not a hard thing to brief. Um, so I was

00:49:39.960 --> 00:49:41.460
like, kind of frustrated that we're not quite

00:49:41.460 --> 00:49:44.780
there yet. But at the same time, if AI followed

00:49:44.780 --> 00:49:47.139
the playbook it gave me, which is all I really

00:49:47.139 --> 00:49:48.889
did, it probably would have got me a really good

00:49:48.889 --> 00:49:51.889
deal as well. So yeah, I think that's definitely

00:49:51.889 --> 00:49:55.070
one for us marketers to chew on, especially in

00:49:55.070 --> 00:49:57.710
a world where, you know, we're all going to have

00:49:57.710 --> 00:49:59.449
personal assistants to take care of this stuff

00:49:59.449 --> 00:50:01.670
for us. Absolutely. And I think the big takeaway

00:50:01.670 --> 00:50:06.190
from this in the near term, in terms of what

00:50:06.190 --> 00:50:08.090
can we do with this information right now, is

00:50:08.090 --> 00:50:10.289
I think marketers need to think about making

00:50:10.289 --> 00:50:14.070
sure that their product documentation is machine

00:50:14.070 --> 00:50:17.070
readable. For me, this is such an important point.

00:50:17.690 --> 00:50:21.010
So much information is still held, particularly

00:50:21.010 --> 00:50:24.650
in the B2B world, on PDFs. And I just think we

00:50:24.650 --> 00:50:27.090
need to make sure that we're transitioning all

00:50:27.090 --> 00:50:30.389
of that content away from PDFs into markdown

00:50:30.389 --> 00:50:33.710
files or some sort of machine readable format

00:50:33.710 --> 00:50:36.010
and making that publicly available so that the

00:50:36.010 --> 00:50:39.650
models can easily access, understand it. You

00:50:39.650 --> 00:50:43.769
know, data that's captured within tables is really

00:50:43.769 --> 00:50:47.739
hard to parse out for a model when it's stored

00:50:47.739 --> 00:50:51.960
in a PDF. But stick that into Markdown, and all

00:50:51.960 --> 00:50:55.460
of a sudden, it's very easy for the AI to understand.

00:50:56.000 --> 00:51:01.780
So without getting too technical, you can quite

00:51:01.780 --> 00:51:04.860
easily start making this transition to be in

00:51:04.860 --> 00:51:09.639
a good position for agentic-assisted B2B procurement

00:51:09.639 --> 00:51:12.840
by just turning your data into Markdown files.

00:51:13.800 --> 00:51:15.780
Yeah, super interesting. There's a couple of

00:51:15.780 --> 00:51:17.980
things while you're talking that spring to mind.

00:51:18.019 --> 00:51:20.860
The first one is there's a continuum of possibilities

00:51:20.860 --> 00:51:27.019
here. One of them is we create information experiences

00:51:27.019 --> 00:51:31.179
for agents. So humans visit websites. But if

00:51:31.179 --> 00:51:35.340
your website has identified that it's a bot from

00:51:35.340 --> 00:51:38.099
ChatGPT, it's actually pushed off to a completely

00:51:38.099 --> 00:51:41.340
separate information repository that it gathers

00:51:41.340 --> 00:51:43.500
the information from. Because websites for humans,

00:51:43.619 --> 00:51:46.380
obviously, in terms of how they're laid out and

00:51:46.380 --> 00:51:47.619
what have you, and PDFs are not particularly

00:51:47.619 --> 00:51:50.179
helpful, and gated PDFs and all this other stuff

00:51:50.179 --> 00:51:53.159
that agents aren't going to be able to access.

00:51:53.460 --> 00:51:55.360
So that's one possibility. The other end of the

00:51:55.360 --> 00:51:58.019
spectrum is computer use becomes so good so quickly

00:51:58.019 --> 00:52:01.400
that if you embark on this digital change project

00:52:01.400 --> 00:52:05.000
to reimagine how you provide your products

00:52:05.000 --> 00:52:06.980
and services to agents, and then, oh crumbs, nine

00:52:06.980 --> 00:52:09.159
months later Opus 5 can actually just browse

00:52:09.159 --> 00:52:12.239
your website and read PDFs and, you know... So it's

00:52:12.239 --> 00:52:15.000
like, this is, I think Ethan Mollick talks about

00:52:15.000 --> 00:52:16.719
this, this is one of those weird technological

00:52:16.719 --> 00:52:20.139
evolutions that's happening so quickly, there

00:52:20.139 --> 00:52:22.659
can sometimes be a strategic advantage to being

00:52:22.659 --> 00:52:25.579
a laggard, right? It's like, let's just let the

00:52:25.579 --> 00:52:27.940
technology catch up, because by the time we've

00:52:27.940 --> 00:52:30.079
put all the effort into this, technology will

00:52:30.079 --> 00:52:31.679
be able to just do it anyway. So it was a wasted

00:52:31.679 --> 00:52:35.579
couple of million dollars on a digital transformation

00:52:35.579 --> 00:52:37.320
project that we didn't need to do. I'm not saying

00:52:37.320 --> 00:52:38.880
that's true, by the way. It's just, that's

00:52:38.880 --> 00:52:42.320
one possibility, right? So I do find myself

00:52:42.320 --> 00:52:45.679
thinking about how proactive do we need to be.

00:52:45.679 --> 00:52:47.659
I mean, do you have an opinion on this, Martin,

00:52:47.659 --> 00:52:50.260
thinking about optimizing for computer use? If

00:52:50.260 --> 00:52:56.320
anything has taught me how difficult computer

00:52:56.320 --> 00:53:00.920
interface design can be, it's being in marketing

00:53:00.920 --> 00:53:04.219
agency world for the past 15 years, doing web

00:53:04.219 --> 00:53:07.900
design projects and trying to make these interfaces

00:53:07.900 --> 00:53:11.519
consistent across different browsers. Oh, it works

00:53:11.519 --> 00:53:14.059
in Chromium, but it doesn't work in Firefox.

00:53:14.059 --> 00:53:17.000
Do you know what? A Markdown file is a Markdown

00:53:17.000 --> 00:53:19.780
file. It contains all of the information. It's

00:53:19.780 --> 00:53:25.860
very simple. And I think taking a, uh, a surefire

00:53:25.860 --> 00:53:28.519
approach to just make sure there is a format

00:53:28.519 --> 00:53:31.219
to get the essential information out there is

00:53:31.219 --> 00:53:37.400
fail-safe. Whilst, yes, Opus might be able

00:53:37.400 --> 00:53:39.360
to use your website in the not too distant future,

00:53:39.539 --> 00:53:43.239
how long till some stupid JavaScript framework

00:53:43.239 --> 00:53:47.440
comes along where it can't do what you want it

00:53:47.440 --> 00:53:49.320
to do? I'll give you a good example, actually.

00:53:49.400 --> 00:53:53.360
When I was using, which browser was it? It was

00:53:53.360 --> 00:53:56.170
Perplexity. It was the Comet browser. And I was

00:53:56.170 --> 00:53:58.190
trying to get it to do a task and it ultimately

00:53:58.190 --> 00:54:01.389
completed the task. I was trying to get it to

00:54:01.389 --> 00:54:05.809
organize an inbox using Outlook online. And it

00:54:05.809 --> 00:54:07.909
was able to do nearly everything except there's an

00:54:07.909 --> 00:54:11.309
interface element, which is right click. But

00:54:11.309 --> 00:54:14.449
the model can't right click. It can only have

00:54:14.449 --> 00:54:18.989
a left click select. So this particular function

00:54:18.989 --> 00:54:20.730
that needed you to be able to right click to

00:54:20.730 --> 00:54:23.460
add a new folder, it couldn't do. And it had to

00:54:23.460 --> 00:54:26.159
find another way around, which it did successfully

00:54:26.159 --> 00:54:30.019
do. But if that other route didn't exist, that

00:54:30.019 --> 00:54:33.760
interface is bust, right? And I think the web and

00:54:33.760 --> 00:54:38.400
UIs are full of these edge cases, which will ultimately

00:54:38.400 --> 00:54:41.820
catch computer use agents out from time to time.

00:54:41.820 --> 00:54:45.400
Yeah, I mean, I completely agree. But I

00:54:45.400 --> 00:54:46.920
think it's probably inherently quite fixable

00:54:46.920 --> 00:54:50.429
as well. Um, but yeah, I think it's going to be

00:54:50.429 --> 00:54:52.210
interesting. I mean, one experiment I would love

00:54:52.210 --> 00:54:54.750
to run, if I had a product-based business like

00:54:54.750 --> 00:54:56.289
an e-commerce-based business, I'd probably

00:54:56.289 --> 00:54:59.110
look at my products that get the most web traffic

00:54:59.110 --> 00:55:02.750
or my most successful products. And I'd experiment

00:55:02.750 --> 00:55:07.610
with trying to find a way of providing that type

00:55:07.610 --> 00:55:10.349
of easily parsable, Markdown-driven information,

00:55:10.590 --> 00:55:13.630
perhaps on a product page in a way that encouraged

00:55:13.630 --> 00:55:18.469
an AI crawler to use that information in a

00:55:18.469 --> 00:55:21.190
way that a human wouldn't. Like, I don't know

00:55:21.190 --> 00:55:24.949
if you could just hyperlink in the text, "here is

00:55:24.949 --> 00:55:27.110
AI-readable Markdown", or just something that a

00:55:27.110 --> 00:55:28.849
human wouldn't click or if they did, like, fine,

00:55:28.989 --> 00:55:31.909
but it won't be much use to them. I would expect

00:55:31.909 --> 00:55:36.869
that it will become the norm within a CMS, actually,

00:55:37.050 --> 00:55:41.789
to almost like in email marketing systems where

00:55:41.789 --> 00:55:44.670
you have the plain text version, where you'll

00:55:44.670 --> 00:55:48.269
create a Markdown version of a web page, and that

00:55:48.269 --> 00:55:52.750
will exist as a kind of parallel page that you

00:55:52.750 --> 00:55:56.829
never see. But in the page schema, like in the

00:55:56.829 --> 00:56:00.190
metadata of the page, it gives a link to the Markdown

00:56:00.190 --> 00:56:03.349
version of that page. And that's how I would imagine

00:56:03.349 --> 00:56:06.230
CMSs in the future will operate. They'll create

00:56:06.230 --> 00:56:10.070
this other version in the background, and the

00:56:10.070 --> 00:56:12.369
crawler will go to the page, it will see the schema

00:56:12.369 --> 00:56:14.130
and go, right, that's where I need to look for

00:56:14.130 --> 00:56:15.989
the information, rather than parsing through all

00:56:15.989 --> 00:56:18.630
of this HTML. That makes perfect sense, especially

00:56:18.630 --> 00:56:20.809
given, I can't remember what it's called, and this

00:56:20.809 --> 00:56:22.449
is baked into platforms like HubSpot because

00:56:22.449 --> 00:56:24.150
it's fairly old school at this point, but the

00:56:24.150 --> 00:56:26.570
ability to show a mobile-optimized version of

00:56:26.570 --> 00:56:28.809
a page that's completely stripped down. I can't

00:56:28.809 --> 00:56:30.670
remember what it's called now. Yeah, AMP, right.

00:56:30.670 --> 00:56:33.530
So HubSpot will do that automatically for all

00:56:33.530 --> 00:56:36.750
of your blog posts, serve an AMP version if someone's

00:56:36.750 --> 00:56:38.730
on a mobile and their speed is slow or whatever.

00:56:38.730 --> 00:56:41.989
Um, I think you're absolutely right. I can

00:56:41.989 --> 00:56:44.269
absolutely imagine that happening. Makes perfect

00:56:44.269 --> 00:56:46.769
sense to do so. I think the other thing it may

00:56:46.769 --> 00:56:50.449
force us all to do, and this harks back to a friend

00:56:50.449 --> 00:56:54.610
of ours, Marcus Sheridan, and his They Ask,

00:56:54.610 --> 00:56:57.269
You Answer methodology for inbound marketing,

00:56:57.269 --> 00:57:00.409
and for those that are not familiar with Marcus

00:57:00.409 --> 00:57:03.659
and his work, in essence, I'd summarize it as

00:57:03.659 --> 00:57:06.480
provide your buyer with the information they

00:57:06.480 --> 00:57:08.619
need to make a decision. Don't hold too much

00:57:08.619 --> 00:57:10.380
back. And one of his big things is providing

00:57:10.380 --> 00:57:13.380
pricing information. Now, if you sell widgets,

00:57:13.659 --> 00:57:15.579
you probably do provide pricing information.

00:57:15.699 --> 00:57:18.019
But there's lots of things in the B2B marketing

00:57:18.019 --> 00:57:20.619
world and sales world where pricing isn't available

00:57:20.619 --> 00:57:22.960
on the Internet. So the question just becomes,

00:57:23.139 --> 00:57:27.349
how will an AI agent handle not being able to

00:57:27.349 --> 00:57:29.710
gather pricing information and compare that

00:57:29.710 --> 00:57:32.409
information for different services. And if you

00:57:32.409 --> 00:57:36.909
are the one website, backend, markdown files,

00:57:37.030 --> 00:57:38.690
whatever that does provide that information,

00:57:39.010 --> 00:57:41.650
will the agent be more likely to lean into you

00:57:41.650 --> 00:57:44.170
as a recommendation because it can actually give

00:57:44.170 --> 00:57:47.530
pricing recommendations to the human that asked

00:57:47.530 --> 00:57:51.630
for the purchase report, or able to... Because if

00:57:51.630 --> 00:57:53.610
you're an agent, you cannot make a purchase if

00:57:53.610 --> 00:57:55.679
you don't know the price. And if you're an agent

00:57:55.679 --> 00:57:57.619
that's optimized to make purchases, does that

00:57:57.619 --> 00:58:00.219
mean that you just buy from the one website that

00:58:00.219 --> 00:58:03.539
has pricing information or easy agent transaction

00:58:03.539 --> 00:58:06.420
capability? All of these things are like predator-prey

00:58:06.420 --> 00:58:09.420
relationships, in that there will be an arbitrage

00:58:09.420 --> 00:58:12.480
opportunity that lasts for X amount of time,

00:58:12.599 --> 00:58:14.599
where you're the website that gave pricing in

00:58:14.599 --> 00:58:16.440
Markdown, and you're the website that made it

00:58:16.440 --> 00:58:19.420
easy for an agent to transact, and you absolutely

00:58:19.420 --> 00:58:22.239
cleaned up for the two months that it took for

00:58:22.239 --> 00:58:23.659
the rest of the world to figure out that they

00:58:23.659 --> 00:58:26.440
need to do that so that they can also be included

00:58:26.440 --> 00:58:28.500
in those agent purchases. So there's probably

00:58:28.500 --> 00:58:31.480
quite a lot of sort of war gaming to be done

00:58:31.480 --> 00:58:34.320
in this area to think about how do you stay at

00:58:34.320 --> 00:58:36.159
the forefront? And is it commercially viable?

00:58:36.260 --> 00:58:37.880
Like if you're selling two things a month, probably

00:58:37.880 --> 00:58:40.199
isn't. But if you're like shifting a load of

00:58:40.199 --> 00:58:44.119
product, like in our industry, you're a consumable

00:58:44.119 --> 00:58:46.599
and reagent company, you might be selling hundreds

00:58:46.599 --> 00:58:50.320
of thousands of vials per day. Absolutely worth

00:58:50.320 --> 00:58:52.179
looking at that because it could add up to quite

00:58:52.179 --> 00:58:54.480
a lot of revenue. I guess somewhat related to

00:58:54.480 --> 00:58:58.760
this is some of the changes we're seeing in Google

00:58:58.760 --> 00:59:01.940
AI Overviews and potentially in ChatGPT. So

00:59:01.940 --> 00:59:05.320
Google's starting to place ads in their AI overviews,

00:59:05.320 --> 00:59:08.440
Martin. Did you see this news? Yeah, I saw it

00:59:08.440 --> 00:59:14.260
and it seems quite linked to the Gemini 3 release

00:59:14.260 --> 00:59:19.349
as well, because they've also announced that, as

00:59:19.349 --> 00:59:24.289
part of search responses, Gemini 3 will actually

00:59:24.289 --> 00:59:28.510
now create entire interfaces to explain concepts

00:59:28.510 --> 00:59:31.969
to people as well, which is a fascinating change.

00:59:31.969 --> 00:59:35.230
So I think that the search experience is evolving

00:59:35.230 --> 00:59:38.130
at a pace. But yeah, interesting to see that we're

00:59:38.130 --> 00:59:42.260
seeing the first sponsored links inserted into

00:59:42.260 --> 00:59:46.539
AI generated search results of AI overviews.

00:59:46.960 --> 00:59:51.760
So this is the first monetization strategy that

00:59:51.760 --> 00:59:56.820
we've seen for the AI search experience. And

00:59:56.820 --> 00:59:59.300
I think it's fair to say that we'll see more

00:59:59.300 --> 01:00:04.079
of this inserted into AI mode, as well as AI

01:00:04.079 --> 01:00:08.559
overviews, which are two distinct product surfaces

01:00:09.320 --> 01:00:12.360
within the search experience. It's quite an interesting

01:00:12.360 --> 01:00:16.619
one. It's absolutely predictable. How it impacts

01:00:16.619 --> 01:00:18.519
on user experience is going to be interesting.

01:00:18.519 --> 01:00:21.480
So I'll tell you a little story. We're doing a house

01:00:21.480 --> 01:00:25.139
renovation at the moment, and I needed to buy,

01:00:25.139 --> 01:00:27.460
well, because we've got a very old house and it's

01:00:27.460 --> 01:00:29.320
not all served by the central heating system,

01:00:29.320 --> 01:00:33.739
so I needed to buy an electric radiator. Not something

01:00:33.739 --> 01:00:35.480
that you do very often, not something I know much

01:00:35.480 --> 01:00:37.739
about But I knew what I didn't want, which was

01:00:37.739 --> 01:00:41.539
a very modern or kind of crappy looking radiator

01:00:41.539 --> 01:00:43.880
because our house is quite old and wouldn't be

01:00:43.880 --> 01:00:46.599
in keeping. And I'd managed to find these cast

01:00:46.599 --> 01:00:49.539
iron radiators that were electric and oil filled.

01:00:49.860 --> 01:00:52.260
And they look beautiful. They cost an absolute

01:00:52.260 --> 01:00:55.179
fortune. I was like, can't justify it. So I asked

01:00:55.179 --> 01:00:57.500
Gemini 3 to go find me something similar that

01:00:57.500 --> 01:01:00.480
cost a lot less. And I'd done my own Googling.

01:01:00.780 --> 01:01:03.199
And it came back with three different makes that

01:01:03.199 --> 01:01:05.409
I hadn't seen that looked basically identical,

01:01:05.570 --> 01:01:09.590
but cost half the price. And we'd been having

01:01:09.590 --> 01:01:12.289
a conversation about the room size and what type

01:01:12.289 --> 01:01:14.530
of size of radiator, what type of radiator would

01:01:14.530 --> 01:01:17.070
you recommend? And it came back with these three

01:01:17.070 --> 01:01:19.449
and it gave me the pros and cons of each. And

01:01:19.449 --> 01:01:22.510
then I said, what would you buy? And it told

01:01:22.510 --> 01:01:24.809
me which one it would buy. And it had been so

01:01:24.809 --> 01:01:27.690
darn logical to that point, I bought the one

01:01:27.690 --> 01:01:31.550
it said. So it raises the question, is it better

01:01:31.550 --> 01:01:36.199
to have an overt system that says, here's a sponsored

01:01:36.199 --> 01:01:37.960
link for a thing you could buy that's kind of

01:01:37.960 --> 01:01:39.900
similar to what you've been chatting with me

01:01:39.900 --> 01:01:43.840
about. Because I just, I kind of trust Gemini

01:01:43.840 --> 01:01:46.239
3. And so I just bought the one it said. And

01:01:46.239 --> 01:01:47.800
at the moment, that's because I suspect there

01:01:47.800 --> 01:01:50.340
are no incentives for Google as to what product

01:01:50.340 --> 01:01:52.579
it recommends. And if I thought there were, maybe

01:01:52.579 --> 01:01:54.960
I wouldn't have been so gullible. But as it stands

01:01:54.960 --> 01:01:57.320
right now, I just trusted Gemini and bought the

01:01:57.320 --> 01:02:00.039
one that it said. And so that's kind of interesting,

01:02:00.139 --> 01:02:03.199
right? Like where... How important will it be

01:02:03.199 --> 01:02:06.159
that we actually have an understanding of how

01:02:06.159 --> 01:02:09.960
these recommendations get surfaced? Will it influence

01:02:09.960 --> 01:02:12.940
our buying behavior? Because I was so much more

01:02:12.940 --> 01:02:15.000
likely to do that than click on a sponsored link

01:02:15.000 --> 01:02:17.579
and buy the sponsored link. So I don't know how

01:02:17.579 --> 01:02:19.820
this is all going to play out. Yeah, that is

01:02:19.820 --> 01:02:23.719
a fascinating development and goes to show how

01:02:23.719 --> 01:02:27.320
ads really change the experience. This was something

01:02:27.320 --> 01:02:31.179
that ChatGPT were quite clear on for a long

01:02:31.179 --> 01:02:33.719
time, saying that they weren't sure about putting

01:02:33.719 --> 01:02:36.260
ads into the experience because of the way that

01:02:36.260 --> 01:02:39.739
people use it, and they want impartial answers,

01:02:39.739 --> 01:02:44.239
they want to trust the model outputs. Yet we've

01:02:44.239 --> 01:02:48.199
seen this week speculation arising that ChatGPT

01:02:48.199 --> 01:02:51.860
is about to get its own ad network and start

01:02:51.860 --> 01:02:56.099
serving ads in there as well. So yeah, from a marketer's

01:02:56.099 --> 01:02:59.530
perspective, new ways to reach audiences, right?

01:02:59.530 --> 01:03:03.230
From an audience perspective, oh, is this going

01:03:03.230 --> 01:03:06.110
to change the relationship that you have with

01:03:06.110 --> 01:03:09.269
your chat assistant? Right, if you had a friend

01:03:09.269 --> 01:03:13.590
who was, like, into tech, that you... I've

01:03:13.590 --> 01:03:15.730
got one, his name's Martin, and he always tries

01:03:15.730 --> 01:03:17.809
to convince me to buy lifetime deals, and then

01:03:17.809 --> 01:03:20.389
the software that we buy stops being

01:03:20.389 --> 01:03:23.130
supported three months later. But, um, if you knew

01:03:23.130 --> 01:03:26.369
that your friend had, like... Sorry, can I just say

01:03:26.369 --> 01:03:30.510
that feels a bit on the nose, Paul. I feel

01:03:30.510 --> 01:03:35.269
outed. I think it would be fair to say you had

01:03:35.269 --> 01:03:38.909
a lifetime deal addiction that you then gave

01:03:38.909 --> 01:03:43.349
to me. And I feel partially cured now. I have

01:03:43.349 --> 01:03:45.769
been burnt a few times. But yeah, if you had

01:03:45.769 --> 01:03:47.969
a friend who you knew was on commission for every

01:03:47.969 --> 01:03:49.670
piece of advice they ever gave you, you probably

01:03:49.670 --> 01:03:51.880
wouldn't be your friend anymore. Right, because

01:03:51.880 --> 01:03:53.780
that would be a very weird relationship to have.

01:03:53.900 --> 01:03:56.139
And people have these weird relationships with

01:03:56.139 --> 01:03:58.360
their AIs. Do they want them selling to them?

01:03:58.460 --> 01:04:00.380
I think you're absolutely right. I think it changes

01:04:00.380 --> 01:04:03.940
the experience a lot. I mean, with OpenAI's release

01:04:03.940 --> 01:04:06.539
of the shopping research feature, where does

01:04:06.539 --> 01:04:10.039
that sit? Is it impartial advice or is there

01:04:10.039 --> 01:04:13.880
like incentives baked into that? It hasn't been

01:04:13.880 --> 01:04:16.739
made clear as far as I can tell. And we kind

01:04:16.739 --> 01:04:21.420
of probably do need to know. I think it's a shame

01:04:21.420 --> 01:04:23.880
because I'm quite liking this moment in time

01:04:23.880 --> 01:04:28.099
where I can ask AI for purchase advice and it

01:04:28.099 --> 01:04:30.039
gives me great advice. Another thing it found,

01:04:30.139 --> 01:04:33.679
this is probably too much information, but because

01:04:33.679 --> 01:04:36.059
I've got this bad back and it's causing me to

01:04:36.059 --> 01:04:38.099
get calf strains all the time. Weird, right?

01:04:38.420 --> 01:04:42.960
So it recommended that I get these sliders. They're

01:04:42.960 --> 01:04:45.480
like slippers that have this amazing cushioning

01:04:45.480 --> 01:04:47.079
on the bottom that I would never have found myself.

01:04:47.219 --> 01:04:48.880
It wouldn't have even occurred to me to buy these.

01:04:49.039 --> 01:04:51.960
I bought them. They're like walking on

01:04:51.960 --> 01:04:53.980
air. Like, I don't want to wear normal shoes

01:04:53.980 --> 01:04:56.280
anymore. I would never have bought that. I would

01:04:56.280 --> 01:04:58.800
never have thought of it. Jury's out on whether

01:04:58.800 --> 01:05:00.679
or not it's helping with my biomechanical issues

01:05:00.679 --> 01:05:03.159
but it successfully sold me on this pair

01:05:03.159 --> 01:05:07.480
of shoes that I absolutely adore now. That's

01:05:07.480 --> 01:05:09.840
a nice experience. Once it gets tainted with

01:05:09.840 --> 01:05:11.960
all of this advertising and do I trust it or

01:05:11.960 --> 01:05:15.059
not, I'm back to having to think for myself again.

01:05:15.659 --> 01:05:18.760
It's an age-old problem. A technology or

01:05:19.050 --> 01:05:23.409
some new thing is launched into the world and

01:05:23.409 --> 01:05:26.510
we go, this is great. And then the marketers

01:05:26.510 --> 01:05:31.409
come along and ruin it. Oh, therein lies a truth,

01:05:31.570 --> 01:05:34.070
I think. But yeah, I guess from a marketing perspective,

01:05:34.369 --> 01:05:36.269
seeing as we are those marketers and many of

01:05:36.269 --> 01:05:39.570
our listeners are also, keeping tabs on how this

01:05:39.570 --> 01:05:42.409
evolves and how you can best reach your audience

01:05:42.409 --> 01:05:46.789
is critical. And a little bit like SEO, and we've

01:05:46.789 --> 01:05:48.429
talked about this a bit in the past as well,

01:05:48.590 --> 01:05:51.530
there's going to be ways to have your brand show

01:05:51.530 --> 01:05:54.989
up in AI search results, creating content, being

01:05:54.989 --> 01:05:57.650
an authority, having your brand mentioned favorably

01:05:57.650 --> 01:06:00.869
in lots of different areas like reviews and earned

01:06:00.869 --> 01:06:04.250
media placements and what have you. Or you can

01:06:04.250 --> 01:06:07.670
just buy your way to the top if you have a business

01:06:07.670 --> 01:06:09.630
model whereby you can still make good margins

01:06:09.630 --> 01:06:11.530
if you're paying to reach that audience. And

01:06:11.530 --> 01:06:13.849
it does feel like we're entering that time of

01:06:14.440 --> 01:06:18.159
first-mover advantage, figuring out how to leverage

01:06:18.159 --> 01:06:22.139
AI assistants to get in front of audiences where

01:06:22.139 --> 01:06:25.260
the prices are reasonable. I suspect that that

01:06:25.260 --> 01:06:28.940
is another short-lived arbitrage. I think once

01:06:28.940 --> 01:06:30.219
these become available, I don't think there's

01:06:30.219 --> 01:06:32.219
anybody who's going to be like they were with

01:06:32.219 --> 01:06:34.440
Google search ads. No one's sleeping on this.

01:06:34.679 --> 01:06:36.139
We've talked about this, and that's why there's

01:06:36.139 --> 01:06:39.079
such an obsession with answer engine optimization,

01:06:39.460 --> 01:06:41.820
because people are desperate to be at the forefront

01:06:41.820 --> 01:06:43.599
of whatever this will be. And it'll be an arbitrage

01:06:43.599 --> 01:06:45.699
for about two weeks, probably, but definitely

01:06:45.699 --> 01:06:47.639
something we all need to keep an eye on. Well,

01:06:47.840 --> 01:06:51.599
that neatly brings us on to our final story of

01:06:51.599 --> 01:06:55.179
the week, which is about Adobe buying SEMrush

01:06:55.179 --> 01:06:59.820
for $1.9 billion, merging search engine optimization

01:06:59.820 --> 01:07:06.639
and creative AI and AEO or GEO or whatever you

01:07:06.639 --> 01:07:10.659
want to call it, all into one platform. What

01:07:10.659 --> 01:07:14.400
do you think of this? So we are, one of the platforms

01:07:14.400 --> 01:07:17.260
we use is SEMrush. And my first thought was,

01:07:17.440 --> 01:07:20.780
doesn't Adobe often buy things and break them?

01:07:20.960 --> 01:07:24.179
And I was sad because I've invested time in setting

01:07:24.179 --> 01:07:27.420
up campaigns and, you know, projects and stuff,

01:07:27.420 --> 01:07:30.280
um, and I don't want the tool to suddenly become

01:07:30.280 --> 01:07:33.619
10 times more expensive or break a lot because

01:07:33.619 --> 01:07:36.599
now it's been wrapped into some other thing that

01:07:36.599 --> 01:07:38.639
has a bunch of capabilities that perhaps I didn't

01:07:38.639 --> 01:07:41.559
need. So if I'm honest, my first thought was a

01:07:41.559 --> 01:07:45.019
bit selfish. My second thought was, if they've

01:07:45.019 --> 01:07:48.119
mostly done this for AEO and they've spent

01:07:48.119 --> 01:07:52.440
all that money, then it has to be clearly the thing

01:07:52.440 --> 01:07:54.920
that's going to be really important, or they

01:07:54.920 --> 01:07:57.500
think their audience, their customers, think it's

01:07:57.500 --> 01:08:01.320
really important to spend so much cash. But what

01:08:01.320 --> 01:08:03.960
did you think? Not a user of the product, but

01:08:03.960 --> 01:08:06.199
I certainly saw the same reaction on LinkedIn

01:08:06.199 --> 01:08:08.260
of people going, oh, they're going to make it

01:08:08.260 --> 01:08:12.199
rubbish now. So, yeah, I think that's a valid

01:08:12.199 --> 01:08:16.880
concern shared by many users. I do think it validates

01:08:16.880 --> 01:08:20.399
the idea of generative engine optimization. If

01:08:20.399 --> 01:08:23.090
they're moving into this space with a two

01:08:23.090 --> 01:08:26.289
billion dollar acquisition, AI search clearly

01:08:26.289 --> 01:08:29.390
isn't just a side project anymore. And I think

01:08:29.390 --> 01:08:33.550
marketers should be investing more into this

01:08:33.550 --> 01:08:37.989
now. We know that AI search is still a relatively

01:08:37.989 --> 01:08:41.470
small part of the overall market. Google's own

01:08:41.470 --> 01:08:45.449
data talks a lot about how they've got more people

01:08:45.449 --> 01:08:49.750
using search more frequently, and actually engagement

01:08:50.430 --> 01:08:53.189
in the search experience is increasing, with people

01:08:53.189 --> 01:08:56.010
asking longer, more complex questions in there.

01:08:56.010 --> 01:09:01.510
So I guess part of my thought process is, yes,

01:09:01.510 --> 01:09:05.810
GEO, or generative engine optimization, is on the

01:09:05.810 --> 01:09:08.430
rise and something that we should be doing, but

01:09:08.430 --> 01:09:11.489
what are the generative engines that we should

01:09:11.489 --> 01:09:16.050
be paying attention to? And is it going to end

01:09:16.050 --> 01:09:19.840
up just being Google? Are they still going to be the

01:09:19.840 --> 01:09:23.239
big player, whether it's Gemini, AI Mode,

01:09:23.659 --> 01:09:28.000
AI Overviews or the classic SERPs window? Do

01:09:28.000 --> 01:09:32.060
all paths ultimately end up leading to Google

01:09:32.060 --> 01:09:36.020
anyway? Or is ChatGPT going to actually make

01:09:36.020 --> 01:09:39.220
a significant inroad? It's such an interesting

01:09:39.220 --> 01:09:43.199
question because why did, I'm not an expert in

01:09:43.199 --> 01:09:45.420
this, but why did Google win the search engine

01:09:45.420 --> 01:09:50.029
wars? And we talked about this previously, how

01:09:50.029 --> 01:09:52.409
I probably completely misrepresented

01:09:52.409 --> 01:09:57.409
a fact about how Google could correlate a 100

01:09:57.409 --> 01:10:01.090
millisecond increase in delivering the speed

01:10:01.090 --> 01:10:05.390
of the result to X, Y, Z of traffic or clicks

01:10:05.390 --> 01:10:08.670
or conversions on ads and what have you. And

01:10:08.670 --> 01:10:12.409
a really simple, easy-to-use web page that did

01:10:12.409 --> 01:10:14.630
one thing. You searched and it found your information

01:10:14.630 --> 01:10:18.390
on the web. Contrast that with how, like, Yahoo

01:10:18.390 --> 01:10:20.649
search looked in the day. It was just full of

01:10:20.649 --> 01:10:22.989
widgets everywhere and, you know, it'd probably,

01:10:22.989 --> 01:10:24.989
like, make you feel ill just to look at it if

01:10:24.989 --> 01:10:27.229
you tried to look at it now. Um, so I think it

01:10:27.229 --> 01:10:29.149
will come down to user experience. But where I

01:10:29.149 --> 01:10:32.170
think that's really super interesting is user

01:10:32.170 --> 01:10:36.989
experience in this sense is very broad. Let's

01:10:36.989 --> 01:10:39.750
imagine that, for whatever reason, I'm a ChatGPT

01:10:39.750 --> 01:10:42.609
user and I give it access to loads of stuff about

01:10:42.609 --> 01:10:45.930
me. Like, I want to use the memory feature. I give

01:10:45.930 --> 01:10:48.729
it access to my email and my calendar. It somehow

01:10:48.729 --> 01:10:51.449
makes its way into my messaging app, WhatsApp,

01:10:51.529 --> 01:10:54.369
whatever. And it knows a ton about me. Maybe

01:10:54.369 --> 01:10:56.090
it knows my Amazon purchase history, whatever.

01:10:56.529 --> 01:11:00.630
Its ability to surface things that are genuinely

01:11:00.630 --> 01:11:03.590
valuable to me, maybe paid-for placements, but

01:11:03.590 --> 01:11:06.510
still things I would actually buy and get value

01:11:06.510 --> 01:11:10.010
from and want. That's the type of experience

01:11:10.010 --> 01:11:13.060
that we're looking for now, right? It isn't just

01:11:13.060 --> 01:11:15.760
about surfacing information. It's about bridging

01:11:15.760 --> 01:11:18.880
that gap between almost predictive purchase behavior

01:11:18.880 --> 01:11:21.899
in a way that we haven't been able to, at least

01:11:21.899 --> 01:11:23.960
to my knowledge, really been able to do through

01:11:23.960 --> 01:11:28.199
any cookie-based tracking or automation platforms.

01:11:28.199 --> 01:11:31.340
Like, to get the most access and benefit out of

01:11:31.340 --> 01:11:33.779
our assistants, we're probably going to let them

01:11:33.779 --> 01:11:38.350
know a lot about us. And I think that's probably

01:11:38.350 --> 01:11:40.729
going to be as big a differentiator to the user

01:11:40.729 --> 01:11:44.029
experience as how messy is the UI and all this

01:11:44.029 --> 01:11:46.770
other stuff. So I don't know. I mean, Google's

01:11:46.770 --> 01:11:48.369
well placed because it already has Gmail and

01:11:48.369 --> 01:11:49.810
Calendar and it has a lot of stuff that people

01:11:49.810 --> 01:11:51.729
already use for this type of stuff. And obviously

01:11:51.729 --> 01:11:53.310
it has a lot of data on a lot of people already.

01:11:53.710 --> 01:11:56.130
What are your thoughts? Yeah, I think the more

01:11:56.130 --> 01:11:58.550
the systems know about us, the better. And actually,

01:11:58.630 --> 01:12:01.069
it's one of the reasons that ChatGPT remains

01:12:01.069 --> 01:12:05.529
my daily driver as a chat assistant is because...

01:12:05.949 --> 01:12:08.850
actually, the memory function works really well

01:12:08.850 --> 01:12:12.550
and there's some features in it that just make

01:12:12.550 --> 01:12:17.210
the experience feel more useful as a daily driver.

01:12:17.210 --> 01:12:20.909
When I need specialist tasks completing, like

01:12:20.909 --> 01:12:23.789
if I'm doing some coding or vibe coding,

01:12:23.789 --> 01:12:27.609
I'm going to go to Claude. Um, for certain

01:12:27.609 --> 01:12:30.789
applications, I find Gemini 3 is fantastic

01:12:30.789 --> 01:12:34.409
and I go to that. But the tool that I use, I would

01:12:34.409 --> 01:12:39.409
say for 80% of my prompting, is still ChatGPT

01:12:39.409 --> 01:12:42.250
because they've just got something dialed in.

01:12:42.310 --> 01:12:45.149
I guess the question is how much of this tool

01:12:45.149 --> 01:12:48.909
use is actually rolled out across the population

01:12:48.909 --> 01:12:53.050
at this point. I was with someone last week who

01:12:53.050 --> 01:12:55.590
just got a new phone and we were talking about

01:12:55.590 --> 01:12:58.729
Gemini. They'd never used Gemini before in their

01:12:58.729 --> 01:13:02.270
life. So I told them to open it up and we just

01:13:02.270 --> 01:13:05.460
tried a few prompts. And all of a sudden they

01:13:05.460 --> 01:13:07.939
were an AI user, but they'd never used it before.

01:13:08.020 --> 01:13:12.399
And it wasn't on their radar at all. Google stands

01:13:12.399 --> 01:13:16.859
to win quite significantly if they can get that

01:13:16.859 --> 01:13:20.640
audience shifted onto using their platform. And

01:13:20.640 --> 01:13:22.300
I think with the deals that they're doing with

01:13:22.300 --> 01:13:26.899
Apple through the iPhone tie-up and obviously

01:13:26.899 --> 01:13:31.020
the market coverage that they have through Android,

01:13:31.940 --> 01:13:36.180
they're really, really well placed to just

01:13:36.180 --> 01:13:38.579
clean up here. And they've run this playbook before,

01:13:38.579 --> 01:13:41.100
right? They pay billions of dollars a year to

01:13:41.100 --> 01:13:43.899
Apple to have Chrome or Google search be, like,

01:13:43.899 --> 01:13:45.920
not Chrome, sorry, but Google search to

01:13:45.920 --> 01:13:49.439
be the default search on, you know, on, um, Apple

01:13:49.439 --> 01:13:52.319
devices and what have you. So yeah, I think

01:13:52.319 --> 01:13:54.899
that's key. I also think, just while I was listening

01:13:54.899 --> 01:13:57.359
to you speak there, that if you think about a

01:13:57.359 --> 01:14:00.039
cookieless world, you know, that didn't actually

01:14:00.039 --> 01:14:03.020
come to pass quite as it was expected to do, um,

01:14:03.020 --> 01:14:05.619
but working in a world of first-party data,

01:14:05.619 --> 01:14:08.960
and maybe it's harder to track people, you know,

01:14:08.960 --> 01:14:13.020
finding a way to have your audience, like, volunteer

01:14:13.020 --> 01:14:15.279
loads of information about themselves and then

01:14:15.279 --> 01:14:17.640
be extremely happy to do it. Like, there are people

01:14:17.640 --> 01:14:21.640
talking with ChatGPT about, um, I don't know, dealing

01:14:21.640 --> 01:14:23.960
with relationship issues with their friends and

01:14:23.960 --> 01:14:27.640
partners, talking about, like, their illnesses, feeling,

01:14:27.640 --> 01:14:30.520
I don't know, depressed and suicidal, or, there's

01:14:30.520 --> 01:14:32.920
a person at work I really like and how can I

01:14:32.920 --> 01:14:35.140
get them to like me. Like, the amount of personal

01:14:35.140 --> 01:14:37.060
information that people are sharing with these

01:14:37.060 --> 01:14:40.460
tools is, like, way beyond anything that you would

01:14:40.460 --> 01:14:43.779
ever have done, you know, in a browser. And all

01:14:43.779 --> 01:14:46.920
of the AI companies are launching their own browsers

01:14:46.920 --> 01:14:50.800
as well. So I really think that will be the secret,

01:14:50.800 --> 01:14:53.119
is trying to unlock the promise. I mean, I remember,

01:14:53.640 --> 01:14:54.880
I can't remember what the book's called now,

01:14:54.960 --> 01:14:57.659
but one of the examples it gives for where AI

01:14:57.659 --> 01:15:01.880
could go is Amazon's... And the premise here

01:15:01.880 --> 01:15:05.399
is Amazon invested in drone technology because

01:15:05.399 --> 01:15:07.520
they wondered if there would be a time when AI

01:15:07.520 --> 01:15:10.539
could predict your purchase needs so well that

01:15:10.539 --> 01:15:14.739
if you could make delivery cost almost close

01:15:14.739 --> 01:15:17.859
to zero, you could have a drone turn up at someone's

01:15:17.859 --> 01:15:19.859
house with something that they didn't even know

01:15:19.859 --> 01:15:22.460
they needed. And they'd just take it. They'd

01:15:22.460 --> 01:15:25.000
be like, oh, cheers, Amazon drone, for... Yeah,

01:15:25.000 --> 01:15:26.899
it's funny you say that, the toothpaste did run

01:15:26.899 --> 01:15:30.739
out exactly today. Well played, you. And if half

01:15:30.739 --> 01:15:32.619
of that package you didn't want and the drone

01:15:32.619 --> 01:15:34.600
took it back, if you got the cost of delivery

01:15:34.600 --> 01:15:36.720
down to near zero, it didn't matter. But I think

01:15:36.720 --> 01:15:38.359
one of the reasons that that hasn't come to pass

01:15:38.359 --> 01:15:40.000
and not just because of drone technology is because

01:15:40.000 --> 01:15:43.020
predicting what people want is actually still

01:15:43.020 --> 01:15:46.100
very hard. But this could be one of those things

01:15:46.100 --> 01:15:48.140
that makes it much easier to predict what people

01:15:48.140 --> 01:15:51.500
want. And I think, again, that's where the

01:15:51.500 --> 01:15:54.420
winner comes in, is that predictive ability, not

01:15:54.420 --> 01:15:58.939
just the UI and all of this other stuff. So I

01:15:58.939 --> 01:16:01.140
don't know, in B2B marketing it's probably

01:16:01.140 --> 01:16:02.619
not something we have to worry about huge amounts,

01:16:02.619 --> 01:16:04.380
but if you're in B2C, you should be thinking about

01:16:04.380 --> 01:16:06.659
this. How long do you think it will be before

01:16:06.659 --> 01:16:08.859
the AI assistants, you know, at the moment they

01:16:08.859 --> 01:16:12.279
all have that, um, they've got that trait where

01:16:12.279 --> 01:16:13.939
at the end they'll say, would you like me to turn

01:16:13.939 --> 01:16:17.970
this into a convenient PDF handout for you? Or

01:16:17.970 --> 01:16:19.670
would you like me to turn it into a table for

01:16:19.670 --> 01:16:21.109
you? Or would you like me to do something else?

01:16:21.109 --> 01:16:23.270
How long is it till they start having conversations

01:16:23.270 --> 01:16:25.850
with you and just start randomly going, so, how

01:16:25.850 --> 01:16:30.149
much milk have you got left in the fridge? Yeah,

01:16:30.149 --> 01:16:34.569
because your fridge said that you're nearly out,

01:16:34.569 --> 01:16:38.229
but it is, uh, it is a brand that I don't trust

01:16:38.229 --> 01:16:40.789
because it's got poor cameras inside

01:16:40.789 --> 01:16:42.949
it. So I think it'll definitely happen. I also

01:16:42.949 --> 01:16:45.149
think there's a world where there's the buying

01:16:45.149 --> 01:16:47.729
committee of humans. The complex B2B purchase

01:16:47.729 --> 01:16:50.409
is a buying committee of like eight humans. And

01:16:50.409 --> 01:16:52.310
they're like thrashing it out and you're getting

01:16:52.310 --> 01:16:55.609
the absolute, you know, camel as a solution designed

01:16:55.609 --> 01:16:57.829
by a committee. And then all of the assistants

01:16:57.829 --> 01:17:00.770
of those humans are having like a meta discussion

01:17:00.770 --> 01:17:05.050
in the cloud, also thrashing it out. Well, my human,

01:17:05.050 --> 01:17:07.329
from what I know about my human, I think

01:17:07.329 --> 01:17:09.550
my human would actually go this way. And it's

01:17:09.550 --> 01:17:12.550
like, who then ends up making the decisions and

01:17:12.550 --> 01:17:14.029
how does that work? I think the humans would

01:17:14.029 --> 01:17:15.489
probably go, I can't be bothered to be in those

01:17:15.489 --> 01:17:17.170
meetings anymore. Can we just send all of our

01:17:17.170 --> 01:17:21.270
assistants off to be the buying committee? We'll

01:17:21.270 --> 01:17:23.310
get some quite weird purchases then based on

01:17:23.310 --> 01:17:25.369
that, I would have said. Well, it's all good

01:17:25.369 --> 01:17:28.090
though, because Claude 4.5 is very good at being

01:17:28.090 --> 01:17:31.750
the orchestrator and Claude Haiku 4.5 will be

01:17:31.750 --> 01:17:36.689
the purchase executor. So, you know, in Anthropic

01:17:36.770 --> 01:17:41.159
we trust. Other models are available. Well, that

01:17:41.159 --> 01:17:43.119
feels like a pretty good place to close out the

01:17:43.119 --> 01:17:44.760
podcast. Anything else you wanted to cover this

01:17:44.760 --> 01:17:48.239
week, Mike? I think that is plenty. Well, thanks

01:17:48.239 --> 01:17:50.220
for your time as always, and I will catch you

01:17:50.220 --> 01:17:53.640
on the next one. Thank you for listening to Artificially

01:17:53.640 --> 01:17:56.520
Intelligent Marketing. To stay on top of the

01:17:56.520 --> 01:17:59.800
latest trends, tips, and tools in the world of

01:17:59.800 --> 01:18:03.859
marketing AI, be sure to subscribe. We look forward

01:18:03.859 --> 01:18:05.439
to seeing you again next week.