WEBVTT

00:00:00.000 --> 00:00:02.000
Today we're going to compare three different

00:00:02.000 --> 00:00:05.900
models inside a Copilot agent built in Copilot

00:00:05.900 --> 00:00:09.080
Studio. We're going to compare OpenAI's latest

00:00:09.080 --> 00:00:14.539
model to Anthropic's Claude model and to Grok, which

00:00:14.539 --> 00:00:18.059
is built by xAI. Now we're going to keep every

00:00:18.059 --> 00:00:21.640
variable the same except just change the model

00:00:21.640 --> 00:00:24.480
so it'll give us a good indication of how powerful

00:00:24.480 --> 00:00:27.359
the model is. So right now you're looking at

00:00:27.789 --> 00:00:31.129
my Copilot Studio environment, and this is an

00:00:31.129 --> 00:00:33.609
agent that I actually built in the past. It's

00:00:33.609 --> 00:00:37.049
called Matchmaker, and it's basically an autonomous

00:00:37.049 --> 00:00:41.770
agent that receives emails. It looks at emails

00:00:41.770 --> 00:00:45.469
that are related to job postings, matches the

00:00:45.469 --> 00:00:48.329
candidate that fits that job posting the best,

00:00:49.039 --> 00:00:51.579
ranks all the different candidates, and then

00:00:51.579 --> 00:00:54.840
generates a document and emails it back to

00:00:54.840 --> 00:00:58.420
the sender. It's a super cool agent, and if you're

00:00:58.420 --> 00:01:00.820
interested in building this, check out the previous

00:01:00.820 --> 00:01:03.179
video that actually goes through this step by

00:01:03.179 --> 00:01:06.079
step. In today's video, we're just interested in

00:01:06.079 --> 00:01:09.879
this section here, which is selecting the model,

00:01:09.879 --> 00:01:13.120
and when I open up all these different models

00:01:13.120 --> 00:01:18.230
that I have access to, from GPT-4.1 to 5 Chat and

00:01:18.230 --> 00:01:21.469
5 Auto, the latest one that I actually have access

00:01:21.469 --> 00:01:26.650
to is GPT-5.3 for chat, which just came out. And

00:01:26.650 --> 00:01:29.950
on the Anthropic side, you can see that I have

00:01:29.950 --> 00:01:36.069
Claude Sonnet 4.5, Claude Sonnet 4.6, and Opus

00:01:36.069 --> 00:01:39.930
4.6. Opus seems to be for deep reasoning, and I'm

00:01:39.930 --> 00:01:42.829
going to use Sonnet 4.6, which is for general

00:01:42.829 --> 00:01:47.209
tasks, because this is also for most tasks, the

00:01:47.209 --> 00:01:49.530
5.3. So I'm going to kind of do apples to apples

00:01:49.530 --> 00:01:53.329
instead of using the deep reasoning models. For

00:01:53.329 --> 00:01:55.629
example, the 5.2 deep reasoning, I'm not going

00:01:55.629 --> 00:01:57.730
to use that. And then I'm going to also compare

00:01:57.730 --> 00:02:02.189
it to this xAI Grok 4.1, which actually says

00:02:02.189 --> 00:02:04.290
deep reasoning, but it's the only choice that

00:02:04.290 --> 00:02:07.750
I have for xAI. So in summary, there's going

00:02:07.750 --> 00:02:11.449
to be three models. But if you look at this agent,

00:02:11.689 --> 00:02:14.889
these are the instructions, and I modified them a

00:02:14.889 --> 00:02:17.409
little bit from the initial Matchmaker build

00:02:17.409 --> 00:02:20.349
that we had. If we look at the instructions, basically

00:02:20.349 --> 00:02:23.310
a new email arrives and the agent determines

00:02:23.310 --> 00:02:26.590
if it's related to a job posting or a hiring

00:02:26.590 --> 00:02:29.629
type of question, and then it compares the job

00:02:29.629 --> 00:02:31.810
description to the candidates that are on the

00:02:31.810 --> 00:02:34.750
hiring bench in a SharePoint site. That SharePoint

00:02:34.750 --> 00:02:37.270
site is actually right here in the knowledge

00:02:37.270 --> 00:02:41.330
sources, and this is the SharePoint

00:02:41.330 --> 00:02:44.629
site. And when I go to the hiring bench, what

00:02:44.629 --> 00:02:47.430
Arnold has done, we're looking at Arnold's SharePoint.

00:02:47.590 --> 00:02:51.590
He's taken the resumes of a few people like Jessica,

00:02:51.710 --> 00:02:54.669
Jordan, and he's put them on the bench because

00:02:54.669 --> 00:02:58.210
he doesn't have an open req to fill. But if you

00:02:58.210 --> 00:03:02.270
look at Jessica, Arnold is hiring for product

00:03:02.270 --> 00:03:06.169
manager types of roles. Back to the agent, you

00:03:06.169 --> 00:03:08.830
got the knowledge section and the instructions

00:03:08.830 --> 00:03:13.500
say: look at the job description and compare

00:03:13.500 --> 00:03:16.379
them to the hiring bench. Also use all available

00:03:16.379 --> 00:03:20.099
resources on the web to research ideal qualities

00:03:20.099 --> 00:03:23.599
that a candidate should possess to be successful

00:03:23.599 --> 00:03:25.800
in this type of a role. Use this information

00:03:25.800 --> 00:03:29.099
in your analysis. In this first example you can

00:03:29.099 --> 00:03:32.439
see that I've highlighted GPT-5.3 Chat Experimental

00:03:32.439 --> 00:03:36.509
from the list. And so let's go ahead and fire

00:03:36.509 --> 00:03:38.210
off our first email. I'm going to open up my

00:03:38.210 --> 00:03:41.870
Outlook and Albert is ready to fire off an email

00:03:41.870 --> 00:03:46.090
to Arnold saying, hey, do you know anyone? Do

00:03:46.090 --> 00:03:49.189
you know anyone that fits this job role? We just

00:03:49.189 --> 00:03:52.189
got to drop the link in right over here. So I'm

00:03:52.189 --> 00:03:55.110
going to go ahead and go to the Microsoft career

00:03:55.110 --> 00:03:58.770
site. I'm just going to grab this link here and

00:03:58.770 --> 00:04:02.370
go back to the email that Albert is creating

00:04:02.370 --> 00:04:04.500
and just drop it in here. Now we're going to

00:04:04.500 --> 00:04:07.719
send this off. Okay, so the email just came in.

00:04:07.719 --> 00:04:11.240
It's 3:18, so it didn't take that long, and you

00:04:11.240 --> 00:04:14.840
can see here this is the response. And even though

00:04:14.840 --> 00:04:19.519
I selected GPT-5.3, it says, "P.S. This analysis

00:04:19.519 --> 00:04:23.379
was completed using GPT-5.2 LLM." So we'll keep that

00:04:23.379 --> 00:04:27.240
in mind, whether it's 5.2 or 5.3. And this is the

00:04:27.240 --> 00:04:29.819
attachment. Now we're going to open this up and

00:04:29.819 --> 00:04:32.420
let's just take a look at this document. So this

00:04:32.420 --> 00:04:35.139
is the executive summary. There's a role alignment

00:04:35.139 --> 00:04:38.079
overview and it says Zoe Petroni. Here's the

00:04:38.079 --> 00:04:41.000
link to her resume. And it seems like she has

00:04:41.000 --> 00:04:45.779
an overall job fit of 92%. Then there's Jose, and

00:04:45.779 --> 00:04:49.860
the overall fit for him is about 78%, and so

00:04:49.860 --> 00:04:52.420
on and so forth. So it does a pretty good job

00:04:52.420 --> 00:04:55.579
of explaining all these different sections that

00:04:55.579 --> 00:04:58.360
I asked it to, but I don't see any diagrams.

00:04:59.199 --> 00:05:01.240
Now what we're going to do, we're going to go

00:05:01.240 --> 00:05:04.399
back to the agent and we're going to go to the

00:05:04.399 --> 00:05:07.160
overview section. And the only thing that I'm

00:05:07.160 --> 00:05:10.259
going to change is I'm going to change this large

00:05:10.259 --> 00:05:13.620
language model. So here I'm going to select Claude

00:05:13.620 --> 00:05:17.779
Sonnet 4.6. Let's select that. And additionally,

00:05:17.879 --> 00:05:19.939
after that, I'm just going to select publish.

00:05:20.060 --> 00:05:23.850
Now we're going to lob another email in. The

00:05:23.850 --> 00:05:25.810
only difference is the LLM change, so I'm just

00:05:25.810 --> 00:05:27.850
going to send it off. And the only thing here

00:05:27.850 --> 00:05:30.490
is that I put "part two" so that we know that this

00:05:30.490 --> 00:05:32.430
is the second email coming in. Let's go check

00:05:32.430 --> 00:05:35.430
our inbox here. And so the email just came in,

00:05:35.430 --> 00:05:38.290
which again should trigger the agent. The only

00:05:38.290 --> 00:05:41.269
difference is that it's using Claude Sonnet 4.6.

00:05:41.269 --> 00:05:43.310
By the way, while we're waiting for that email,

00:05:43.310 --> 00:05:46.050
if you're finding value in this type of content,

00:05:46.050 --> 00:05:48.170
it really does help the channel to give it a

00:05:48.170 --> 00:05:51.029
thumbs up and consider subscribing. Thank you

00:05:51.029 --> 00:05:53.790
very much for your support. Here it is, and it looks

00:05:53.790 --> 00:05:56.310
like, from a high-level perspective, if you just

00:05:56.310 --> 00:05:59.370
look at this email, the email looks

00:05:59.370 --> 00:06:02.370
like it has a lot more graphics and tables, and

00:06:02.370 --> 00:06:05.750
it just generally looks a lot more pleasing if

00:06:05.750 --> 00:06:08.370
you compare it to this OpenAI one, which is the

00:06:08.370 --> 00:06:10.649
one you're looking at right now. So the email

00:06:10.649 --> 00:06:13.870
itself looks really nice. And let's look at the

00:06:13.870 --> 00:06:15.870
details of it, right? It says, "Hi Albert, thanks

00:06:15.870 --> 00:06:18.290
for reaching out. I had a chance to review..." This

00:06:18.290 --> 00:06:21.069
is the clickable link. Top recommendation is

00:06:21.069 --> 00:06:24.050
Zoe Petroni, which is also consistent with what

00:06:24.050 --> 00:06:26.970
OpenAI decided, but it has a nice blurb here

00:06:26.970 --> 00:06:30.730
about why it's Zoe. And as we scroll down, it

00:06:30.730 --> 00:06:33.910
says that I've done a thorough analysis of all

00:06:33.910 --> 00:06:36.189
the candidates, including technical ability,

00:06:36.470 --> 00:06:40.209
and here's a candidate fit summary. And so this

00:06:40.209 --> 00:06:43.870
chart is essentially telling us that all these

00:06:43.870 --> 00:06:48.389
different people are rated for overall fit. And

00:06:48.389 --> 00:06:51.250
the recommendation, highly recommended, strong,

00:06:51.350 --> 00:06:54.810
second choice, moderate. And the percentages

00:06:54.810 --> 00:06:58.649
as well, 20%, 5%, not a match, and so on. And

00:06:58.649 --> 00:07:01.490
then it says attached is the Word document. So

00:07:01.490 --> 00:07:03.370
the email definitely looks a lot better. And

00:07:03.370 --> 00:07:07.430
here at the bottom, if you see, it says PS, this

00:07:07.430 --> 00:07:10.430
candidate analysis was compiled with the assistance

00:07:10.430 --> 00:07:19.000
of GPT-4o OpenAI. Oh. But that can't be true

00:07:19.000 --> 00:07:22.060
because I definitely selected Anthropic and I

00:07:22.060 --> 00:07:24.240
clicked publish. And this doesn't look like something

00:07:24.240 --> 00:07:27.600
GPT-4o would do. So I think that's just a

00:07:27.600 --> 00:07:29.980
typo. What do you think? I'm going to go ahead

00:07:29.980 --> 00:07:32.360
and open this Word document. So let's look at

00:07:32.360 --> 00:07:34.819
this together. Again, my take is this is actually

00:07:34.819 --> 00:07:37.500
Anthropic. Here's the document. Here's the executive

00:07:37.500 --> 00:07:41.300
summary. Here's the top recommendation. Here's

00:07:41.300 --> 00:07:44.300
the strong second choice. And then as we scroll

00:07:44.300 --> 00:07:48.149
down. It has the overview for the actual role,

00:07:48.350 --> 00:07:51.509
including the posting that you can click on and

00:07:51.509 --> 00:07:55.029
what the role is. This it just grabbed from the

00:07:55.029 --> 00:07:58.089
job posting. And then it has the ideal candidate

00:07:58.089 --> 00:08:00.930
profile summary. Now it has a candidate comparison.

00:08:01.189 --> 00:08:04.790
Zoe Petroni. This is her resume. Why we think

00:08:04.790 --> 00:08:07.069
she's good. Technical ability, leadership, culture

00:08:07.209 --> 00:08:10.110
fit. A lot more information here. Overall fit.

00:08:10.480 --> 00:08:14.220
95% match, as well as Jose Gonzalez. This document

00:08:14.220 --> 00:08:17.100
looks a lot more thorough than the other one.

00:08:17.319 --> 00:08:20.279
And I'd love to see what your take is. But so

00:08:20.279 --> 00:08:23.300
far, so good. And actually quite impressive in

00:08:23.300 --> 00:08:26.079
terms of what Anthropic is able to do compared

00:08:26.079 --> 00:08:28.759
to OpenAI. And then if we go down all the way

00:08:28.759 --> 00:08:31.279
here, here's the candidate summary scorecard.

00:08:31.480 --> 00:08:33.320
So it has this nice chart that it put together

00:08:33.320 --> 00:08:36.820
for me. Recommended next steps. Immediately contact

00:08:36.820 --> 00:08:40.909
Zoe Petroni. She's near perfect. I'm just going

00:08:40.909 --> 00:08:43.990
to go back to the agent, go to the overview section,

00:08:44.250 --> 00:08:46.789
and I'm going to change this from Claude Sonnet

00:08:46.789 --> 00:08:52.289
4.6 to Grok 4.1. I'm going to select Grok.

00:08:52.289 --> 00:08:54.889
So Grok is there. I'm going to go ahead and publish

00:08:54.889 --> 00:08:57.570
it. And I'm going to try again. This is part

00:08:57.570 --> 00:09:00.190
three now. This is the last one. And I'm going

00:09:00.190 --> 00:09:03.110
to go ahead and send this off. Okay. So an email

00:09:03.110 --> 00:09:05.970
came in. The first thing I noticed is that this

00:09:05.970 --> 00:09:08.639
email actually doesn't have a title. It says

00:09:08.639 --> 00:09:11.559
no subject. But the other emails actually do

00:09:11.559 --> 00:09:14.759
have a subject. But let's look at the email itself.

00:09:15.039 --> 00:09:18.639
So it says, hey, Albert, thanks for the note.

00:09:19.360 --> 00:09:21.940
And I've reviewed the job description and compared

00:09:21.940 --> 00:09:24.440
it to our hiring bench. Attached is a detailed

00:09:24.440 --> 00:09:27.679
analysis, including an executive summary. Now,

00:09:27.700 --> 00:09:29.899
the interesting thing is everyone is looking

00:09:29.899 --> 00:09:33.419
at Zoe as a 95% fit. So across the models, that's

00:09:33.419 --> 00:09:36.320
definitely consistent. PS, this agent is using

00:09:36.320 --> 00:09:39.820
Grok LLM model, which is good. So for sure, we're

00:09:39.820 --> 00:09:42.740
using Grok. Now let's look at the actual document

00:09:42.740 --> 00:09:45.519
that got created. It looks pretty good. It looks

00:09:45.519 --> 00:09:47.860
interesting, right? So let's look at this. Senior

00:09:47.860 --> 00:09:50.620
Technical Program Manager. Candidate Analysis

00:09:50.620 --> 00:09:53.580
Report. Here's the job link. This is prepared

00:09:53.580 --> 00:09:57.559
by an AI hiring agent. This is the date. It's

00:09:57.559 --> 00:09:59.279
confidential. No, it's not, because everything

00:09:59.279 --> 00:10:02.730
is fictitious. Executive Summary. The report

00:10:02.730 --> 00:10:06.190
analyzes bench candidates. Top, Zoe Petroni.

00:10:06.610 --> 00:10:10.929
Full resumes here for Jose Gonzalez, and he's

00:10:10.929 --> 00:10:14.750
at 72%. And then there's another one as well,

00:10:14.809 --> 00:10:17.809
Jessica Lin, 48%, and the rest of them are here.

00:10:18.269 --> 00:10:21.629
Job requirement qualities, technical strategy.

00:10:21.950 --> 00:10:25.090
So it put like an interesting bar chart here

00:10:25.090 --> 00:10:27.309
in terms of all the different candidates, because

00:10:27.309 --> 00:10:30.980
I think this model of Grok is chat-based. And

00:10:30.980 --> 00:10:33.580
here's the candidate analysis of the rest of

00:10:33.580 --> 00:10:36.559
them. So very good. And it has the location of

00:10:36.559 --> 00:10:40.700
the resumes as footnotes as well. But I don't

00:10:40.700 --> 00:10:43.740
think it looks just as good as the graphics and

00:10:43.740 --> 00:10:47.159
all the different colors that the Anthropic one

00:10:47.159 --> 00:10:50.080
did. And then if you compare it to this one,

00:10:50.320 --> 00:10:52.820
which is actually the OpenAI one, the first one

00:10:52.820 --> 00:10:55.279
that we did, which has the executive summary,

00:10:55.620 --> 00:10:58.820
role alignment, and then it has the people

00:10:58.820 --> 00:11:02.220
listed, and how they fit for those different

00:11:02.220 --> 00:11:04.820
categories we're looking for and the links. And

00:11:04.820 --> 00:11:06.919
then finally, if you look at the Anthropic one,

00:11:07.080 --> 00:11:10.700
which is right here, which has the job listed,

00:11:10.940 --> 00:11:14.659
the executive summary, the top recommendation,

00:11:15.139 --> 00:11:18.539
the strong second choice. And then it has this

00:11:18.539 --> 00:11:21.360
role overview, which is different, which is one

00:11:21.360 --> 00:11:23.399
of those things that I think makes it stand out

00:11:23.399 --> 00:11:27.159
because it dives into the role and what

00:11:27.159 --> 00:11:29.580
the ideal candidate would be, which is in this

00:11:29.580 --> 00:11:33.080
blue section here. And then it goes into depth

00:11:33.080 --> 00:11:36.919
about who's first and why, with lots of details here for

00:11:36.919 --> 00:11:39.500
each of these categories. And I like the fact

00:11:39.500 --> 00:11:41.460
that it uses different colors. Again, this is the

00:11:41.460 --> 00:11:44.440
Anthropic model that we're using, the 4.6.

00:11:45.120 --> 00:11:47.539
And instead of like two or three pages, this

00:11:47.539 --> 00:11:50.559
document is 10 pages. It even has this chart

00:11:50.559 --> 00:11:53.879
over here and all kinds of good stuff. And my

00:11:53.879 --> 00:11:55.720
thoughts around this, and I would love to know

00:11:55.720 --> 00:11:58.580
what your thoughts are, is that what's awesome

00:11:58.580 --> 00:12:01.580
about the Microsoft Copilot product is that it

00:12:01.580 --> 00:12:04.899
is an open model choice. But it's not just inside

00:12:04.899 --> 00:12:08.440
Copilot Studio where you can select models. Even

00:12:08.440 --> 00:12:12.440
inside M365 Copilot Chat, in different places

00:12:12.440 --> 00:12:15.960
like Researcher and even in chat. I think

00:12:16.250 --> 00:12:19.769
that, as these models keep on evolving and changing,

00:12:19.769 --> 00:12:23.330
they're going to be good at something at one

00:12:23.330 --> 00:12:26.490
point in time, a snapshot in time, but then maybe

00:12:26.490 --> 00:12:29.269
another model becomes better at that function,

00:12:29.269 --> 00:12:31.389
and they're going to keep kind of leapfrogging

00:12:31.389 --> 00:12:34.370
each other. And so the beauty of this open model

00:12:34.370 --> 00:12:37.470
concept that Copilot has, where you can kind of

00:12:37.470 --> 00:12:40.029
plug and play your own large language model of

00:12:40.029 --> 00:12:42.409
choice, allows you to decide which one you want

00:12:42.409 --> 00:12:45.899
to use at a certain point in time. So again, thanks

00:12:45.899 --> 00:12:48.419
so much for watching. Don't forget to like the

00:12:48.419 --> 00:12:50.539
video. Don't forget to subscribe. If you have any

00:12:50.539 --> 00:12:53.460
questions, let me know, and I'll catch you on the

00:12:53.460 --> 00:12:53.899
next one.