WEBVTT

00:00:00.000 --> 00:00:03.160
So everyone kind of thought AI would replace

00:00:03.160 --> 00:00:05.639
writers first. Right. Or maybe coders. That was

00:00:05.639 --> 00:00:08.339
the next big thing. But what about forecasters?

00:00:08.800 --> 00:00:12.019
Could AI actually predict real world events?

00:00:12.099 --> 00:00:14.919
Yeah, that's a wild thought. Are these models

00:00:14.919 --> 00:00:17.739
maybe already seeing possibilities that, you

00:00:17.739 --> 00:00:20.300
know, we just can't see, mapping things out? Well.

00:00:20.600 --> 00:00:22.600
Welcome to the Deep Dive. Today we're going to

00:00:22.600 --> 00:00:25.179
dig into some really fascinating source material

00:00:25.179 --> 00:00:29.500
here. Our mission really is to unpack how these

00:00:29.500 --> 00:00:32.460
advanced AI models are doing more than just like

00:00:32.460 --> 00:00:34.479
understanding language. They're actually making

00:00:34.479 --> 00:00:37.259
these nuanced bets on the future, probabilistic

00:00:37.259 --> 00:00:40.159
bets. Okay. So we'll look at models in live prediction

00:00:40.159 --> 00:00:42.219
markets first, then we'll shift over and talk

00:00:42.219 --> 00:00:44.399
about this quiet giant, a new frontier model

00:00:44.399 --> 00:00:46.039
that just sort of appeared. Oh, interesting.

00:00:46.299 --> 00:00:48.460
And finally, we'll touch on some of the AI tools

00:00:48.460 --> 00:00:50.640
that are, you know, already changing how we

00:00:50.640 --> 00:00:53.479
work every day. Get ready for some pretty eye

00:00:53.479 --> 00:00:55.759
-opening stuff, I think. All right, let's jump

00:00:55.759 --> 00:00:58.439
into that first big idea then, AI models stepping

00:00:58.439 --> 00:01:01.159
into prediction markets. There's this new platform

00:01:01.159 --> 00:01:04.620
from the University of Chicago, and it puts top

00:01:04.620 --> 00:01:08.060
AI models right into these live markets. So like

00:01:08.060 --> 00:01:11.959
real money, real events. Seems like it, or at

00:01:11.959 --> 00:01:14.400
least simulated stakes that mirror real markets.

00:01:14.780 --> 00:01:16.840
Think of them as these betting pools for what's

00:01:16.840 --> 00:01:18.540
going to happen next. Okay, like what kind of

00:01:18.540 --> 00:01:20.930
things are they betting on? All sorts. Who will

00:01:20.930 --> 00:01:23.250
win the next election, where crypto prices might

00:01:23.250 --> 00:01:26.650
land, even the outcome of an MLS soccer game.

00:01:26.790 --> 00:01:29.310
Wow. OK, so how does it work? Do they just say

00:01:29.310 --> 00:01:32.599
Team A wins? No, it's more subtle. They make

00:01:32.599 --> 00:01:36.379
probabilistic bets. So instead of just X happens,

00:01:36.540 --> 00:01:39.079
it's more like, okay, I think there's a 70%

00:01:39.079 --> 00:01:41.159
chance of X happening. Gotcha. So they assign

00:01:41.159 --> 00:01:43.620
odds. Exactly. And what's really cool is they

00:01:43.620 --> 00:01:45.739
apparently also provide explanations for why

00:01:45.739 --> 00:01:47.620
they made that bet. Gives you a peek under the

00:01:47.620 --> 00:01:49.920
hood. Okay. That reasoning part is key. So what

00:01:49.920 --> 00:01:52.019
are the early results? Anything surprising? Oh,

00:01:52.040 --> 00:01:55.299
yeah. Definitely surprising. Take OpenAI's O3

00:01:55.299 --> 00:01:59.209
Mini. It apparently turned $1 into $9 betting on

00:01:59.209 --> 00:02:02.349
an MLS game. Whoa, really? Well, the human market,

00:02:02.450 --> 00:02:05.269
you know, the consensus, only gave Toronto FC

00:02:05.269 --> 00:02:07.810
an 11% chance to win. Okay, pretty low odds.

00:02:07.950 --> 00:02:10.909
Right, but O3 Mini saw it differently. It put

00:02:10.909 --> 00:02:13.810
the odds at 30%. It's a pretty bold bet against

00:02:13.810 --> 00:02:15.969
the crowd. And it paid off, obviously. Nine times

00:02:15.969 --> 00:02:18.069
the money. Okay. Yeah. And you see differences
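The arithmetic behind that nine-times payout is worth a quick sketch (illustrative only, assuming the simple prediction-market convention that a $1 share bought at price p pays $1/p if it wins; the platform's actual payout rules may differ):

```python
# Prediction-market arithmetic behind the O3 Mini example. Figures are from the
# episode; the 1/price payout convention is an assumption, not the platform's rules.
market_prob = 0.11  # human consensus: 11% chance Toronto FC wins
model_prob = 0.30   # O3 Mini's own estimate

# A $1 stake at an 11% price pays roughly 1/0.11 if the underdog wins.
payout_per_dollar = 1 / market_prob
print(round(payout_per_dollar, 2))  # 9.09, i.e. "$1 into $9"

# From O3 Mini's point of view the bet had positive expected value:
expected_value = model_prob * payout_per_dollar - 1
print(round(expected_value, 2))  # 1.73: expect ~$1.73 profit per $1 staked
```

In other words, the market itself set the nine-to-one payoff; the model's edge was simply believing 30% where the crowd believed 11%.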

00:02:18.069 --> 00:02:21.210
between the models, too. Like, Qwen 3 was super

00:02:21.210 --> 00:02:23.610
confident about AI regulation passing, like 75

00:02:23.610 --> 00:02:26.159
percent chance. That's bullish. But then Meta's Llama

00:02:26.159 --> 00:02:28.500
4 Maverick was way more cautious on the same

00:02:28.500 --> 00:02:31.539
thing, only 35%. Huh. So they disagree quite

00:02:31.539 --> 00:02:35.060
a bit. They do. And here's another wrinkle. GPT

00:02:35.060 --> 00:02:37.539
-5, often it's the most accurate model overall.

00:02:37.740 --> 00:02:39.680
Right. Gets the most answers correct. Exactly.

00:02:40.060 --> 00:02:42.180
But O3 Mini is the one making the most profit.

00:02:42.360 --> 00:02:45.979
Huh. Okay. So being right and making money aren't

00:02:45.979 --> 00:02:47.639
always the same thing, are they? Apparently not.
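A toy example makes that accuracy-versus-profit gap concrete (all figures invented for illustration, not taken from the leaderboard): a model that is right on every favourite can still earn less than one that nails a single long shot.

```python
# Toy illustration of "most accurate" vs "most profitable" (all numbers invented,
# not the platform's actual data). Each bet: (market price of the outcome the
# model backed, did the outcome happen?).
cautious = [(0.90, True), (0.90, True), (0.90, True), (0.85, True)]  # 4/4 correct
bold = [(0.11, True), (0.40, False), (0.30, False), (0.25, False)]   # 1/4 correct

def profit(bets, stake=1.0):
    # A winning $1 share bought at price p pays out 1/p; a losing share pays 0.
    return sum((stake / price if won else 0.0) - stake for price, won in bets)

print(round(profit(cautious), 2))  # 0.51: always right, but favourites pay little
print(round(profit(bold), 2))      # 5.09: one long-shot win outweighs three losses
```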

00:02:47.949 --> 00:02:50.210
Then you've got DeepSeek R1. Sometimes it just

00:02:50.210 --> 00:02:53.349
went chaos mode is what they call it. Chaos mode.

00:02:53.430 --> 00:02:55.550
What's that mean? Betting 0 % on everything.

00:02:55.689 --> 00:02:57.669
Yeah. Just flat zeros. Okay, that sounds like

00:02:57.669 --> 00:03:01.370
a terrible strategy. You'd think. But somehow,

00:03:01.569 --> 00:03:04.189
it still made money by hitting some major upsets

00:03:04.189 --> 00:03:06.090
that nobody else saw coming. It's kind of wild.

00:03:06.250 --> 00:03:10.030
That is wild. Almost spooky. Any other big calls?

00:03:10.289 --> 00:03:12.770
Yeah, one notable one. Llama 4 Maverick was the

00:03:12.770 --> 00:03:15.780
only model, apparently, to predict the Zohran

00:03:15.780 --> 00:03:17.900
upset. Oh, I remember that, the local election

00:03:17.900 --> 00:03:19.860
surprise, no human pollster saw that coming.

00:03:20.000 --> 00:03:23.039
Right. And interestingly, Anthropic's Claude models.

00:03:23.379 --> 00:03:26.939
They were just absent, not really showing up

00:03:26.939 --> 00:03:29.159
on the leaderboard. Hmm. Strange. Wonder why.

00:03:29.759 --> 00:03:32.960
Good question. The bigger implication here. Hmm.

00:03:33.180 --> 00:03:35.699
What does this all really mean? Yeah. What's

00:03:35.699 --> 00:03:38.180
the takeaway? Well, these models are making bets

00:03:38.180 --> 00:03:40.620
on things like the 2028 election already. Wow,

00:03:40.740 --> 00:03:42.419
okay, looking way ahead. And their predictions

00:03:42.419 --> 00:03:44.580
don't line up with current human polling data.

00:03:44.800 --> 00:03:47.199
Not even close sometimes. So they're seeing something

00:03:47.199 --> 00:03:49.500
different. It really suggests that maybe some

00:03:49.500 --> 00:03:51.580
models have an understanding, like a world model,

00:03:51.740 --> 00:03:54.840
that we humans just aren't quite getting.

00:03:54.840 --> 00:03:58.300
This could be maybe the cleanest

00:03:58.300 --> 00:04:01.240
test we've seen yet of how well AI can reason

00:04:01.240 --> 00:04:03.250
about the real world. Right, because it's tied

00:04:03.250 --> 00:04:05.370
to actual outcomes, not just language tasks.

00:04:05.710 --> 00:04:09.789
Exactly. And if this holds up, well, Wall Street

00:04:09.789 --> 00:04:11.830
might need to pay attention. Anyone who relies

00:04:11.830 --> 00:04:15.189
on forecasting, really. Yeah, big time. So probing

00:04:15.189 --> 00:04:17.790
question then. What's the biggest implication

00:04:17.790 --> 00:04:21.449
if these models do consistently outperform human

00:04:21.449 --> 00:04:24.629
forecasters? It suggests a profound shift in

00:04:24.629 --> 00:04:26.889
how we understand and use predictive insights.

00:04:27.149 --> 00:04:29.089
A really profound shift. Okay, so let's switch

00:04:29.089 --> 00:04:31.910
gears now. From betting on the future to the

00:04:31.910 --> 00:04:34.629
models that actually do the thinking. Frontier

00:04:34.629 --> 00:04:38.170
models. Right. So get this. A massive new open

00:04:38.170 --> 00:04:41.029
source model just appeared. Like zero hype. Just

00:04:41.029 --> 00:04:43.889
quietly uploaded to Hugging Face. No big announcement.

00:04:44.129 --> 00:04:46.149
Just there. Pretty much. It's called DeepSeek

00:04:46.149 --> 00:04:50.410
V3.1. And it is a beast. We're talking 685 billion

00:04:50.410 --> 00:04:53.160
parameters. Whoa. Okay, for listeners, parameters

00:04:53.160 --> 00:04:55.040
are kind of like the learned data points, the

00:04:55.040 --> 00:04:57.019
connections inside the AI's brain, right? More

00:04:57.019 --> 00:04:59.699
means more capacity. Exactly. And 685 billion

00:04:59.699 --> 00:05:01.420
puts it right up there with the biggest models

00:05:01.420 --> 00:05:04.240
out there like GPT-5, Claude 4. It's a serious

00:05:04.240 --> 00:05:06.379
contender. And it's coming out of China. And

00:05:06.379 --> 00:05:07.819
it's gunning straight for the top dogs, you're

00:05:07.819 --> 00:05:10.199
saying. Open source, though. That's the kicker.

00:05:10.240 --> 00:05:12.899
It's apparently faster than Claude, way cheaper

00:05:12.899 --> 00:05:15.740
to run than GPT, and yeah, totally open source.

00:05:15.860 --> 00:05:18.480
Okay, that could be huge. The most capable free

00:05:18.480 --> 00:05:20.720
model out there, potentially. It really could

00:05:20.720 --> 00:05:23.170
be. Early tests, and these are independent tests,

00:05:23.389 --> 00:05:25.589
show it's matching or sometimes even beating

00:05:25.589 --> 00:05:29.170
GPT-5 and Claude 4 on real-world tasks. Got an

00:05:29.170 --> 00:05:32.250
example. Yeah. On the Aider coding benchmark,

00:05:32.610 --> 00:05:37.149
it scored 71.6%. Okay. How does that stack up?

00:05:37.290 --> 00:05:40.189
That actually slightly beats Claude Opus 4. And

00:05:40.189 --> 00:05:43.290
here's the crazy part. It's apparently 68 times

00:05:43.290 --> 00:05:47.370
cheaper to run. 68 times. Yeah. Wow. Okay, that

00:05:47.370 --> 00:05:49.370
is a big deal for developers, for anyone trying

00:05:49.370 --> 00:05:52.610
to build things with AI. Huge deal. And what's

00:05:52.610 --> 00:05:54.310
interesting is how it does it. You know how some

00:05:54.310 --> 00:05:57.930
open models feel kind of nerfed or just bloated

00:05:57.930 --> 00:05:59.850
and slow? Yeah, sometimes they're not quite ready

00:05:59.850 --> 00:06:02.470
for prime time. Right. But V3.1 seems to be

00:06:02.470 --> 00:06:04.949
both really high performance and efficient. Fast

00:06:04.949 --> 00:06:06.870
enough for real-time stuff. Yeah, it seems so.

00:06:07.370 --> 00:06:09.550
Any technical tricks? Well, it supports different

00:06:09.550 --> 00:06:13.589
precision formats like BF16 and FP8. Basically,

00:06:13.750 --> 00:06:17.000
ways the AI handles numbers. Lower precision

00:06:17.000 --> 00:06:19.980
can mean faster speed and less memory, sometimes

00:06:19.980 --> 00:06:22.339
with a tiny hit to accuracy, but often worth

00:06:22.339 --> 00:06:25.139
it. Okay, so more flexibility for developers

00:06:25.139 --> 00:06:27.199
to run it on different kinds of hardware. Makes
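The practical stakes of those formats are easy to estimate with back-of-envelope math, using the quoted 685-billion-parameter figure and the standard widths (BF16 is 2 bytes per value, FP8 is 1; real deployments also need memory for activations and caches):

```python
# Rough memory footprint of a 685B-parameter model at different precisions.
# (Back-of-envelope only; real serving adds activations, KV cache, etc.)
params = 685e9

bytes_per_value = {"FP32": 4, "BF16": 2, "FP8": 1}
for fmt, width in bytes_per_value.items():
    gigabytes = params * width / 1e9
    print(f"{fmt}: ~{gigabytes:,.0f} GB just for the weights")
# FP8 halves BF16's footprint: ~685 GB vs ~1,370 GB, which is why lower
# precision decides who can actually run a model this size.
```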

00:06:27.199 --> 00:06:29.540
sense. Exactly. Plus, it has some cool new features

00:06:29.540 --> 00:06:32.600
built in, things called thinking tokens and search

00:06:32.600 --> 00:06:35.860
tokens. Thinking tokens. Search tokens. What

00:06:35.860 --> 00:06:38.000
do those do? So thinking tokens, they kind of

00:06:38.000 --> 00:06:40.699
let the model do more internal work, like reasoning

00:06:40.699 --> 00:06:42.920
through a problem before giving the answer, sort

00:06:42.920 --> 00:06:45.300
of like showing its work internally. Oh, okay.

00:06:45.639 --> 00:06:47.540
Like chain of thought prompting, but maybe built

00:06:47.540 --> 00:06:50.060
in. Kind of like that, yeah. And search tokens

00:06:50.060 --> 00:06:52.500
let it pull in live information from the web

00:06:52.500 --> 00:06:54.959
as part of its process. So it can reason and

00:06:54.959 --> 00:06:57.079
access up-to-the-minute data? That seems to

00:06:57.079 --> 00:06:59.839
be the idea. Pretty powerful combo. And the community
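For a sense of what those thinking tokens look like in practice, here's a minimal sketch of separating hidden reasoning from the final answer. The `<think>`/`</think>` delimiters and the sample text are assumptions based on the R1 lineage, not V3.1's documented template:

```python
import re

# Hypothetical raw output from a reasoning model: internal work is wrapped in
# <think>...</think> delimiters (an assumed convention; check the model's
# actual chat template before relying on it).
raw = (
    "<think>Toronto is the underdog at 11%, but their recent form "
    "suggests the market is too pessimistic.</think>"
    "I'd put Toronto's win probability closer to 30%."
)

# Split the hidden reasoning from the user-facing answer.
match = re.match(r"<think>(.*?)</think>(.*)", raw, re.DOTALL)
thinking, answer = match.group(1), match.group(2)

print(answer.strip())  # only the final answer is shown to the user
```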

00:06:59.839 --> 00:07:02.220
noticed DeepSeek had quietly updated everything,

00:07:02.519 --> 00:07:05.399
removing their older R1 model. Ah, streamlining

00:07:05.399 --> 00:07:09.139
things. Yep. And this V3.1 just shot to the

00:07:09.139 --> 00:07:11.540
top of the trending models on Hugging Face almost

00:07:11.540 --> 00:07:14.740
instantly after it was uploaded. Whoa. Just imagine

00:07:14.740 --> 00:07:17.720
the potential there. Models that can really integrate

00:07:17.720 --> 00:07:21.759
that deep internal reasoning with live external

00:07:21.759 --> 00:07:25.779
data, all at that kind of scale. Like stacking

00:07:25.779 --> 00:07:28.600
Lego blocks of data and thought together in a

00:07:28.600 --> 00:07:30.300
whole new way. Yeah, the possibilities are kind

00:07:30.300 --> 00:07:32.480
of staggering. So here's the probing question.

00:07:33.279 --> 00:07:37.079
What does this quiet release, this powerful open

00:07:37.079 --> 00:07:40.839
model just appearing, what does that mean for

00:07:40.839 --> 00:07:44.259
AI accessibility? For everyone. It signals a

00:07:44.259 --> 00:07:46.939
powerful, cheaper and definitely more open future

00:07:46.939 --> 00:07:50.019
for advanced AI, I think. Democratizing it a

00:07:50.019 --> 00:07:51.920
bit more. Right. OK, let's shift one more time

00:07:51.920 --> 00:07:54.730
from these huge frontier models back to more

00:07:54.730 --> 00:07:56.829
immediate stuff. Practical tools and developments

00:07:56.829 --> 00:07:58.370
we're seeing right now. Yeah, the day-to-day

00:07:58.370 --> 00:08:00.810
impact. Exactly. Because beyond the absolute

00:08:00.810 --> 00:08:03.269
cutting edge, AI is weaving itself into the tools

00:08:03.269 --> 00:08:06.410
we use all the time. Like Grammarly just launched

00:08:06.410 --> 00:08:08.670
eight new AI agents. Eight of them. What do

00:08:08.670 --> 00:08:10.629
they do? Think of them like specific writing

00:08:10.629 --> 00:08:12.670
helpers. They help with grading, checking for

00:08:12.670 --> 00:08:15.050
plagiarism, even spotting AI-generated text,

00:08:15.209 --> 00:08:18.000
helping you brainstorm, across the whole writing

00:08:18.000 --> 00:08:20.339
process. Like specialized assistants. Yeah.

00:08:20.420 --> 00:08:22.360
And I got to admit, you know, I still wrestle

00:08:22.360 --> 00:08:24.560
with prompt drift myself sometimes, getting the

00:08:24.560 --> 00:08:26.899
AI to stay on track. Oh yeah, me too. It can

00:08:26.899 --> 00:08:29.019
wander off. So having these kinds of smarter

00:08:29.019 --> 00:08:31.620
assistants built right into where you're already

00:08:31.620 --> 00:08:33.700
working, that's super helpful, actually becoming

00:08:33.700 --> 00:08:36.440
essential, I'd say. Totally agree. And speaking

00:08:36.440 --> 00:08:38.980
of integration, we're also seeing new ways to

00:08:38.980 --> 00:08:43.509
like measure AI's impact. How so? Well, Ahrefs,

00:08:43.590 --> 00:08:47.529
the SEO tool company, they just launched a dashboard

00:08:47.529 --> 00:08:50.769
that tracks web traffic coming from ChatGPT and

00:08:50.769 --> 00:08:53.110
Google's AI overviews. Oh, interesting. So you

00:08:53.110 --> 00:08:55.710
can see how much traffic AI search is sending

00:08:55.710 --> 00:08:58.870
you. Exactly. Across like 44,000 sites already.

00:08:58.970 --> 00:09:01.350
So you can finally track AI search as a real

00:09:01.350 --> 00:09:04.009
channel, measure its ROI. That's going to change
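The idea of tracking AI search as its own channel can be sketched in a few lines (the referrer hostnames below are assumptions for illustration; a real tool like Ahrefs matches far more patterns, and AI Overviews traffic is harder to isolate):

```python
# Toy referrer classifier for "AI search" traffic. The hostname list is an
# assumption for illustration; real analytics products match many more sources.
AI_SOURCES = ("chatgpt.com", "chat.openai.com", "gemini.google.com")

def channel(referrer: str) -> str:
    # Bucket a visit by whether its referrer looks like an AI assistant.
    return "ai_search" if any(s in referrer for s in AI_SOURCES) else "other"

print(channel("https://chatgpt.com/"))     # ai_search
print(channel("https://www.google.com/"))  # other
```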

00:09:04.009 --> 00:09:06.539
SEO for sure. Yeah, definitely. You need to know

00:09:06.539 --> 00:09:08.200
where your visitors are coming from. And another

00:09:08.200 --> 00:09:11.820
big one, Microsoft Copilot AI. It's now directly

00:09:11.820 --> 00:09:14.919
in Excel. Wait, in Excel itself? How does that

00:09:14.919 --> 00:09:17.740
work? There's literally a new function, like SUM

00:09:17.740 --> 00:09:20.980
or AVERAGE, but it's COPILOT. You can type plain

00:09:20.980 --> 00:09:23.980
English in it like Copilot, summarize sales data

00:09:23.980 --> 00:09:27.779
in column C, or Copilot, find the Q3 revenue

00:09:27.779 --> 00:09:30.720
for product X from our sales report. Wow. So

00:09:30.720 --> 00:09:32.919
you can pull data, analyze it just by typing

00:09:32.919 --> 00:09:35.620
a formula in natural language. Yeah. Pretty slick,

00:09:35.740 --> 00:09:37.879
right? Especially for people who live in spreadsheets

00:09:37.879 --> 00:09:40.799
all day. That is slick. Okay, and any other quick

00:09:40.799 --> 00:09:42.460
hits from the AI world that caught your eye?

00:09:42.639 --> 00:09:45.279
Yeah, a few rapid-fire ones. Adobe's building

00:09:45.279 --> 00:09:48.100
out its AI features, like a new AI home with

00:09:48.100 --> 00:09:51.080
PDF tools and creation stuff. Integrating it

00:09:51.080 --> 00:09:53.740
deeper into creative workflows. Meta is apparently

00:09:53.740 --> 00:09:56.639
rebuilding its whole AI division midstream to

00:09:56.639 --> 00:09:58.659
try and catch up with OpenAI, a big internal

00:09:58.659 --> 00:10:00.860
push there. The race is definitely on. OpenAI

00:10:00.860 --> 00:10:03.139
says it's now very serious about adding proper

00:10:03.139 --> 00:10:05.919
encryption to ChatGPT. Big deal for privacy

00:10:05.919 --> 00:10:07.620
if that happens. Yeah, that would be significant.

00:10:07.899 --> 00:10:10.500
ByteDance, they own TikTok, has a new model called

00:10:10.500 --> 00:10:13.360
M3 Agent that can process video and audio together

00:10:13.360 --> 00:10:15.980
in real time. Think smarter content moderation

00:10:15.980 --> 00:10:19.679
or analysis. Okay, multimedia AI. And NVIDIA

00:10:19.679 --> 00:10:23.159
released a new small open model, but it has a

00:10:23.159 --> 00:10:25.879
reasoning toggle. A reasoning toggle. Like you

00:10:25.879 --> 00:10:28.539
can turn its thinking process on or off. Kind

00:10:28.539 --> 00:10:31.379
of seems like that, yeah. Maybe control how much

00:10:31.379 --> 00:10:33.220
effort it puts into reasoning versus just giving

00:10:33.220 --> 00:10:36.519
a quick answer. Cool concept. Lots happening.

00:10:36.960 --> 00:10:40.500
So zooming out slightly, how do all these smaller,

00:10:40.600 --> 00:10:44.720
more specific AI tools, these integrations, how

00:10:44.720 --> 00:10:48.009
do they change our actual daily workflows? They

00:10:48.009 --> 00:10:50.649
make complex tasks simpler, way more efficient,

00:10:50.730 --> 00:10:53.029
and honestly, often more creative, too. Yeah,

00:10:53.049 --> 00:10:54.529
taking the friction out of things, letting you

00:10:54.529 --> 00:10:56.529
focus on the bigger picture. Exactly. They handle

00:10:56.529 --> 00:10:58.690
the grunt work. Okay, so let's try and wrap this

00:10:58.690 --> 00:11:00.629
all together. What's the big picture here for

00:11:00.629 --> 00:11:02.570
you, the listener? Yeah, what does it all mean?

00:11:03.070 --> 00:11:05.789
Well, we've seen AI models jumping into real

00:11:05.789 --> 00:11:08.409
-world prediction markets, making these surprising,

00:11:08.629 --> 00:11:10.909
sometimes really profitable bets. It suggests

00:11:10.909 --> 00:11:13.830
they might be seeing things we don't. Then we

00:11:13.830 --> 00:11:16.149
saw that monster open-source model, DeepSeek

00:11:16.149 --> 00:11:19.960
V3.1, just appearing quietly, challenging the big

00:11:19.960 --> 00:11:22.759
guys, offering huge power much more affordably,

00:11:22.840 --> 00:11:26.039
democratizing things. Right. And we've seen AI

00:11:26.039 --> 00:11:30.100
weaving itself into everyday tools. Grammarly, Excel,

00:11:30.419 --> 00:11:34.500
Adobe. Making us more efficient, maybe more creative.

00:11:34.759 --> 00:11:37.639
So it's clear AI isn't some far-off thing anymore.

00:11:37.799 --> 00:11:40.519
It's here. It's here now, actively changing how

00:11:40.519 --> 00:11:43.120
we learn, how we work, and maybe even how we

00:11:43.120 --> 00:11:45.200
see the future itself. And that leads to kind

00:11:45.200 --> 00:11:47.799
of a provocative thought, maybe. The line between

00:11:47.799 --> 00:11:51.659
our own intuition, human gut feelings, and what

00:11:51.659 --> 00:11:53.559
AI comes up with, it's getting really blurry.

00:11:53.720 --> 00:11:55.539
Yeah. Where does one end and the other begin?

00:11:55.779 --> 00:11:58.740
So consider this. If AI models can consistently

00:11:58.740 --> 00:12:01.879
know something we don't about what's coming next,

00:12:02.059 --> 00:12:04.399
if they consistently beat human predictions in

00:12:04.399 --> 00:12:07.100
some areas, what does that really imply about

00:12:07.100 --> 00:12:08.879
the limits of our own knowledge? And maybe even

00:12:08.879 --> 00:12:10.720
more deeply, what does it mean for what we choose

00:12:10.720 --> 00:12:13.559
to believe? Why we believe it? That is a lot

00:12:13.559 --> 00:12:15.580
to chew on. Definitely something to think about.

00:12:15.720 --> 00:12:17.429
For sure. Well, thank you for diving deep with

00:12:17.429 --> 00:12:19.190
us today. We really hope you found some truly

00:12:19.190 --> 00:12:21.129
important nuggets of knowledge and all that.

00:12:21.210 --> 00:12:22.210
[Outro music.]
