WEBVTT

00:00:00.000 --> 00:00:01.980
What happens when artificial intelligence stops

00:00:01.980 --> 00:00:04.480
apologizing and starts fitting completely inside

00:00:04.480 --> 00:00:07.839
your pocket? Welcome to our deep dive for

00:00:07.839 --> 00:00:10.300
today. We are genuinely thrilled you decided

00:00:10.300 --> 00:00:13.019
to join us. Today we are exploring OpenAI's newest

00:00:13.019 --> 00:00:16.679
update. It is designed to make ChatGPT significantly

00:00:16.679 --> 00:00:20.120
less cringe. We are also flying through a lightning

00:00:20.120 --> 00:00:22.739
round of industry news. That includes a massive

00:00:22.739 --> 00:00:25.160
hardware leap and a mysterious leaked model.

00:00:25.469 --> 00:00:28.410
Plus, we are unpacking how Alibaba's tiny new

00:00:28.410 --> 00:00:32.130
AI humiliated a far larger rival. The entire computing

00:00:32.130 --> 00:00:34.810
landscape is transforming right beneath us. I

00:00:34.810 --> 00:00:36.590
am really looking forward to unpacking this together.

00:00:36.829 --> 00:00:38.909
Let's start with OpenAI and model personality

00:00:38.909 --> 00:00:41.869
design choices. For a very long time, the biggest

00:00:41.869 --> 00:00:43.890
user complaint remained the same. It was never really

00:00:43.890 --> 00:00:46.630
about raw speed or overall intelligence. The

00:00:46.630 --> 00:00:48.729
primary issue was simply the underlying tone

00:00:48.729 --> 00:00:51.560
of responses. Yeah. That tone was a massive source

00:00:51.560 --> 00:00:53.899
of daily friction. You would get incredibly long,

00:00:54.060 --> 00:00:56.640
moralizing preambles before straightforward answers.

00:00:56.859 --> 00:00:59.579
It had way too many safety caveats

00:00:59.579 --> 00:01:01.479
programmed in. It would literally tell you to

00:01:01.479 --> 00:01:03.920
stop and take a breath. I still wrestle with

00:01:03.920 --> 00:01:06.900
my AI constantly apologizing to me. It makes

00:01:06.900 --> 00:01:09.579
the entire user experience feel incredibly clunky

00:01:09.579 --> 00:01:11.879
today. You just want the digital machine to do

00:01:11.879 --> 00:01:14.879
its job. Users want a highly capable digital

00:01:14.879 --> 00:01:18.239
tool, not a life coach. OpenAI just released

00:01:18.239 --> 00:01:21.340
GPT-5.3 Instant to finally fix this problem.

00:01:21.599 --> 00:01:23.799
This new update is heavily designed to sound

00:01:23.799 --> 00:01:26.540
incredibly natural. It answers your direct questions

00:01:26.540 --> 00:01:29.200
and drops the awkwardness entirely. But does

00:01:29.200 --> 00:01:31.420
the underlying statistical data actually support

00:01:31.420 --> 00:01:34.340
that claim? I am always highly skeptical of these

00:01:34.340 --> 00:01:36.959
sweeping software promises. They often promise

00:01:36.959 --> 00:01:39.560
massive behavioral improvements that feel largely

00:01:39.560 --> 00:01:41.859
invisible. Surprisingly, the hard empirical data

00:01:41.859 --> 00:01:44.379
actually backs their corporate claims. They achieved

00:01:44.379 --> 00:01:48.439
26.8% fewer hallucinations overall today. That

00:01:48.439 --> 00:01:51.019
specifically applies to live web results globally.

00:01:51.640 --> 00:01:53.859
That is a pretty massive leap in daily system

00:01:53.859 --> 00:01:57.560
reliability. What exactly counts as a hallucination

00:01:57.560 --> 00:02:00.299
in this specific context? When AI makes up fake

00:02:00.299 --> 00:02:02.519
facts instead of saying, I don't know, it just

00:02:02.519 --> 00:02:05.219
completely destroys foundational user trust in

00:02:05.219 --> 00:02:08.620
the system. They also saw a 19.7% reliability

00:02:08.620 --> 00:02:12.139
improvement internally. Plus, direct user feedback

00:02:12.139 --> 00:02:15.740
showed 22.5% fewer hallucinations globally.

00:02:16.180 --> 00:02:18.479
So the previous model version is stepping aside

00:02:18.479 --> 00:02:22.330
very soon. GPT-5.2 Instant is officially retiring

00:02:22.330 --> 00:02:25.490
from service on June 3rd. That feels incredibly

00:02:25.490 --> 00:02:27.569
fast for a complete software model lifecycle.

00:02:27.949 --> 00:02:30.629
It really is an aggressively fast digital lifecycle

00:02:30.629 --> 00:02:33.129
these days. And they are already teasing

00:02:33.129 --> 00:02:36.569
the mysterious GPT-5.4 release timeline. The

00:02:36.569 --> 00:02:38.449
release schedule just keeps accelerating faster

00:02:38.449 --> 00:02:41.449
than anyone expected. I want

00:02:41.449 --> 00:02:43.030
to ask you about the underlying software engineering.

00:02:43.389 --> 00:02:46.849
Why is reducing these AI refusals so technically

00:02:46.849 --> 00:02:49.729
difficult today? Well, it is a profound daily

00:02:49.729 --> 00:02:52.270
balance of safety and helpfulness. If you remove

00:02:52.270 --> 00:02:54.469
the guardrails entirely, the AI breaks down.

00:02:54.610 --> 00:02:57.009
It might generate dangerous or highly toxic content

00:02:57.009 --> 00:02:59.509
almost immediately. Teaching it nuanced context

00:02:59.509 --> 00:03:01.990
requires immense, constantly running computational

00:03:01.990 --> 00:03:04.909
effort. So fewer guardrails, but much smarter

00:03:04.909 --> 00:03:08.289
navigation. Precisely. It knows exactly when

00:03:08.289 --> 00:03:11.090
to pump the brakes naturally now. It understands

00:03:11.090 --> 00:03:13.750
human conversation nuance much better than previous

00:03:13.750 --> 00:03:16.740
versions. Let's dive right into a rapid synthesis

00:03:16.740 --> 00:03:20.000
of industry news. We absolutely need to examine

00:03:20.000 --> 00:03:22.139
these completely frictionless workflows today.

00:03:22.740 --> 00:03:24.879
Anthropic just rolled out voice mode for their

00:03:24.879 --> 00:03:27.699
Claude code. You simply speak your commands and

00:03:27.699 --> 00:03:29.560
it writes the code. Right. And that completely

00:03:29.560 --> 00:03:32.340
changes the deep creative flow state. You dictate

00:03:32.340 --> 00:03:34.240
the structural architecture and it writes the

00:03:34.240 --> 00:03:37.110
syntax. We're also seeing incredible platforms

00:03:37.110 --> 00:03:40.370
like GetVictor.com. It acts autonomously across

00:03:40.370 --> 00:03:43.370
over 3,000 different Slack tools. That level

00:03:43.370 --> 00:03:45.430
of autonomous integration is simply staggering

00:03:45.430 --> 00:03:48.789
to consider. Then we have Krisp's accent conversion, working

00:03:48.789 --> 00:03:51.169
on global business communication. It understands

00:03:51.169 --> 00:03:53.569
highly accented speech locally on your personal

00:03:53.569 --> 00:03:56.490
device. You get near zero latency for your global

00:03:56.490 --> 00:03:58.849
video conference calls. Yeah, it removes the

00:03:58.849 --> 00:04:00.530
language barrier from international business

00:04:00.530 --> 00:04:03.830
communication entirely. We also have major developments

00:04:03.830 --> 00:04:05.849
regarding physical hardware processing speed.

00:04:06.250 --> 00:04:09.770
Google introduced Gemini 3.1 Flash-Lite in a

00:04:09.770 --> 00:04:12.449
developer preview today. This new version is

00:04:12.449 --> 00:04:15.610
45% faster than previous iterations. It can

00:04:15.610 --> 00:04:18.310
easily handle 1 million token prompts from users.

00:04:18.629 --> 00:04:20.189
I mean, what does that actually look like in

00:04:20.189 --> 00:04:23.389
practice? Feeding the AI dozens of full books

00:04:23.389 --> 00:04:26.170
all at once. It holds all that dense information

00:04:26.170 --> 00:04:28.850
in its active memory. That completely changes

00:04:28.850 --> 00:04:31.290
how financial analysts review massive data sets.

00:04:31.790 --> 00:04:34.589
Speaking of incredible hardware leaps, Ayar Labs

00:04:34.589 --> 00:04:37.879
raised massive funding. They secured $500 million

00:04:37.879 --> 00:04:40.699
at a huge corporate valuation. They are backed

00:04:40.699 --> 00:04:43.279
by massive industry giants like Nvidia now. Whoa,

00:04:43.540 --> 00:04:46.540
imagine using literal light to wire AI brains

00:04:46.540 --> 00:04:48.860
together. They're replacing traditional copper

00:04:48.860 --> 00:04:51.560
wires with advanced optical connections. Using

00:04:51.560 --> 00:04:53.439
light completely changes the physical limits

00:04:53.439 --> 00:04:55.699
of global computing. We must also address the

00:04:55.699 --> 00:04:58.100
real world safety consequences today. Twitter

00:04:58.100 --> 00:05:00.279
is handing out incredibly harsh platform penalties

00:05:00.279 --> 00:05:02.620
right now. They will suspend anyone for posting

00:05:02.620 --> 00:05:05.470
completely fake AI videos. Deepfakes are becoming

00:05:05.470 --> 00:05:08.649
completely indistinguishable from real

00:05:08.649 --> 00:05:12.850
video footage. We also saw a highly fascinating

00:05:12.850 --> 00:05:15.790
test regarding geopolitical conflicts. Researchers

00:05:15.790 --> 00:05:17.949
recently asked different models about the Iran war

00:05:17.949 --> 00:05:21.389
as a test. One hallucinated while another stayed

00:05:21.389 --> 00:05:24.470
extremely politically cautious and refused. There

00:05:24.470 --> 00:05:27.110
is also incredible drama exploding in the developer

00:05:27.110 --> 00:05:30.899
community. Developers spotted GPT-5.4 inside

00:05:30.899 --> 00:05:33.939
the OpenAI Codex source code. Then the references

00:05:33.939 --> 00:05:36.480
suddenly disappeared completely a few short hours

00:05:36.480 --> 00:05:39.040
later. It genuinely looks like a classic oops

00:05:39.040 --> 00:05:41.680
internal corporate leak. We also have a quick

00:05:41.680 --> 00:05:44.100
update from Deep Personality today. They claim

00:05:44.100 --> 00:05:45.720
to understand you better than a professional

00:05:45.720 --> 00:05:48.860
therapist. They just need 10 interactive conversational

00:05:48.860 --> 00:05:51.180
digital sessions with you. Okay, we have talked

00:05:51.180 --> 00:05:54.300
heavily about voice coding and microchips. We

00:05:54.300 --> 00:05:56.779
discussed light speed connections and fully autonomous

00:05:56.779 --> 00:05:59.819
Slack agent tools. What is the underlying theme

00:05:59.819 --> 00:06:01.860
connecting all these scattered updates? It is

00:06:01.860 --> 00:06:03.959
about removing the barrier between human thought

00:06:03.959 --> 00:06:06.300
and digital execution. You just think or naturally

00:06:06.300 --> 00:06:09.000
speak and the machine acts. It's all about frictionless

00:06:09.000 --> 00:06:13.250
thinking and absolutely zero delays. We will

00:06:13.250 --> 00:06:16.209
be right back after a brief word. [Mid-roll sponsor

00:06:16.209 --> 00:06:19.910
break.] And we are back to our deep dive. Now

00:06:19.910 --> 00:06:22.449
let's confidently move to our final big breakthrough

00:06:22.449 --> 00:06:25.430
today. Alibaba just launched a massive digital

00:06:25.430 --> 00:06:28.449
family of tiny models. It is officially called

00:06:28.449 --> 00:06:33.839
the Qwen 3.5 Small AI Model Series. These models

00:06:33.839 --> 00:06:36.180
are explicitly designed to run completely offline

00:06:36.180 --> 00:06:38.560
locally. They released four specific versions

00:06:38.560 --> 00:06:41.180
tailored for different local hardware. First,

00:06:41.300 --> 00:06:44.259
we clearly have the 0.8 billion version. That

00:06:44.259 --> 00:06:46.079
metric is strictly counting the model's total

00:06:46.079 --> 00:06:48.660
parameters overall. Remind me, what exactly are

00:06:48.660 --> 00:06:51.180
parameters in this specific context? The digital

00:06:51.180 --> 00:06:53.879
brain cells that determine an AI's overall smarts.

00:06:53.879 --> 00:06:56.139
That smallest model is perfectly sized for everyday

00:06:56.139 --> 00:06:58.620
mobile phones. Then they have a 2 billion parameter

00:06:58.620 --> 00:07:01.160
version available now. The 4 billion parameter

00:07:01.160 --> 00:07:03.800
model offers significantly stronger multimodal

00:07:03.800 --> 00:07:05.839
capabilities. You can physically show it a picture

00:07:05.839 --> 00:07:07.860
and it understands. And the 9 billion parameter

00:07:07.860 --> 00:07:10.860
version is the absolute powerhouse. The 9 billion

00:07:10.860 --> 00:07:13.680
parameter model pulled off a historic upset.

00:07:13.920 --> 00:07:17.720
It completely outscored OpenAI's massive GPT-OSS

00:07:17.720 --> 00:07:21.279
120B model in direct testing. It won on

00:07:21.279 --> 00:07:23.620
complex graduate level reasoning and nuanced

00:07:23.620 --> 00:07:26.819
multilingual knowledge. That massive OpenAI model

00:07:26.819 --> 00:07:30.329
is more than 13 times larger. The 4 billion model

00:07:30.329 --> 00:07:33.410
also achieved something absolutely visually spectacular.

00:07:33.810 --> 00:07:37.269
It matched complex visual benchmark scores previously

00:07:37.269 --> 00:07:40.449
requiring massive architecture. Even Elon Musk

00:07:40.449 --> 00:07:42.750
reacted very publicly to this specific release.

00:07:42.990 --> 00:07:45.149
He called the performance a display of impressive

00:07:45.149 --> 00:07:47.689
intelligence density. Another incredibly crucial

00:07:47.689 --> 00:07:50.149
detail is that these models are open weight.

00:07:50.389 --> 00:07:52.629
What does open weight actually mean for the average

00:07:52.629 --> 00:07:54.990
software developer? The model weights are published for free, so anyone

00:07:54.990 --> 00:07:57.860
can tweak the core digital brain. It fundamentally

00:07:57.860 --> 00:08:00.180
changes the raw economics of software development

00:08:00.180 --> 00:08:03.040
today. Right. But there's also highly significant

00:08:03.040 --> 00:08:06.600
human drama surrounding this. Right after Alibaba

00:08:06.600 --> 00:08:08.660
launched this incredibly historic tech, things

00:08:08.660 --> 00:08:11.360
changed. The project's key technical lead suddenly

00:08:11.360 --> 00:08:13.660
stepped down completely today. Colleagues publicly

00:08:13.660 --> 00:08:15.959
stated the sudden corporate move feels like an

00:08:15.959 --> 00:08:18.680
end. They openly called his highly sudden departure

00:08:18.680 --> 00:08:21.680
the end of an era. Severe burnout in the artificial

00:08:21.680 --> 00:08:24.379
intelligence industry is very real. The immense

00:08:24.379 --> 00:08:26.939
pressure to constantly release new models breaks

00:08:26.939 --> 00:08:30.300
people. Why does a tiny model

00:08:30.300 --> 00:08:33.340
beating a massive server-side model actually matter? Why

00:08:33.340 --> 00:08:35.779
should the completely average listener care about

00:08:35.779 --> 00:08:38.740
this upset? Think deeply about the physical laptop

00:08:38.740 --> 00:08:41.740
sitting on your desk. This breakthrough forcefully moves complex

00:08:41.740 --> 00:08:44.419
data processing directly onto personal devices.

00:08:44.720 --> 00:08:47.220
You easily gain immense computing capability

00:08:47.220 --> 00:08:50.059
without sacrificing any privacy. So ultimate

00:08:50.059 --> 00:08:52.039
power is shifting directly into our pockets.

00:08:52.159 --> 00:08:55.059
Yes. It beautifully democratizes widespread access

00:08:55.059 --> 00:08:58.419
to absolute cutting-edge reasoning logic. Anyone

00:08:58.419 --> 00:09:00.720
anywhere with a basic computer essentially has

00:09:00.720 --> 00:09:03.139
a private genius. Let's deliberately take a moment

00:09:03.139 --> 00:09:06.059
to tie these themes together. AI is finally

00:09:06.059 --> 00:09:09.019
shedding its awkward, apologizing phase completely

00:09:09.019 --> 00:09:11.779
today. It is integrating flawlessly into our

00:09:11.779 --> 00:09:14.340
daily hardware and workflows. And most importantly,

00:09:14.559 --> 00:09:16.980
the entire physical technology stack is shrinking.

00:09:17.220 --> 00:09:20.299
We historically used to rely entirely on giant,

00:09:20.360 --> 00:09:23.240
expensive cloud models. Now we are completely

00:09:23.240 --> 00:09:25.679
transitioning to quiet, highly dense software

00:09:25.679 --> 00:09:28.360
locally. These incredibly hyper-intelligent

00:09:28.360 --> 00:09:30.620
models will soon live entirely on your phone.

00:09:30.960 --> 00:09:33.539
We deeply want to leave you with a final provocative

00:09:33.539 --> 00:09:36.990
thought. If a 9 billion parameter model on a

00:09:36.990 --> 00:09:39.889
laptop wins completely today, what happens in

00:09:39.889 --> 00:09:42.090
two years when your offline pocket device magically

00:09:42.090 --> 00:09:45.269
evolves? What if it easily outsmarts the absolutely

00:09:45.269 --> 00:09:48.309
greatest supercomputers existing today? How does

00:09:48.309 --> 00:09:50.470
a world teeming with private pocket geniuses

00:09:50.470 --> 00:09:53.269
change collaboration? That is a truly wild future

00:09:53.269 --> 00:09:55.429
to imagine living in. Thank you deeply for taking

00:09:55.429 --> 00:09:57.909
this highly fascinating deep dive with us. Stay

00:09:57.909 --> 00:10:00.549
endlessly curious, keep bravely exploring, and

00:10:00.549 --> 00:10:02.389
we will see you next time. [Outro music]
