WEBVTT

00:00:00.000 --> 00:00:02.220
I want you to picture an extremely sophisticated

00:00:02.220 --> 00:00:05.940
mind, like a super genius. Okay. They're trying

00:00:05.940 --> 00:00:08.019
to solve these really complex logic problems,

00:00:08.160 --> 00:00:10.980
but first they have to run a full detailed calculation

00:00:10.980 --> 00:00:13.380
just to remember what the capital of France is.

00:00:13.539 --> 00:00:15.599
Every single time. Every single time. It's an

00:00:15.599 --> 00:00:18.839
absurd waste of energy. It's totally counterintuitive,

00:00:18.839 --> 00:00:21.739
isn't it? But that strange inefficiency, that

00:00:21.739 --> 00:00:24.399
repetitive brute force thinking just to recall

00:00:24.399 --> 00:00:27.980
simple facts. That's been a foundational limit

00:00:27.980 --> 00:00:30.640
for large language models since day one. They

00:00:30.640 --> 00:00:34.979
just lack a reliable built-in memory. Okay,

00:00:35.039 --> 00:00:37.280
so let's unpack this using the sources you shared.

00:00:37.439 --> 00:00:40.539
For years, the story of AI has been all about

00:00:40.539 --> 00:00:43.399
scaling up, right? More data, more parameters.

00:00:43.500 --> 00:00:45.439
More processing power. Exactly. We've been focused

00:00:45.439 --> 00:00:48.759
on just increasing the raw brain size. But today...

00:00:48.960 --> 00:00:51.020
We're seeing this fascinating shift. The sources

00:00:51.020 --> 00:00:52.799
are pointing toward the focus moving from that

00:00:52.799 --> 00:00:55.619
inefficient, constant processing. To extremely

00:00:55.619 --> 00:00:58.560
efficient architectural remembering. Right. Absolutely.

00:00:58.600 --> 00:01:01.700
And our mission today is to cut through the noise

00:01:01.700 --> 00:01:04.319
and give you those vital nuggets from this evolving

00:01:04.319 --> 00:01:07.319
landscape. This deep dive is all about specialization,

00:01:07.500 --> 00:01:10.400
personalization, and efficiency. So here's the

00:01:10.400 --> 00:01:12.640
roadmap for you. First, we're going to look at

00:01:12.640 --> 00:01:15.420
how OpenAI is finally seriously challenging Google

00:01:15.420 --> 00:01:17.689
Translate. And they're doing it by injecting

00:01:17.689 --> 00:01:20.569
emotional and tonal context into it. Then we'll

00:01:20.569 --> 00:01:22.510
do a quick synthesis of the market. We're talking

00:01:22.510 --> 00:01:26.049
predicted mega IPOs, major corporate shifts,

00:01:26.170 --> 00:01:29.670
and a critical revaluation of data. And finally,

00:01:29.730 --> 00:01:31.450
we'll dive deep into that breakthrough memory

00:01:31.450 --> 00:01:33.969
architecture we just mentioned, DeepSeek's Engram

00:01:33.969 --> 00:01:36.290
system, which really changes the game by giving

00:01:36.290 --> 00:01:39.590
LLMs a true functional memory bank. Let's start

00:01:39.590 --> 00:01:42.650
with translation. For what, over a decade, Google

00:01:42.650 --> 00:01:44.969
Translate has just been the default. Oh, yeah.

00:01:45.049 --> 00:01:47.260
It's functional. But everyone knows the output

00:01:47.260 --> 00:01:51.099
often sounds, well, sterile. Yeah. Mechanical.

00:01:51.180 --> 00:01:54.260
And now OpenAI has launched ChatGPT Translate,

00:01:54.359 --> 00:01:57.359
and it is a direct challenge with a major twist.

00:01:57.560 --> 00:01:59.180
And that twist is? They aren't just trying to

00:01:59.180 --> 00:02:00.700
provide a basic language conversion. They're

00:02:00.700 --> 00:02:03.280
targeting the professional need for nuance. Right.

00:02:03.319 --> 00:02:05.859
The core innovation is that this translator lets

00:02:05.859 --> 00:02:08.460
you customize the output tone right from the

00:02:08.460 --> 00:02:12.800
interface. That flexibility is the real aha moment

00:02:12.800 --> 00:02:14.780
here, isn't it? Especially for anyone who needs

00:02:14.780 --> 00:02:18.490
to bridge cultural gaps or handle sensitive communication?

00:02:18.870 --> 00:02:21.849
It's tone-aware translation. It completely changes

00:02:21.849 --> 00:02:24.069
the standard for quality. So you could give it

00:02:24.069 --> 00:02:27.250
an example. Sure. You can input a complex sentence,

00:02:27.530 --> 00:02:30.469
say a response to a client complaint, and tell

00:02:30.469 --> 00:02:32.990
the AI, translate this into German, but make

00:02:32.990 --> 00:02:36.280
it sound... highly formal like a university professor

00:02:36.280 --> 00:02:38.639
writing a business contract. Wow. Or you can

00:02:38.639 --> 00:02:40.919
flip that completely and ask it to translate

00:02:40.919 --> 00:02:42.939
something into Spanish and make it sound, you

00:02:42.939 --> 00:02:44.939
know, casual like a text message between friends.

00:02:45.180 --> 00:02:47.300
This just highlights how mechanical the old systems

00:02:47.300 --> 00:02:49.180
are. Google Translate is basically operating

00:02:49.180 --> 00:02:51.960
on this massive statistical map. Yeah, it finds

00:02:51.960 --> 00:02:54.159
the most likely linguistic equivalent. But it

00:02:54.159 --> 00:02:56.289
was never built... to understand the emotional

00:02:56.289 --> 00:02:58.610
or professional intent behind the words. And

00:02:58.610 --> 00:03:01.310
that's the competitive barrier. The sources suggest

00:03:01.310 --> 00:03:04.110
this new service, which is likely powered by

00:03:04.110 --> 00:03:07.930
GPT-4 Turbo, handles over 50 languages. And

00:03:07.930 --> 00:03:10.009
it works right on mobile browsers. Right, with

00:03:10.009 --> 00:03:13.789
text or voice. For big tech, this is all about

00:03:13.789 --> 00:03:17.349
targeting that next layer of value that goes

00:03:17.349 --> 00:03:20.270
beyond just basic utility. When you think about

00:03:20.270 --> 00:03:23.810
the user base, I mean... Business users, students

00:03:23.810 --> 00:03:26.530
drafting papers, international creators. Yeah.

00:03:26.939 --> 00:03:29.219
They all need translations that don't sound like

00:03:29.219 --> 00:03:31.159
they came from a machine. A contract that sounds

00:03:31.159 --> 00:03:33.919
robotic isn't just awkward. No, it could actually

00:03:33.919 --> 00:03:36.159
hurt a deal. Exactly. It saves you that extra

00:03:36.159 --> 00:03:39.060
step of translating, then manually rewriting

00:03:39.060 --> 00:03:41.500
the whole thing to sound natural. It delivers

00:03:41.500 --> 00:03:44.219
context and quality at the same time. But wait,

00:03:44.280 --> 00:03:46.439
if the user is just typing in a prompt like make

00:03:46.439 --> 00:03:48.580
it sound casual, isn't that essentially just

00:03:48.580 --> 00:03:51.060
a packaged prompt engineering trick? Why is this

00:03:51.060 --> 00:03:52.979
a major step forward for translation models?

00:03:53.199 --> 00:03:55.020
That's a really good question. It's more than

00:03:55.020 --> 00:03:57.439
just a trick. The model itself is trained to

00:03:57.439 --> 00:04:00.240
recognize this huge spectrum of linguistic features,

00:04:00.379 --> 00:04:03.360
things like register, formality markers, cultural

00:04:03.360 --> 00:04:05.580
norms. Things that are usually invisible to a

00:04:05.580 --> 00:04:08.199
simple statistical engine. Precisely. It's embedding

00:04:08.199 --> 00:04:10.819
that contextual awareness into the translation

00:04:10.819 --> 00:04:14.159
process itself. So the translated language structure

00:04:14.159 --> 00:04:17.139
reflects the requested tone, not just the word

00:04:17.139 --> 00:04:21.310
choice. So how will this focus on tone fundamentally

00:04:21.310 --> 00:04:23.810
change professional communication across different

00:04:23.810 --> 00:04:26.569
languages? It ensures translated content conveys

00:04:26.569 --> 00:04:29.550
human intent, not just literal word equivalence.

00:04:29.670 --> 00:04:32.050
Okay, let's pivot to the broader AI ecosystem.

00:04:32.350 --> 00:04:34.470
Let's do it. We'll look at shifts in productivity

00:04:34.470 --> 00:04:37.269
and personalization that affect you daily, and

00:04:37.269 --> 00:04:39.709
then at the massive capital that's driving this

00:04:39.709 --> 00:04:42.689
entire sector. On the personalization side, we're

00:04:42.689 --> 00:04:45.930
seeing users find really clever ways to integrate

00:04:45.930 --> 00:04:49.389
AI into their workflows. The sources highlighted

00:04:49.389 --> 00:04:52.310
a trending... thread on Reddit where a user curated

00:04:52.310 --> 00:04:54.910
15 of what they called golden prompts over six

00:04:54.910 --> 00:04:57.449
months. And they claim this saved them 10 to

00:04:57.449 --> 00:04:59.949
15 hours every single week. Prompt engineering

00:04:59.949 --> 00:05:02.069
or prompt hacking, as some people are calling

00:05:02.069 --> 00:05:04.129
it, is definitely becoming a high value skill.

00:05:04.310 --> 00:05:06.689
It really is. But the providers are also doing

00:05:06.689 --> 00:05:08.769
the hard work to bake that personalization right

00:05:08.769 --> 00:05:11.529
into the product itself. And that's where Google's

00:05:11.529 --> 00:05:13.610
personal intelligence in Gemini comes in. Right.

00:05:13.790 --> 00:05:16.149
This is integrating the model directly with a

00:05:16.149 --> 00:05:19.680
user's personal ecosystem: your Gmail, your photos,

00:05:19.920 --> 00:05:23.699
your YouTube data, to offer these hyper-customized

00:05:23.699 --> 00:05:25.980
answers. So instead of asking a general question,

00:05:26.259 --> 00:05:28.480
you can ask something like, based on my Gmail,

00:05:28.699 --> 00:05:31.360
summarize the three key action items from the

00:05:31.360 --> 00:05:33.680
Smith meeting last week. And then find the flight

00:05:33.680 --> 00:05:36.259
receipt from my photos folder. Exactly. Their

00:05:36.259 --> 00:05:38.360
pitch is basically, we're saving you a headache

00:05:38.829 --> 00:05:41.350
by making the AI an extension of your digital

00:05:41.350 --> 00:05:44.709
memory. And that focus on seamless integration

00:05:44.709 --> 00:05:48.029
is also happening so fast inside corporations.

00:05:48.689 --> 00:05:51.810
Microsoft, for instance, just shut down its massive

00:05:51.810 --> 00:05:54.610
internal employee library. I saw that, including

00:05:54.610 --> 00:05:57.290
subscriptions they've used for decades. And they

00:05:57.290 --> 00:05:59.490
replaced that traditional resource with an AI

00:05:59.490 --> 00:06:01.959
-powered skilling hub. Just think about the message

00:06:01.959 --> 00:06:04.560
that sends. Institutional knowledge is no longer

00:06:04.560 --> 00:06:07.560
static documents in a folder somewhere. No. It's

00:06:07.560 --> 00:06:10.180
a dynamic conversational resource managed by

00:06:10.180 --> 00:06:13.160
an AI. That's a fundamental change. Now, moving

00:06:13.160 --> 00:06:16.579
to the money side. The legal drama between Altman

00:06:16.579 --> 00:06:19.540
and Musk, that's still unfolding. New court documents

00:06:19.540 --> 00:06:22.040
just dropped. But the market valuation story

00:06:22.040 --> 00:06:24.819
is what's truly staggering. I mean, a New York

00:06:24.819 --> 00:06:27.600
Times report suggests 2026 is shaping up to be

00:06:27.600 --> 00:06:29.920
the year of the... What do they call it? The

00:06:29.920 --> 00:06:33.439
$2 trillion mega IPO. That's wild. We're talking

00:06:33.439 --> 00:06:36.860
about OpenAI, Anthropic, and SpaceX potentially

00:06:36.860 --> 00:06:38.959
hitting the public markets almost at the same

00:06:38.959 --> 00:06:41.519
time. That would just redefine global tech investment.

00:06:41.699 --> 00:06:44.019
And investors are still betting big on specialized

00:06:44.019 --> 00:06:47.360
sectors. The AI video startup Higgsfield just

00:06:47.360 --> 00:06:50.800
hit a $1.3 billion valuation after a new funding

00:06:50.800 --> 00:06:53.220
round. Which shows massive confidence in video

00:06:53.220 --> 00:06:55.360
generation as a real market, you know, despite

00:06:55.360 --> 00:06:57.519
the ongoing debates about quality. But the most

00:06:57.519 --> 00:06:59.540
fascinating shift, the one that speaks directly

00:06:59.540 --> 00:07:02.100
to the long term cost of all this. It involves

00:07:02.100 --> 00:07:04.959
Wikipedia. This is a huge signal that the old

00:07:04.959 --> 00:07:07.860
scraping era is ending. A hundred percent. Wikipedia

00:07:07.860 --> 00:07:10.620
has now signed actual payment deals with major

00:07:10.620 --> 00:07:14.579
tech companies, Microsoft, Meta, even Perplexity.

00:07:14.600 --> 00:07:17.040
Giving them access to their content for AI training.

00:07:17.279 --> 00:07:20.480
So big tech, which for years just ingested vast

00:07:20.480 --> 00:07:23.439
amounts of content for free, is now paying a

00:07:23.439 --> 00:07:27.759
premium for quality verified source data. This

00:07:27.759 --> 00:07:30.000
changes the entire economics of training models.

00:07:30.259 --> 00:07:33.379
Whoa. I mean, imagine scaling your company to

00:07:33.379 --> 00:07:36.060
hundreds of millions, even a billion queries

00:07:36.060 --> 00:07:38.879
without ever paying for your core source knowledge.

00:07:39.079 --> 00:07:41.560
That was the Internet model. And it seems that

00:07:41.560 --> 00:07:44.300
model is finally broken. Yeah, it suggests the

00:07:44.300 --> 00:07:47.420
market has realized that freely scraped, noisy

00:07:47.420 --> 00:07:50.839
data is just far less valuable than curated,

00:07:50.980 --> 00:07:54.199
clean data. Quality is now a scarce and valuable

00:07:54.199 --> 00:07:57.350
commodity. So what does big tech paying Wikipedia

00:07:57.350 --> 00:08:00.089
tell us about the fundamental long term value

00:08:00.089 --> 00:08:02.970
of quality training data? Quality training data

00:08:02.970 --> 00:08:05.509
is transitioning from a free resource into a

00:08:05.509 --> 00:08:07.990
critical, expensive commodity. OK, so let's get

00:08:07.990 --> 00:08:09.569
back to that technical challenge we introduced

00:08:09.569 --> 00:08:11.569
at the top. The inefficiency problem. Right.

00:08:11.629 --> 00:08:14.529
Why do these massive models waste so much compute

00:08:14.529 --> 00:08:16.649
power just remembering how to spell basic words

00:08:16.649 --> 00:08:19.029
or recall simple facts? Traditionally, the race

00:08:19.029 --> 00:08:21.430
has been all about increasing the context window,

00:08:21.629 --> 00:08:24.009
you know, making the workspace bigger. We focus

00:08:24.009 --> 00:08:26.610
on how much data the model can see at one time.

00:08:27.000 --> 00:08:28.899
But you're right. The problem isn't the size

00:08:28.899 --> 00:08:31.560
of the room. It's the filing system inside the

00:08:31.560 --> 00:08:34.779
model. Precisely. They are constantly recalculating

00:08:34.779 --> 00:08:37.700
facts because they lack an indexed built-in

00:08:37.700 --> 00:08:40.320
memory structure that separates that rote knowledge

00:08:40.320 --> 00:08:42.700
from complex reasoning. This is where DeepSeek

00:08:42.700 --> 00:08:45.460
comes in. Enter DeepSeek. They introduced the

00:08:45.460 --> 00:08:48.159
memory lookup module for LLMs, and they named

00:08:48.159 --> 00:08:51.240
it Engram. So Engram is essentially a real memory

00:08:51.240 --> 00:08:53.879
system. A conditional one, yeah. It lets the

00:08:53.879 --> 00:08:56.879
model know... when to think deeply and when to

00:08:56.879 --> 00:08:59.600
just retrieve a fact. How does that actually

00:08:59.600 --> 00:09:01.840
work? Well, think of it this way. The model's

00:09:01.840 --> 00:09:03.840
deep neural network is like a master sculptor.

00:09:03.860 --> 00:09:06.139
It takes time and compute to create a new statue.

00:09:06.340 --> 00:09:08.840
Okay. But if you just want to reproduce the same

00:09:08.840 --> 00:09:11.480
simple flower over and over, you don't re-sculpt

00:09:11.480 --> 00:09:14.720
it every time. You use a mold, a stamp. Engram

00:09:14.720 --> 00:09:17.559
uses something called engram embeddings. And

00:09:17.559 --> 00:09:19.440
for our listeners, how would you define engram

00:09:19.440 --> 00:09:22.019
embeddings in simple terms? Engram embeddings

00:09:22.019 --> 00:09:25.259
are basically compressed, pre-packaged digital

00:09:25.259 --> 00:09:28.460
representations of common phrases or facts. They're

00:09:28.460 --> 00:09:30.940
like pre-made, organized stamps you can grab

00:09:30.940 --> 00:09:33.720
instantly. So instead of rebuilding the answer

00:09:33.720 --> 00:09:36.750
through complex processing, it just performs

00:09:36.750 --> 00:09:39.909
a single, lightning-fast, one-step lookup for

00:09:39.909 --> 00:09:41.669
that fact. Hang on a minute. If they're just

00:09:41.669 --> 00:09:45.230
prepackaged, compressed phrase stamps, as you

00:09:45.230 --> 00:09:47.750
put it, isn't that just a really optimized caching

00:09:47.750 --> 00:09:50.470
layer? Why is DeepSeek calling this revolutionary?

00:09:51.240 --> 00:09:53.600
Because it fundamentally changes how the model

00:09:53.600 --> 00:09:55.740
uses its depth. It's not just caching external

00:09:55.740 --> 00:09:58.220
data. It's offloading the internal memory function

00:09:58.220 --> 00:10:01.200
that was previously eating up the network's reasoning

00:10:01.200 --> 00:10:04.440
capacity. I see. That instant retrieval frees

00:10:04.440 --> 00:10:06.919
up the core model's depth for actual complicated

00:10:06.919 --> 00:10:10.700
tasks. The math, the code generation, the truly

00:10:10.700 --> 00:10:13.720
novel logic. It stops simple recall from clogging

00:10:13.720 --> 00:10:16.779
up the intellectual arteries of the neural network.

00:10:16.919 --> 00:10:18.519
That's a good way to put it. It separates the

00:10:18.519 --> 00:10:20.580
dictionary from the logic center. And DeepSeek's

00:10:20.580 --> 00:10:22.320
research showed something critical here. What

00:10:22.320 --> 00:10:24.259
was that? They found that general model performance

00:10:24.259 --> 00:10:27.259
actually dips significantly when you push models

00:10:27.259 --> 00:10:29.700
to a massive scale if memory isn't handled correctly.

00:10:30.179 --> 00:10:32.700
But by offloading that rote memory work to the

00:10:32.700 --> 00:10:36.259
Engram system, performance rebounds. A lot.

00:10:36.480 --> 00:10:39.870
And this whole tension between using deep processing

00:10:39.870 --> 00:10:43.549
for complex logic versus a quick lookup for simple

00:10:43.549 --> 00:10:46.690
facts, that balance is known as sparsity allocation,

00:10:47.029 --> 00:10:49.899
right? How you intelligently distribute computational

00:10:49.899 --> 00:10:52.779
effort. Exactly. It's one of the biggest efficiency

00:10:52.779 --> 00:10:56.039
and cost problems in AI today. On a human level,

00:10:56.220 --> 00:10:58.039
you know, when we try to juggle too much information

00:10:58.039 --> 00:11:00.559
in our heads, we get context collapse. I mean,

00:11:00.580 --> 00:11:02.720
I still wrestle with prompt drift myself when

00:11:02.720 --> 00:11:04.480
I overload a context window on a challenging

00:11:04.480 --> 00:11:06.620
task. That's relatable. If you can offload the

00:11:06.620 --> 00:11:08.399
simple stuff, you can focus on the deeper thought.

00:11:08.700 --> 00:11:10.879
So the question now is whether the major players

00:11:10.879 --> 00:11:13.240
like OpenAI and Google will actually commit to

00:11:13.240 --> 00:11:15.940
this memory-first architecture. Will they adopt

00:11:15.940 --> 00:11:18.299
Engram or something like it? Or will they

00:11:18.299 --> 00:11:20.980
just keep doubling down on sheer scale, hoping

00:11:20.980 --> 00:11:23.639
massive compute brute forces the memory problem?

00:11:23.980 --> 00:11:25.960
If you can remember simple facts efficiently,

00:11:26.279 --> 00:11:29.759
you dramatically reduce the effort

00:11:29.759 --> 00:11:31.899
needed for that constant brute force thinking.

00:11:32.159 --> 00:11:35.159
It changes the entire economic model for future

00:11:35.159 --> 00:11:38.100
development. So will this new memory architecture

00:11:38.100 --> 00:11:41.159
fundamentally change how all models are developed

00:11:41.159 --> 00:11:44.039
going forward? If AI can remember efficiently,

00:11:44.399 --> 00:11:47.440
less effort is needed for continuous expensive

00:11:47.440 --> 00:11:50.100
brute force processing. So what does this all

00:11:50.100 --> 00:11:52.820
mean for you, the learner? We've covered an AI

00:11:52.820 --> 00:11:55.539
landscape that is rapidly moving away from these

00:11:55.539 --> 00:11:58.519
generalized, one-size-fits-all tools. And towards

00:11:58.519 --> 00:12:01.679
specialized tone aware services that prioritize

00:12:01.679 --> 00:12:04.200
human nuance. Yeah. And we watched the market

00:12:04.200 --> 00:12:06.700
finally recognize and actually start paying for

00:12:06.700 --> 00:12:09.700
the true value of quality data. The overarching

00:12:09.700 --> 00:12:12.940
trend here is clearly specialization, personalization,

00:12:13.080 --> 00:12:16.080
and a relentless push for efficiency. AI is getting

00:12:16.080 --> 00:12:18.179
smarter, but not just by getting bigger, like,

00:12:18.200 --> 00:12:20.200
you know, stacking Lego blocks of data. Right.

00:12:20.340 --> 00:12:22.740
It's getting smarter by optimizing its architecture

00:12:22.740 --> 00:12:25.039
and using its resources more intelligently, which

00:12:25.039 --> 00:12:26.940
is exactly what we saw with DeepSeek's Engram.

00:12:27.259 --> 00:12:29.600
If architectural breakthroughs like Engram

00:12:29.600 --> 00:12:32.460
become the standard and they solve this massive

00:12:32.460 --> 00:12:34.480
cost problem with fact retrieval and memory,

00:12:34.720 --> 00:12:37.500
the next great limit might not be data access

00:12:37.500 --> 00:12:41.080
or raw compute. It might be economic. How so?

00:12:41.279 --> 00:12:43.700
Well, if efficiency makes the cost of running

00:12:43.700 --> 00:12:47.159
these LLMs just collapse, does AI stop being

00:12:47.159 --> 00:12:49.539
a specialized tool for giant corporations and

00:12:49.539 --> 00:12:52.120
become a true utility accessible to every single

00:12:52.120 --> 00:12:54.279
person on Earth? That's something for you to

00:12:54.279 --> 00:12:56.279
consider as these systems evolve. And I'd encourage

00:12:56.279 --> 00:12:58.240
you to look into the concept of sparsity allocation,

00:12:58.580 --> 00:13:01.039
how the AI allocates its effort, or even just

00:13:01.039 --> 00:13:03.220
test out that new tone-aware translation feature

00:13:03.220 --> 00:13:05.539
yourself. You'll really see the difference between

00:13:05.539 --> 00:13:08.279
mechanical output and contextual understanding.

00:13:08.720 --> 00:13:10.460
Thanks for diving deep into your sources with

00:13:10.460 --> 00:13:11.799
us. We'll catch you on the next deep dive.
