1
00:00:00,000 --> 00:00:12,640
Welcome to the Azure Security Podcast where we discuss topics relating to security, privacy,

2
00:00:12,640 --> 00:00:16,520
reliability and compliance on the Microsoft Cloud Platform.

3
00:00:16,520 --> 00:00:20,900
Hey everybody, welcome to Episode 68.

4
00:00:20,900 --> 00:00:22,400
This is one of those special episodes.

5
00:00:22,400 --> 00:00:27,080
It's only myself this week, Michael, and I have a guest, AJ Jagannathan, who's here

6
00:00:27,080 --> 00:00:32,280
to talk to us about the release of Microsoft SQL Server 2022.

7
00:00:32,280 --> 00:00:37,080
But before we get to AJ, I actually do have a couple of little news items that are related

8
00:00:37,080 --> 00:00:38,960
to SQL Server.

9
00:00:38,960 --> 00:00:44,920
The first is that we now have a STIG for Azure SQL Database.

10
00:00:44,920 --> 00:00:49,520
Now, STIG stands for Security Technical Implementation Guides.

11
00:00:49,520 --> 00:00:53,720
They're created by a group of folks called DISSA, and they create most of this documentation

12
00:00:53,720 --> 00:00:55,800
for the Department of Defense.

13
00:00:55,800 --> 00:00:59,160
But they're also available for commercial use as well.

14
00:00:59,160 --> 00:01:01,400
They're peer reviewed.

15
00:01:01,400 --> 00:01:05,040
They generally tend to be really good quality, but it's great to see something finally coming

16
00:01:05,040 --> 00:01:07,880
out for Azure SQL Database.

17
00:01:07,880 --> 00:01:13,960
The other thing I want to just want to touch on briefly is my colleague, Andreas Walter,

18
00:01:13,960 --> 00:01:18,920
has written a blog post called Troubleshooting External Data and Access Policies in Azure

19
00:01:18,920 --> 00:01:21,440
SQL and SQL Server.

20
00:01:21,440 --> 00:01:26,120
This is actually a much needed post.

21
00:01:26,120 --> 00:01:29,520
Reviewing access policies in SQL Server is not easy.

22
00:01:29,520 --> 00:01:31,400
I'm going to be honest with you.

23
00:01:31,400 --> 00:01:34,040
It's very worthwhile read.

24
00:01:34,040 --> 00:01:38,320
Andreas also touches on integration with Purview.

25
00:01:38,320 --> 00:01:43,920
We'll talk a little bit more about that as we continue through this podcast.

26
00:01:43,920 --> 00:01:45,600
Those are the news items that I have.

27
00:01:45,600 --> 00:01:48,680
Now let's switch our attention to our guest.

28
00:01:48,680 --> 00:01:53,640
This week we have AJ Jaganathan who's here to talk to us, as I mentioned, about SQL Server

29
00:01:53,640 --> 00:01:54,640
2022.

30
00:01:54,640 --> 00:01:55,640
AJ, welcome to the podcast.

31
00:01:55,640 --> 00:01:59,680
We'd like to just spend a quick moment and give our listeners a bit of background on

32
00:01:59,680 --> 00:02:00,680
yourself.

33
00:02:00,680 --> 00:02:01,680
Thanks, Michael.

34
00:02:01,680 --> 00:02:06,960
It's my pleasure to be here speaking to you and sharing some knowledge with the audience.

35
00:02:06,960 --> 00:02:08,200
Hey, folks.

36
00:02:08,200 --> 00:02:09,400
My name is AJ Jaganathan.

37
00:02:09,400 --> 00:02:14,240
I'm a Principal Group Program Manager in the Azure Data Team in Microsoft.

38
00:02:14,240 --> 00:02:20,840
My team is responsible for building SQL Server and hybrid services and helping our customers

39
00:02:20,840 --> 00:02:25,000
successfully deploy this platform for their applications.

40
00:02:25,000 --> 00:02:28,560
We just released SQL Server 2022, literally, two or three weeks ago.

41
00:02:28,560 --> 00:02:31,480
There was a SQL Pass Summit in Seattle.

42
00:02:31,480 --> 00:02:35,800
I was actually a presenter there talking about funding security.

43
00:02:35,800 --> 00:02:40,160
That was the day we actually announced the general availability of SQL Server 2022.

44
00:02:40,160 --> 00:02:46,640
AJ, why don't you spend a few moments and give us an overview of the big changes and

45
00:02:46,640 --> 00:02:50,120
the big messages that we have around SQL Server 2022?

46
00:02:50,120 --> 00:02:55,520
Let me do a quick rundown of some of the major features of SQL Server 2022 and why we believe

47
00:02:55,520 --> 00:02:59,360
customers would love to upgrade to this release.

48
00:02:59,360 --> 00:03:01,480
It all starts with connecting to Azure.

49
00:03:01,480 --> 00:03:07,360
We invested in several scenarios like SQL Minage Instance for disaster recovery, near

50
00:03:07,360 --> 00:03:13,120
real-time analytics with Synapse Link, as well as insights lineage and policy management

51
00:03:13,120 --> 00:03:14,960
with Microsoft Purview.

52
00:03:14,960 --> 00:03:20,800
The second area is, as Michael said, security is always very critical.

53
00:03:20,800 --> 00:03:26,320
SQL continues to be the least vulnerable database in the industry with the latest published

54
00:03:26,320 --> 00:03:32,760
numbers from NIST, as well as investments like Ledger and Purview that you heard Michael

55
00:03:32,760 --> 00:03:33,760
mentioned.

56
00:03:33,760 --> 00:03:39,040
Finally, performance and availability is something we take pride on.

57
00:03:39,040 --> 00:03:45,680
SQL ranked number one in the latest benchmark that HPE published for SQL Server 2022 for

58
00:03:45,680 --> 00:03:48,680
the data warehouse non-clustered workloads.

59
00:03:48,680 --> 00:03:54,520
You'll also see a ton of innovations in the built-in query intelligence which make your

60
00:03:54,520 --> 00:03:58,560
apps run faster without any code changes.

61
00:03:58,560 --> 00:04:04,160
Is this a short summary of some of the key innovations of SQL Server 2022?

62
00:04:04,160 --> 00:04:11,840
When we look at a major area of innovation, as you mentioned, security, one of the big

63
00:04:11,840 --> 00:04:19,280
ones is we're starting to see the first efforts around integrating with Purview for managing

64
00:04:19,280 --> 00:04:21,880
roles but also just data governance in general.

65
00:04:21,880 --> 00:04:26,880
Do you mind just spending just a few moments, just explain what that means for people, what

66
00:04:26,880 --> 00:04:27,880
that means for customers?

67
00:04:27,880 --> 00:04:30,320
Why should they even consider using Purview?

68
00:04:30,320 --> 00:04:34,680
Over time, what we have seen is, I mean, the attack surface area has just increased.

69
00:04:34,680 --> 00:04:40,240
Customers are finding it very hard to keep up with keeping their environments secure.

70
00:04:40,240 --> 00:04:42,320
The environments are hybrid in nature.

71
00:04:42,320 --> 00:04:46,880
You may have some deployments in your own data center, some in cloud environments, whether

72
00:04:46,880 --> 00:04:49,600
it's in Azure or other clouds.

73
00:04:49,600 --> 00:04:55,680
Customers are really looking for central experience, a single pane of glass if you will, to manage

74
00:04:55,680 --> 00:04:58,320
such large estates.

75
00:04:58,320 --> 00:05:00,360
This is where Purview comes along.

76
00:05:00,360 --> 00:05:06,520
Microsoft Purview introduces this capability to centrally govern and manage on-premises

77
00:05:06,520 --> 00:05:09,400
SQL Server in addition to other versions.

78
00:05:09,400 --> 00:05:15,120
We go all the way back to SQL Server 2005 on the on-premises version as well as the cloud

79
00:05:15,120 --> 00:05:16,640
deployments too.

80
00:05:16,640 --> 00:05:23,240
It helps you with data cataloging, scanning, classification, setting up access policies,

81
00:05:23,240 --> 00:05:24,240
etc.

82
00:05:24,240 --> 00:05:34,280
It gives IT administrator or database administrator easy to use system to manage their environment.

83
00:05:34,280 --> 00:05:38,240
The second part of Purview is role-based access control.

84
00:05:38,240 --> 00:05:41,040
This helps with permissions management.

85
00:05:41,040 --> 00:05:46,680
This has been a longstanding ask for providing more granular scope through policy-based management

86
00:05:46,680 --> 00:05:54,120
where customers can set up policies for different users and roles for their SQL Server environments.

87
00:05:54,120 --> 00:05:58,760
You can apply this at scale instead of having to run scripts individually at each server.

88
00:05:58,760 --> 00:06:04,000
This works seamlessly across multiple instances where, let's say if you want to give temporary

89
00:06:04,000 --> 00:06:08,760
access, for example, if you have hired a consultant on a short-term basis and you want to give

90
00:06:08,760 --> 00:06:14,600
them temporary access to a production environment, you could easily do that through data read

91
00:06:14,600 --> 00:06:17,240
policy and modify option is coming soon.

92
00:06:17,240 --> 00:06:23,720
Or if you also have folks who want to do DevOps policies where you want to give them temporary

93
00:06:23,720 --> 00:06:29,360
access to production environment for, let's say, performance monitoring or troubleshooting,

94
00:06:29,360 --> 00:06:34,240
you could set up this policy through Purview without having to create specific logins and

95
00:06:34,240 --> 00:06:36,800
permissions within SQL Server.

96
00:06:36,800 --> 00:06:43,440
This works through the Azure Active Directory integration and this policy can push to multiple

97
00:06:43,440 --> 00:06:48,360
instances of SQL Server from a single place.

98
00:06:48,360 --> 00:06:53,640
It's also easy to revoke this policy once the time is elapsed.

99
00:06:53,640 --> 00:06:58,960
It simplifies the entire security management from a single place.

100
00:06:58,960 --> 00:07:03,080
That's the key innovation we are bringing to customers who are on SQL workloads.

101
00:07:03,080 --> 00:07:09,160
AJ, we touched on Purview, but there are some big themes that we tried to address and we

102
00:07:09,160 --> 00:07:11,880
did address in SQL Server 22.

103
00:07:11,880 --> 00:07:13,120
What were those themes?

104
00:07:13,120 --> 00:07:14,120
Yeah.

105
00:07:14,120 --> 00:07:20,280
When we look at SQL Server 2022 capabilities, we have five major categories.

106
00:07:20,280 --> 00:07:23,360
The first one, as I mentioned, is Cloud Connected.

107
00:07:23,360 --> 00:07:27,880
This is where we want to bring customers the power of Azure to their SQL Server environment,

108
00:07:27,880 --> 00:07:31,760
wherever they are running, whether it's in Azure or outside Azure.

109
00:07:31,760 --> 00:07:37,520
As Cloud continues to innovate at a very fast pace, a lot of our customers, we talk to,

110
00:07:37,520 --> 00:07:42,400
want to get the same innovations for their SQL Server environments.

111
00:07:42,400 --> 00:07:47,400
With SQL Server 2022, you will see a lot of innovations that bring SQL Server closer to

112
00:07:47,400 --> 00:07:49,360
Azure.

113
00:07:49,360 --> 00:07:53,280
One of them is the managed DR using managed instance, where you can directly connect your

114
00:07:53,280 --> 00:07:58,640
SQL Server instance to an Azure SQL managed instance.

115
00:07:58,640 --> 00:08:03,320
Once the link is established, you will be able to use the managed instance as a disaster

116
00:08:03,320 --> 00:08:09,600
recovery site for your SQL Server, or you can also offload your reporting workloads as

117
00:08:09,600 --> 00:08:13,640
a read scale out thereby freeing up resources on SQL Server.

118
00:08:13,640 --> 00:08:16,280
The third scenario where this helps us with migrations.

119
00:08:16,280 --> 00:08:22,240
At some point, let's say you have established a link and you're using the managed instance

120
00:08:22,240 --> 00:08:26,440
extensively and you decide to migrate to Azure, which all you need to do is initiate

121
00:08:26,440 --> 00:08:32,720
a failover and it's now running primary on instance on your managed instance.

122
00:08:32,720 --> 00:08:36,200
Multiple scenarios available through this feature.

123
00:08:36,200 --> 00:08:39,760
The second big one is analytics.

124
00:08:39,760 --> 00:08:44,560
With innovations and Synapse, customers are using this more and more.

125
00:08:44,560 --> 00:08:51,680
How do we connect because operational data sitting in SQL Server with analytics running

126
00:08:51,680 --> 00:08:52,680
in Synapse?

127
00:08:52,680 --> 00:08:55,360
Synapse links helps with that.

128
00:08:55,360 --> 00:09:00,200
Once you establish the sync, we will automatically synchronize the tables that you identify to

129
00:09:00,200 --> 00:09:05,560
the landing zone in Synapse and then make it available in the Synapse workspace, which

130
00:09:05,560 --> 00:09:13,080
then exposes you to do near real time analytics using Power BI directly from Synapse Studio.

131
00:09:13,080 --> 00:09:18,000
Not only that, any changes that happen to the table on your on-premises SQL Server automatically

132
00:09:18,000 --> 00:09:24,120
replicated to your Synapse environment, so you have the latest and greatest data available

133
00:09:24,120 --> 00:09:27,360
quickly for you to do near real time analytics.

134
00:09:27,360 --> 00:09:31,680
The second pillar is built-in query intelligence.

135
00:09:31,680 --> 00:09:37,120
This is one of the biggest ask customers have is when they want to upgrade or modernize to

136
00:09:37,120 --> 00:09:42,480
Azure and moving to a newer version, how do we make sure that their applications don't

137
00:09:42,480 --> 00:09:44,440
have any performance issues?

138
00:09:44,440 --> 00:09:48,520
End of day, customers care about performance as a key fundamental.

139
00:09:48,520 --> 00:09:51,080
We have invested both in Query Store.

140
00:09:51,080 --> 00:09:55,920
We have made it by default now because we have the confidence and learnings from Azure

141
00:09:55,920 --> 00:09:59,280
SQL again where millions of databases have this turned on.

142
00:09:59,280 --> 00:10:05,160
With the power of Query Store and the investments in Intelligent QP, we have addressed some

143
00:10:05,160 --> 00:10:12,080
of the most common pain points either by just upgrading or whether you want to run on the

144
00:10:12,080 --> 00:10:18,400
latest database compact level, you will be able to get the best features available.

145
00:10:18,400 --> 00:10:20,440
Let me highlight a couple of them.

146
00:10:20,440 --> 00:10:24,240
One is the parameter sensitive plan optimization.

147
00:10:24,240 --> 00:10:27,920
Many customers today see this problem where you are running the same stored procedure

148
00:10:27,920 --> 00:10:33,520
with a different set of parameters and each one fetches a different set of rows because

149
00:10:33,520 --> 00:10:36,840
the data is queued.

150
00:10:36,840 --> 00:10:41,920
The first user's plan is cached, so the second query will reuse the same plan which may not

151
00:10:41,920 --> 00:10:44,200
be the optimal plan for that query.

152
00:10:44,200 --> 00:10:49,120
For example, if the first one uses a seek and the second one needs a scan, you may have

153
00:10:49,120 --> 00:10:52,000
undesirable impact.

154
00:10:52,000 --> 00:10:54,400
This is called parameter sniffing problem.

155
00:10:54,400 --> 00:11:00,880
With SQL 2022, we have addressed this where we can cast multiple plans for the same parameterized

156
00:11:00,880 --> 00:11:01,880
queries.

157
00:11:01,880 --> 00:11:08,320
So thereby, each query or each user gets the right version of the plan according to the

158
00:11:08,320 --> 00:11:10,480
number of rows that it needs.

159
00:11:10,480 --> 00:11:15,680
There are other similar features like the degree of parallelism feedback where customers have

160
00:11:15,680 --> 00:11:21,720
a challenge on setting the right max DOP value, whether at the server level, database level,

161
00:11:21,720 --> 00:11:23,600
or even at the query level.

162
00:11:23,600 --> 00:11:29,600
So with this feedback mechanism, we are able to dynamically adjust the right DOP for your

163
00:11:29,600 --> 00:11:36,080
query by reducing the DOP but yet trying to achieve the same performance using a learning

164
00:11:36,080 --> 00:11:39,360
mechanism via belt leveraging the query store.

165
00:11:39,360 --> 00:11:44,920
All these feedback features rely on query store where we persist all the knowledge we

166
00:11:44,920 --> 00:11:47,000
gain as we run the queries.

167
00:11:47,000 --> 00:11:52,880
So it also has an added benefit that if the server crashes or restarts or if your query

168
00:11:52,880 --> 00:11:57,840
plan is evicted from cache, it can be retreat quickly from the query store so you don't

169
00:11:57,840 --> 00:12:01,160
have to pay the cost of recompiling.

170
00:12:01,160 --> 00:12:06,720
The third pillar I will talk about is scalability and availability.

171
00:12:06,720 --> 00:12:12,800
As the data volume increases and the size of machines that customers run, SQL grows,

172
00:12:12,800 --> 00:12:18,760
like very large memory systems, it's very important that we can scale in these environments.

173
00:12:18,760 --> 00:12:26,160
So we have more than 25 plus features in the core engine which will help in this case.

174
00:12:26,160 --> 00:12:32,160
One of the examples is parallel buffer pool scan where we can scan large memory footprints

175
00:12:32,160 --> 00:12:37,000
where most of the memory is used by the buffer pool to scan the memory quickly and retrieve

176
00:12:37,000 --> 00:12:38,760
the right data for the query.

177
00:12:38,760 --> 00:12:44,480
On the availability side, to maintain the highest availability, we have invested in some

178
00:12:44,480 --> 00:12:50,840
of the areas like contained availability groups which help you to include system databases

179
00:12:50,840 --> 00:12:52,400
as part of your failover.

180
00:12:52,400 --> 00:12:59,440
So thereby it's easiest administration and setup of your availability environment, HADR.

181
00:12:59,440 --> 00:13:05,200
You don't have to separately set up master or MSDB then user, they can all combine them.

182
00:13:05,200 --> 00:13:10,480
The other one is we partnered with Intel to introduce the quick assist technology for

183
00:13:10,480 --> 00:13:15,120
backup compression, backup is a very CPU intensive task.

184
00:13:15,120 --> 00:13:21,760
So by offloading this task to the secondary chip that Intel quick assist technology provides,

185
00:13:21,760 --> 00:13:26,320
the SQL Server Core CPU is freed up for user workloads.

186
00:13:26,320 --> 00:13:33,720
So these are some of the few examples of innovations in the availability and scalability pillar.

187
00:13:33,720 --> 00:13:36,540
The next one is data virtualization.

188
00:13:36,540 --> 00:13:37,920
This is actually a big one.

189
00:13:37,920 --> 00:13:43,720
It started with SQL Server 2016 with PolyBase and we continue to expand this.

190
00:13:43,720 --> 00:13:48,600
What we have realized is in today's world data is residing not just in a SQL Server database

191
00:13:48,600 --> 00:13:56,760
but also in other non-relational formats like 4k files, delta files, CSV text files.

192
00:13:56,760 --> 00:14:02,480
So how do we provide a way for customers to connect their SQL data with this data set

193
00:14:02,480 --> 00:14:09,400
without having to do expensive ETL or moving the data into a database in a format that

194
00:14:09,400 --> 00:14:10,400
SQL understands.

195
00:14:10,400 --> 00:14:16,600
So data virtualization helps address that problem by expanding the PolyBase example.

196
00:14:16,600 --> 00:14:23,080
We are also integrating with S3 API, which is an open source API.

197
00:14:23,080 --> 00:14:28,920
This exposes to many more open source formats that are available in the industry.

198
00:14:28,920 --> 00:14:36,480
And now customers can access data stored in Azure Data Lake or Blobstore or other S3 providers.

199
00:14:36,480 --> 00:14:42,840
This technology can also be leveraged for backup restore with two URL with S3 API where

200
00:14:42,840 --> 00:14:48,320
you can use this REST API for backing up your critical database assets.

201
00:14:48,320 --> 00:14:51,160
And the last bucket of investments is the T-SQL.

202
00:14:51,160 --> 00:14:56,320
T-SQL is the most popular language for SQL administrators.

203
00:14:56,320 --> 00:15:01,480
We are expanding the T-SQL in the area of JSON for customers to operate with JSON type

204
00:15:01,480 --> 00:15:02,480
data.

205
00:15:02,480 --> 00:15:08,480
We have added several string data and windowing functions based on customer and community

206
00:15:08,480 --> 00:15:09,480
feedback.

207
00:15:09,480 --> 00:15:14,200
And then we are bringing a lot of these time series functions that were introduced in Azure

208
00:15:14,200 --> 00:15:22,400
SQL Edge to SQL customers so they can operate on IoT or time series kind of data sets.

209
00:15:22,400 --> 00:15:28,600
So this talked about a lot of features, but this in essence is some of the key pillars.

210
00:15:28,600 --> 00:15:31,680
We have to continue to invest and grow in SQL server.

211
00:15:31,680 --> 00:15:33,600
So the last area is security.

212
00:15:33,600 --> 00:15:35,200
Security is also another core fundamental.

213
00:15:35,200 --> 00:15:41,440
It's actually table stakes as customers have data that expands into multiple scenarios

214
00:15:41,440 --> 00:15:44,600
and then the attack vectors increase.

215
00:15:44,600 --> 00:15:47,120
Data is the most critical asset for any company.

216
00:15:47,120 --> 00:15:51,720
So we continue to be an industry leader in the security space.

217
00:15:51,720 --> 00:15:54,760
And I will let Michael highlight some of the investments here.

218
00:15:54,760 --> 00:15:56,960
Just listening to you just then was really, really kind of interesting.

219
00:15:56,960 --> 00:16:02,160
And I made this comment actually when we released SQL Server 2022 when I was at the keynote.

220
00:16:02,160 --> 00:16:05,240
And it's funny because I'm obviously like the security nerd.

221
00:16:05,240 --> 00:16:06,240
This is a big joke, right?

222
00:16:06,240 --> 00:16:11,280
I'm the security nerd and a bunch of database engineers.

223
00:16:11,280 --> 00:16:15,120
And so I made the comment saying, yeah, I kind of understand the words, but I'm not

224
00:16:15,120 --> 00:16:17,080
sure about the sentences.

225
00:16:17,080 --> 00:16:22,120
And yeah, there's still a lot that I have to learn about databases in general, but this

226
00:16:22,120 --> 00:16:23,120
is great to see.

227
00:16:23,120 --> 00:16:26,560
So yeah, security, there's a lot of big changes that we've made.

228
00:16:26,560 --> 00:16:30,600
Yeah, so one of them we've already touched on and that's Microsoft Perview integration.

229
00:16:30,600 --> 00:16:36,040
I want to quickly go through the list of items and I'll explain each in a bit more detail.

230
00:16:36,040 --> 00:16:39,920
So the first one is Microsoft Defender for cloud integration, Perview we've talked about,

231
00:16:39,920 --> 00:16:40,920
Ledger.

232
00:16:40,920 --> 00:16:45,200
So we actually had Peter Van Hoeven here a few weeks ago talking about Ledger, very

233
00:16:45,200 --> 00:16:47,040
cool technology.

234
00:16:47,040 --> 00:16:52,000
As your Active Directory authentication, more importantly, client authentication.

235
00:16:52,000 --> 00:16:53,000
Always encrypted with secure enclaves.

236
00:16:53,000 --> 00:16:55,160
There's been some changes there as well.

237
00:16:55,160 --> 00:16:57,280
Access control around permissions.

238
00:16:57,280 --> 00:16:59,760
Access control server level roles.

239
00:16:59,760 --> 00:17:02,720
Some changes to dynamic data masking.

240
00:17:02,720 --> 00:17:03,720
Support for PFX.

241
00:17:03,720 --> 00:17:05,240
There's actually a funny little story in there.

242
00:17:05,240 --> 00:17:11,160
And then finally, actually one of my favorite changes is support for Microsoft TDS 8.0, which

243
00:17:11,160 --> 00:17:13,160
is Tabular Data Stream 8.0.

244
00:17:13,160 --> 00:17:17,840
And I'll talk about that because that has some really important TLS implications.

245
00:17:17,840 --> 00:17:20,320
So let me start again at the very top and go through some of these things.

246
00:17:20,320 --> 00:17:23,120
And AJ, if you have any comments to add to each of these, let me know.

247
00:17:23,120 --> 00:17:25,800
All right, so the first one is Microsoft Defender for cloud integration.

248
00:17:25,800 --> 00:17:29,320
So when everyone thinks of like Microsoft Defender for cloud, you think about, you know, just

249
00:17:29,320 --> 00:17:30,320
Azure.

250
00:17:30,320 --> 00:17:33,680
The actual fact, Defender for cloud doesn't just work for Azure.

251
00:17:33,680 --> 00:17:41,320
It works for assets and, for example, AWS and GCP, but also works for on-prem items as

252
00:17:41,320 --> 00:17:42,320
well.

253
00:17:42,320 --> 00:17:44,320
So for example, SQL Server 2022.

254
00:17:44,320 --> 00:17:50,040
So now we can actually use Microsoft Defender for cloud, you know, with your SQL Server databases

255
00:17:50,040 --> 00:17:55,480
or SQL Server servers, I should say, on-prem or in VMs for that matter.

256
00:17:55,480 --> 00:17:56,980
Microsoft Perview we already talked about.

257
00:17:56,980 --> 00:17:59,120
So I'm not going to go through that in any more detail.

258
00:17:59,120 --> 00:18:02,160
Another feature, Ledger for SQL Server.

259
00:18:02,160 --> 00:18:05,760
This is, as I mentioned, we had Peter Van Hove on a few weeks ago, actually a few months

260
00:18:05,760 --> 00:18:08,240
ago now, to go over Ledger.

261
00:18:08,240 --> 00:18:13,680
And this is a way of providing cryptographic proof that data has not been tampered with.

262
00:18:13,680 --> 00:18:18,200
Now, the most important part of that is that those hashes, which are actually called Merkle

263
00:18:18,200 --> 00:18:21,680
trees, should be stored on something that is temper resistant.

264
00:18:21,680 --> 00:18:27,840
So for example, you could use immutable storage and like as your storage account.

265
00:18:27,840 --> 00:18:30,040
You could also use confidential ledger.

266
00:18:30,040 --> 00:18:34,120
Now confidential ledger is not the same as Ledger for SQL Server.

267
00:18:34,120 --> 00:18:38,080
Similar ideas, but a confidential ledger came out of the confidential computing team.

268
00:18:38,080 --> 00:18:39,760
It's more of a general purpose ledger.

269
00:18:39,760 --> 00:18:41,920
It's not just for databases.

270
00:18:41,920 --> 00:18:45,040
We can store the Merkle trees in Ledger if needed.

271
00:18:45,040 --> 00:18:46,040
Yeah.

272
00:18:46,040 --> 00:18:51,720
Just to add on to your point, Michael, one of the value prop of bringing into SQL is the

273
00:18:51,720 --> 00:18:54,480
simplicity of using this through SQL.

274
00:18:54,480 --> 00:19:00,120
SQL is a very complex technology and it's a decentralized storage and interacting with

275
00:19:00,120 --> 00:19:01,120
it.

276
00:19:01,120 --> 00:19:03,640
It requires some programming skills.

277
00:19:03,640 --> 00:19:09,520
Whereas with SQL ledger, you just enable it at the database level and then the engine

278
00:19:09,520 --> 00:19:11,080
automatically takes care of it.

279
00:19:11,080 --> 00:19:17,800
And then we have many views and functions to help customers to look at their ledger data

280
00:19:17,800 --> 00:19:21,840
and then verify the integrity of the data and such.

281
00:19:21,840 --> 00:19:27,960
So definitely looking forward to our customers using it for both auditing scenarios or use

282
00:19:27,960 --> 00:19:33,560
cases where they have multi-party business processes where, for example, like a supply

283
00:19:33,560 --> 00:19:39,960
chain management system where multiple organizations share the business process with each other,

284
00:19:39,960 --> 00:19:46,960
but you want to make sure as the data flows, it's completely trusted and has not been tampered

285
00:19:46,960 --> 00:19:47,960
with.

286
00:19:47,960 --> 00:19:53,240
So it's a very good use case and we already have customers like Lenovo who have been using

287
00:19:53,240 --> 00:19:57,840
this in Azure and now we're bringing the same experience to SQL Server.

288
00:19:57,840 --> 00:19:58,840
Yeah.

289
00:19:58,840 --> 00:20:01,560
Peter and I spoke at length, actually.

290
00:20:01,560 --> 00:20:06,800
We were together at the past summit in Seattle and we spoke at length about Ledger and there's

291
00:20:06,800 --> 00:20:11,000
some other aspects that aren't really being talked about with Ledger that much.

292
00:20:11,000 --> 00:20:14,440
The first one is, okay, we throw around the word blockchain a lot, but I want people to

293
00:20:14,440 --> 00:20:16,520
understand that it's actually not a blockchain technology.

294
00:20:16,520 --> 00:20:21,120
It's actually not using signatures and blocks in the same way the blockchain does.

295
00:20:21,120 --> 00:20:27,680
It's symmetric and the strength of its security comes from storing the hashes in immutable

296
00:20:27,680 --> 00:20:29,480
storage.

297
00:20:29,480 --> 00:20:36,600
The complete linchpin is storing it in immutable storage, but there is also a huge upside to

298
00:20:36,600 --> 00:20:37,800
the way we have it implemented.

299
00:20:37,800 --> 00:20:42,800
And as you mentioned, AJ, the fact that you can just create a table and go with Ledger

300
00:20:42,800 --> 00:20:46,480
equals on and that's basically it.

301
00:20:46,480 --> 00:20:48,240
That's the extent of configuration ultimately.

302
00:20:48,240 --> 00:20:52,080
I mean, other than setting up a storage account, it's really that simple.

303
00:20:52,080 --> 00:20:54,720
But the other aspect is that everything is symmetric.

304
00:20:54,720 --> 00:21:00,840
So it's hashes as opposed to signatures and hashes are a lot more energy efficient than

305
00:21:00,840 --> 00:21:04,520
signatures like substantially so.

306
00:21:04,520 --> 00:21:07,160
So this is not just faster than blockchain.

307
00:21:07,160 --> 00:21:10,960
It's also a lot more energy efficient than blockchain and that's a big deal, especially

308
00:21:10,960 --> 00:21:13,640
if you're doing this thing, there's kind of stuff at scale.

309
00:21:13,640 --> 00:21:17,200
Next one is Azure Active Directory authentication and this is for clients authentication.

310
00:21:17,200 --> 00:21:19,200
You may think, well, hang on, don't we already have that?

311
00:21:19,200 --> 00:21:24,640
Yeah, we do in Azure SQL Database and then Azure SQL MI, but not in SQL Server.

312
00:21:24,640 --> 00:21:29,400
So you can actually have an on-prem system or a VM running SQL Server 2022 and you can

313
00:21:29,400 --> 00:21:34,040
now authenticate your clients using Azure Active Directory.

314
00:21:34,040 --> 00:21:37,280
You can also use things like managed identities if you're running in a VM.

315
00:21:37,280 --> 00:21:38,400
You can do all those kinds of things.

316
00:21:38,400 --> 00:21:39,920
So this is really cool.

317
00:21:39,920 --> 00:21:43,480
This honestly from my perspective as a security guy, this is a huge, huge, huge, huge, huge

318
00:21:43,480 --> 00:21:44,480
win.

319
00:21:44,480 --> 00:21:48,720
I think this is a very important direction for our database products.

320
00:21:48,720 --> 00:21:55,320
We really want to push people to using Azure Active Directory primarily because, one, the

321
00:21:55,320 --> 00:21:59,360
client doesn't have to persist any kind of credential to connect to SQL Server.

322
00:21:59,360 --> 00:22:01,080
That's number one.

323
00:22:01,080 --> 00:22:05,760
Number two, you can start taking advantage of other defensive mechanisms that we have

324
00:22:05,760 --> 00:22:08,560
in Azure Active Directory.

325
00:22:08,560 --> 00:22:15,640
So for example, things like we can detect dodgy sign-ins enforcing credential policies,

326
00:22:15,640 --> 00:22:16,640
those kinds of things.

327
00:22:16,640 --> 00:22:21,920
So that's why you really need to be thinking about Azure Active Directory for client authentication

328
00:22:21,920 --> 00:22:24,240
into SQL Server.

329
00:22:24,240 --> 00:22:30,520
And Azure SQL Database, we still support on-prem, for example, UID and password.

330
00:22:30,520 --> 00:22:33,680
But the problem is that password, you have to persist it somewhere and that's what the

331
00:22:33,680 --> 00:22:35,120
attack is going to go after.

332
00:22:35,120 --> 00:22:38,680
As with AAD authentication, there is no credential, right?

333
00:22:38,680 --> 00:22:39,880
We don't persist it anywhere.

334
00:22:39,880 --> 00:22:43,720
And if using like a managed identity in Azure, the credential is actually managed completely

335
00:22:43,720 --> 00:22:44,720
by AAD.

336
00:22:44,720 --> 00:22:49,600
So it's just better all around from a security posture perspective.

337
00:22:49,600 --> 00:22:53,960
Next one is always encrypted with secure enclave, some small changes here.

338
00:22:53,960 --> 00:22:57,440
We now support join, group, buy and order buy.

339
00:22:57,440 --> 00:23:01,920
And we now support text columns using UTF-8 collations.

340
00:23:01,920 --> 00:23:05,240
So historically, we didn't support UTF-8, turns out a lot of our customers are using

341
00:23:05,240 --> 00:23:09,080
UTF-8 for collations in SQL Server.

342
00:23:09,080 --> 00:23:14,320
So collations are just like the essentially what are you using to store your data so we

343
00:23:14,320 --> 00:23:18,720
can do comparisons and orderings and so on because different alphabets have different

344
00:23:18,720 --> 00:23:19,920
orderings.

345
00:23:19,920 --> 00:23:22,840
And so collations help you with that.

346
00:23:22,840 --> 00:23:27,800
Is that a fair definition, AJ, of collations or do you have a much better, more specific

347
00:23:27,800 --> 00:23:30,240
example of what a collation really is?

348
00:23:30,240 --> 00:23:37,920
Yeah, collation actually helps SQL Server engine understand the data in the table.

349
00:23:37,920 --> 00:23:42,000
It helps influence query results.

350
00:23:42,000 --> 00:23:49,080
If your data is following this collation, what sort order I need to use because if you

351
00:23:49,080 --> 00:23:56,480
have a case-sensitive collation, as an example, you need to make sure when you're on the query

352
00:23:56,480 --> 00:24:00,680
and you have an order buy, it follows the rules of a case-sensitive collation.

353
00:24:00,680 --> 00:24:04,000
Or if it's a case-insensitive collation, then it doesn't matter.

354
00:24:04,000 --> 00:24:07,800
So it's basically a way for operating on string data.

355
00:24:07,800 --> 00:24:09,360
So I wasn't completely wrong then.

356
00:24:09,360 --> 00:24:10,360
That's good.

357
00:24:10,360 --> 00:24:15,400
So yeah, now we support UTF-8 collations and there's also been some performance improvements

358
00:24:15,400 --> 00:24:17,200
as well and always encrypted.

359
00:24:17,200 --> 00:24:20,280
This is always encrypted with secure enclaves.

360
00:24:20,280 --> 00:24:22,160
The second two areas are somewhat related.

361
00:24:22,160 --> 00:24:24,560
They're both to do with access control.

362
00:24:24,560 --> 00:24:30,520
The first is we have some new granular permissions that help improve principle release privilege.

363
00:24:30,520 --> 00:24:32,360
Look, I'm going to be honest with you.

364
00:24:32,360 --> 00:24:41,560
There's a lot of privileges and roles in SQL Server that are not granular and you end

365
00:24:41,560 --> 00:24:43,920
up granting someone a particular permission.

366
00:24:43,920 --> 00:24:48,280
The next thing, they got access to stuff beyond what they really should have access to.

367
00:24:48,280 --> 00:24:54,720
So we spent quite a bit of time trying to reduce the scope of some of those permissions.

368
00:24:54,720 --> 00:25:00,840
Again, this has been a lot of work done by Andreas Walter and he's writing a lot of information

369
00:25:00,840 --> 00:25:03,560
about some of these new permissions.

370
00:25:03,560 --> 00:25:07,760
We also in the area of access control have new server-level roles.

371
00:25:07,760 --> 00:25:11,320
These are, again, to help with principle release privilege.

372
00:25:11,320 --> 00:25:16,760
So these are roles that provide very, very specific roles, funny enough, rather than

373
00:25:16,760 --> 00:25:20,040
having a grab bag of roles that a user might have.

374
00:25:20,040 --> 00:25:24,320
So again, it's all about granting people or entities specifically what they need access

375
00:25:24,320 --> 00:25:28,000
to, but this is at a server level.

376
00:25:28,000 --> 00:25:30,640
Next one is around dynamic data masking.

377
00:25:30,640 --> 00:25:34,960
I'm going to start off by saying that dynamic data masking really isn't security, but if

378
00:25:34,960 --> 00:25:37,600
you've got to pick a bucket, this will be the bucket it would go into.

379
00:25:37,600 --> 00:25:39,880
It's all about just masking data, right?

380
00:25:39,880 --> 00:25:43,200
So for example, social security numbers only show, like say for example, the last four

381
00:25:43,200 --> 00:25:45,360
digits or something like that.

382
00:25:45,360 --> 00:25:49,160
We now have much more granular unmasked permissions.

383
00:25:49,160 --> 00:25:53,920
So you can set permissions like at the schema level all the way down to individual tables

384
00:25:53,920 --> 00:25:56,160
and obviously columns as well.

385
00:25:56,160 --> 00:25:58,040
But this is really great to see as well.

386
00:25:58,040 --> 00:26:01,840
The second to last one is support for PFX.

387
00:26:01,840 --> 00:26:05,680
Okay, so there's actually a really interesting story about behind this.

388
00:26:05,680 --> 00:26:10,520
Ajay, I think you heard the story, right, about how I got involved in this, the PFX

389
00:26:10,520 --> 00:26:11,520
stuff.

390
00:26:11,520 --> 00:26:12,520
Do you hear about that?

391
00:26:12,520 --> 00:26:16,400
Yep, the previous person left the company and you had to take over.

392
00:26:16,400 --> 00:26:19,120
Yeah, so it came down to the wire, right?

393
00:26:19,120 --> 00:26:21,920
I'm getting really, really close to shipping SQL Server 2022.

394
00:26:21,920 --> 00:26:26,320
And one of the things you have to get done is support has to sign off on all the new

395
00:26:26,320 --> 00:26:28,920
features because they have to be able to support it.

396
00:26:28,920 --> 00:26:29,920
It makes absolute sense, right?

397
00:26:29,920 --> 00:26:31,960
If they can't support it, then we can't ship it.

398
00:26:31,960 --> 00:26:36,320
Well, it turns out the person who had worked on this originally had left the company and

399
00:26:36,320 --> 00:26:40,200
so there was no support documentation for PFX support.

400
00:26:40,200 --> 00:26:43,160
But I'll explain what PFX is in a minute.

401
00:26:43,160 --> 00:26:47,560
And so my manager came to me and he says, do you know PFX?

402
00:26:47,560 --> 00:26:49,080
I'm like, yeah, I know PFX.

403
00:26:49,080 --> 00:26:54,000
He said, well, do you mind writing the support documentation for SQL Server 2022 so that

404
00:26:54,000 --> 00:26:55,000
we can ship?

405
00:26:55,000 --> 00:26:57,920
I'm like, yeah, I guess.

406
00:26:57,920 --> 00:27:03,000
So I knew PFX from, so jokingly, I said to him, I said, it's funny, you should bring

407
00:27:03,000 --> 00:27:07,960
that up because, yeah, I worked on PFX because I was in IIS, right, in our web server.

408
00:27:07,960 --> 00:27:12,600
Of course I know PFX because we support the ability to import and export certificates

409
00:27:12,600 --> 00:27:16,160
and private keys, but that was 20 years ago.

410
00:27:16,160 --> 00:27:21,000
So congratulations on finally supporting in SQL Server 2022.

411
00:27:21,000 --> 00:27:26,480
So what PFX is, it's just an industry standard format for, it's called PKCS 12 as well, is

412
00:27:26,480 --> 00:27:33,760
just a way of exporting and importing certificates and private keys and potentially certificate

413
00:27:33,760 --> 00:27:37,640
hierarchies and private keys in an encrypted blob.

414
00:27:37,640 --> 00:27:38,640
That's basically all of this.

415
00:27:38,640 --> 00:27:43,280
Don't get me wrong, you may think, oh man, does SQL Server not support exporting private

416
00:27:43,280 --> 00:27:44,840
keys in an encrypted form?

417
00:27:44,840 --> 00:27:50,480
Yeah, it always has, but it was a Windows format, not an industry standard format, it's just

418
00:27:50,480 --> 00:27:52,880
something that's been around in Windows forever.

419
00:27:52,880 --> 00:27:54,800
So we now support PFX.

420
00:27:54,800 --> 00:27:59,880
So now that you can export and import using PFX files instead.

421
00:27:59,880 --> 00:28:04,840
Great to see customers that need this are happy that it's finally there, but yeah, so we now

422
00:28:04,840 --> 00:28:06,560
support PFX.

423
00:28:06,560 --> 00:28:11,720
The last one, which is actually probably my favorite of all the features that we've added

424
00:28:11,720 --> 00:28:16,320
to SQL Server 2022 is improved TDS support.

425
00:28:16,320 --> 00:28:22,400
So this is TDS, Tabular DataStream, and we now support version eight.

426
00:28:22,400 --> 00:28:27,440
Now, here's what's really interesting about TDS 8.0.

427
00:28:27,440 --> 00:28:32,640
If you use TDS 8.0, you will have an encrypted channel.

428
00:28:32,640 --> 00:28:34,000
We will use TLS.

429
00:28:34,000 --> 00:28:35,480
There is no, you can't turn that off.

430
00:28:35,480 --> 00:28:42,480
If you're going to opt in for using TDS 8.0, then the channel will use TLS.

431
00:28:42,480 --> 00:28:44,400
That can't be turned off.

432
00:28:44,400 --> 00:28:49,120
Now the other major change that we made here, which I think is incredibly important is that

433
00:28:49,120 --> 00:28:55,160
historically TDS and TLS or whatever was used to protect the channel and provide server

434
00:28:55,160 --> 00:28:58,480
authentication were intermingled.

435
00:28:58,480 --> 00:28:59,800
Now we have broken them apart.

436
00:28:59,800 --> 00:29:03,280
So TDS 8.0 and TLS is completely separate.

437
00:29:03,280 --> 00:29:09,200
This is really important because it allows us to easily upgrade the TLS protocol versions

438
00:29:09,200 --> 00:29:12,680
without having to jump through weird and funky hoops.

439
00:29:12,680 --> 00:29:17,040
That's why like today, the older version of SQL Server can't use TLS 1.3.

440
00:29:17,040 --> 00:29:18,840
They just can't.

441
00:29:18,840 --> 00:29:19,840
Now we can.

442
00:29:19,840 --> 00:29:25,960
So SQL Server 2022 does support TLS 1.3, but only when you're using TDS 8.0.

443
00:29:25,960 --> 00:29:31,840
The other nice thing is that it allows, say, a firewall to do packet inspection because

444
00:29:31,840 --> 00:29:37,080
you could actually terminate the TLS connection at the firewall and then do TDS packet inspection

445
00:29:37,080 --> 00:29:38,080
at that point.

446
00:29:38,080 --> 00:29:42,480
So one of the best sort of analogy is think of HTTPS.

447
00:29:42,480 --> 00:29:44,040
It's very, very similar.

448
00:29:44,040 --> 00:29:50,600
What HTTP is to TLS or those two used together, so HTTPS, you can think TDS is very, very

449
00:29:50,600 --> 00:29:52,400
similar to HTTP in this case.

450
00:29:52,400 --> 00:29:55,160
So it's TDS writing over TLS.

451
00:29:55,160 --> 00:29:56,960
So this is great to see.

452
00:29:56,960 --> 00:30:03,200
It allows us to use TLS 1.3 and it also allows us to upgrade to newer protocol versions with

453
00:30:03,200 --> 00:30:08,560
ease relative to prior to TDS 8.

454
00:30:08,560 --> 00:30:10,560
So those are the major features.

455
00:30:10,560 --> 00:30:16,240
Again, I think my favorite is probably the TDS 8 because if you turn on TDS 8, then the

456
00:30:16,240 --> 00:30:17,240
channel is protected.

457
00:30:17,240 --> 00:30:19,240
There's no unprotected channel.

458
00:30:19,240 --> 00:30:21,240
What would be next?

459
00:30:21,240 --> 00:30:22,240
I don't know.

460
00:30:22,240 --> 00:30:24,000
Peter will probably have me say ledger.

461
00:30:24,000 --> 00:30:25,200
I actually really like ledger.

462
00:30:25,200 --> 00:30:29,760
I like ledger a lot because it's just really, really simple.

463
00:30:29,760 --> 00:30:30,760
Verification is simple.

464
00:30:30,760 --> 00:30:32,960
It's easy to use.

465
00:30:32,960 --> 00:30:34,280
It's part of the product.

466
00:30:34,280 --> 00:30:35,600
You don't pay extra for it.

467
00:30:35,600 --> 00:30:38,960
It solves a real world problem, which is great to see.

468
00:30:38,960 --> 00:30:41,960
Do you have any particular favorites in there, AJ?

469
00:30:41,960 --> 00:30:42,960
On the security side?

470
00:30:42,960 --> 00:30:45,640
Well, let's start on the security side and let's just see if you can pick something

471
00:30:45,640 --> 00:30:47,960
out from the other grab bag of stuff I don't understand.

472
00:30:47,960 --> 00:30:48,960
Yeah.

473
00:30:48,960 --> 00:30:55,160
I think on the security side, Purview is interesting because Purview is actually helping

474
00:30:55,160 --> 00:31:02,240
customers manage not just the newer SQL Server version, but it also centrally helps you manage

475
00:31:02,240 --> 00:31:08,120
your entire SQL estate, whether you're on Cloud or on-premises.

476
00:31:08,120 --> 00:31:14,800
On the other side, I really like looking forward to the customers using our innovations

477
00:31:14,800 --> 00:31:20,640
in the Cloud through these Cloud capabilities, both the MI link and Synapse link.

478
00:31:20,640 --> 00:31:22,120
These are some of my favorites.

479
00:31:22,120 --> 00:31:25,440
All right, well, let's bring this episode to a close.

480
00:31:25,440 --> 00:31:27,360
AJ, thanks for joining us this week.

481
00:31:27,360 --> 00:31:32,440
One thing we always ask our guests is if you had just one thought to leave our listeners

482
00:31:32,440 --> 00:31:34,080
with, what would it be?

483
00:31:34,080 --> 00:31:37,200
The thought is we are continuously innovating.

484
00:31:37,200 --> 00:31:44,080
Database SQL Server is one of the most widely used databases in the planet and customers

485
00:31:44,080 --> 00:31:50,600
are learning to adapt to the changing demands of the industry and we are keeping up with

486
00:31:50,600 --> 00:31:51,600
that.

487
00:31:51,600 --> 00:31:57,040
We continue to innovate and lead the industry in many aspects through these innovations.

488
00:31:57,040 --> 00:32:04,880
SQL Server is very much alive and look forward to many more innovations in the upcoming years,

489
00:32:04,880 --> 00:32:09,320
whether through our Cloud options or through on-premises products.

490
00:32:09,320 --> 00:32:12,800
Do provide us the feedback if you have any thoughts or questions.

491
00:32:12,800 --> 00:32:16,560
All right, well, again, AJ, thank you for joining us this week.

492
00:32:16,560 --> 00:32:20,600
I always learn something on each podcast, but it's kind of crazy learning stuff about

493
00:32:20,600 --> 00:32:22,360
the product that you're actually working on.

494
00:32:22,360 --> 00:32:26,960
But anyway, again, I'm quite happy to take on the mantle of being the security nerd amongst

495
00:32:26,960 --> 00:32:28,400
the database engineers.

496
00:32:28,400 --> 00:32:29,400
I'm fine with that.

497
00:32:29,400 --> 00:32:31,520
So, again, thanks for joining us this week.

498
00:32:31,520 --> 00:32:36,120
And to all our listeners out there, we hope you enjoyed this episode.

499
00:32:36,120 --> 00:32:38,840
This is probably going to be the last episode of the year.

500
00:32:38,840 --> 00:32:43,680
So wishing everybody a safe break and we'll see you next time.

501
00:32:43,680 --> 00:32:46,800
Thanks for listening to the Azure Security Podcast.

502
00:32:46,800 --> 00:32:53,640
You can find show notes and other resources at our website azsecuritypodcast.net.

503
00:32:53,640 --> 00:32:58,800
If you have any questions, please find us on Twitter at AzureSecPod.

504
00:32:58,800 --> 00:33:14,840
Music is from ccmixter.com and licensed under the Creative Commons license.

