1
00:00:00,000 --> 00:00:19,120
Hello, and welcome to episode number 11 of the Awesome Algo podcast.

2
00:00:19,120 --> 00:00:26,440
Today's guest is Stefano De Angelis, and he is a solution architect from Algorand with

3
00:00:26,440 --> 00:00:29,480
PhD in cybersecurity.

4
00:00:29,480 --> 00:00:33,960
Stefano has a lot of experience in software engineering and has a very strong academic

5
00:00:33,960 --> 00:00:42,560
background that will allow us to do a very interesting episode that covers a history

6
00:00:42,560 --> 00:00:45,720
of consensus protocols.

7
00:00:45,720 --> 00:00:50,760
To be more precise, I would rather say it's a rather higher level overview of the brief

8
00:00:50,760 --> 00:00:54,600
history because there are just too many awesome things that have happened over the past 50

9
00:00:54,600 --> 00:00:55,600
years.

10
00:00:55,600 --> 00:00:58,400
We can't cover them all, but we'll do our best.

11
00:00:58,400 --> 00:01:02,520
Don't expect us to dive too deep into the details, but essentially the goal of this

12
00:01:02,520 --> 00:01:11,560
episode is to highlight the fascination with the overlap in between the field of distributed

13
00:01:11,560 --> 00:01:17,080
systems and fault-tolerant systems and how both of those contributed a lot to consensus

14
00:01:17,080 --> 00:01:18,960
protocol design.

15
00:01:18,960 --> 00:01:25,360
And with that, I would like to give the stage to Stefano, and let's start with the introduction

16
00:01:25,360 --> 00:01:26,360
essentially.

17
00:01:26,360 --> 00:01:32,080
And if you could tell us and now we'll listen a little bit about yourself and what was sort

18
00:01:32,080 --> 00:01:37,680
of the first domain of computer science that sparked your interest the most.

19
00:01:37,680 --> 00:01:40,840
Yeah, thank you.

20
00:01:40,840 --> 00:01:41,840
Thank you, Hal.

21
00:01:41,840 --> 00:01:46,880
First of all, for inviting me in this great podcast.

22
00:01:46,880 --> 00:01:49,080
I'm super excited.

23
00:01:49,080 --> 00:01:56,760
So yeah, let me give you a brief introduction of myself and also about my background.

24
00:01:56,760 --> 00:02:01,320
So I'm Stefano, the Angelis, as you said.

25
00:02:01,320 --> 00:02:05,560
I work at Algoranda as a research solution architect.

26
00:02:05,560 --> 00:02:15,880
I'm Italian, so I spent the most part of my life in Italy during my education studies.

27
00:02:15,880 --> 00:02:22,080
And as a background, I studied the engineering of computer science for my bachelor degree

28
00:02:22,080 --> 00:02:25,760
at the University of L'Azapienza in Rome.

29
00:02:25,760 --> 00:02:35,480
So during my first degree, I was also working as a full-stack software engineer in a company

30
00:02:35,480 --> 00:02:37,640
in Rome, in a digital company.

31
00:02:37,640 --> 00:02:43,400
And I was almost doing something like web application developments and also mobile applications

32
00:02:43,400 --> 00:02:48,440
development for Android OS.

33
00:02:48,440 --> 00:02:55,280
So my background was mainly the first programming language I had was Java.

34
00:02:55,280 --> 00:02:56,440
And it was OK.

35
00:02:56,440 --> 00:02:57,440
I liked it.

36
00:02:57,440 --> 00:02:58,600
I enjoyed it.

37
00:02:58,600 --> 00:03:04,680
But at the end of my bachelor, I said, OK, maybe programming is not what I really like

38
00:03:04,680 --> 00:03:07,520
for my future.

39
00:03:07,520 --> 00:03:16,320
So in my master, also down in Rome at the University of L'Azapienza, I decided to deepen

40
00:03:16,320 --> 00:03:20,400
topics like cybersecurity and distributed systems.

41
00:03:20,400 --> 00:03:22,080
Why these two topics?

42
00:03:22,080 --> 00:03:28,040
Well, because at that time, it was the rising of cloud computing.

43
00:03:28,040 --> 00:03:35,360
And cloud computing was quite fascinating for me because I was questioning why these

44
00:03:35,360 --> 00:03:41,560
systems, quite complex, can ensure reliability and full tolerance.

45
00:03:41,560 --> 00:03:48,240
Thanks for application and help people to consume services in such efficient way.

46
00:03:48,240 --> 00:03:51,960
So I was very interested in this stuff.

47
00:03:51,960 --> 00:03:57,480
So I said, let's try to deepen distributed computing.

48
00:03:57,480 --> 00:04:03,000
And eventually, during my master, I came up with an amazing book called Reliable and

49
00:04:03,000 --> 00:04:09,080
Distributed Programming written from the professor Christian Kashin, University of

50
00:04:09,080 --> 00:04:11,640
Bern, Switzerland.

51
00:04:11,640 --> 00:04:17,440
And this book was great because it introduced the problem of distributed systems, the basic

52
00:04:17,440 --> 00:04:22,840
problems of distributed systems, like how messages can be broadcast in a distributed

53
00:04:22,840 --> 00:04:31,280
network, how shared memories works across replicas, and of course, the problem of consensus

54
00:04:31,280 --> 00:04:33,040
protocols.

55
00:04:33,040 --> 00:04:37,560
So this was my first approach to distributed systems.

56
00:04:37,560 --> 00:04:42,200
And that's my main research area.

57
00:04:42,200 --> 00:04:45,520
I will make sure to include the links to the resources.

58
00:04:45,520 --> 00:04:49,640
Well, I think it's a great book.

59
00:04:49,640 --> 00:04:54,000
And going a bit further, a nice reference was Java, by the way.

60
00:04:54,000 --> 00:04:59,080
I think it's a great program language to dive into object-oriented programming.

61
00:04:59,080 --> 00:05:08,040
So you mentioned the pursuing of the degree at the University of Sapienza and the PhD

62
00:05:08,040 --> 00:05:10,960
at the University of Southampton.

63
00:05:10,960 --> 00:05:17,200
How do you think the experience of pursuing the degrees there shaped your academic background,

64
00:05:17,200 --> 00:05:19,720
your future goals and aspirations?

65
00:05:19,720 --> 00:05:23,880
Yeah, let me tell you a big story.

66
00:05:23,880 --> 00:05:33,280
So University of Sapienza is probably one of the largest universities in Europe.

67
00:05:33,280 --> 00:05:35,800
And there is this big faculty of engineering.

68
00:05:35,800 --> 00:05:42,400
It's a very, a new faculty just close to the Colosseum in Rome, full of history, full

69
00:05:42,400 --> 00:05:48,840
of a lot of scientists started there.

70
00:05:48,840 --> 00:05:50,800
Because there is all the faculties of engineering.

71
00:05:50,800 --> 00:05:58,240
So electronics, mechanics engineer, computer science, and all of them.

72
00:05:58,240 --> 00:06:04,040
So I had the chance to improve my bad skills in math there.

73
00:06:04,040 --> 00:06:06,760
So it was quite helpful for me.

74
00:06:06,760 --> 00:06:11,040
But so I had the chance to work with brilliant colleagues and professors.

75
00:06:11,040 --> 00:06:16,040
So I collaborated in this research group of cyber security there.

76
00:06:16,040 --> 00:06:24,720
And from the contacts I had the chance to get there, one of the professors offered me

77
00:06:24,720 --> 00:06:32,280
the possibility to go abroad for my final project of my master's thesis.

78
00:06:32,280 --> 00:06:34,600
I won a scholarship at the time.

79
00:06:34,600 --> 00:06:36,480
So I had the possibility to go abroad.

80
00:06:36,480 --> 00:06:42,200
And the professor I met there in the research group in Rome, he offered me this possibility

81
00:06:42,200 --> 00:06:43,200
with Southampton.

82
00:06:43,200 --> 00:06:48,960
Southampton is a city in the UK, South on the UK.

83
00:06:48,960 --> 00:06:54,080
University of Southampton is quite a good, very big and good university because it's

84
00:06:54,080 --> 00:07:02,800
part of a Russell group, which is a list of 24, the best 24 universities in the UK.

85
00:07:02,800 --> 00:07:08,400
So there I was super excited to join the university.

86
00:07:08,400 --> 00:07:12,480
And then I moved to Southampton.

87
00:07:12,480 --> 00:07:18,520
I joined the cyber security research group in the Southampton University, which is an

88
00:07:18,520 --> 00:07:23,160
academic center of excellence for the research in cyber security.

89
00:07:23,160 --> 00:07:29,640
And there the people working there were interested in distributed computing, security and block

90
00:07:29,640 --> 00:07:31,440
chains.

91
00:07:31,440 --> 00:07:35,640
So during my experience at the University of Southampton, I had the chance to work on

92
00:07:35,640 --> 00:07:43,360
various projects mostly related on cyber security and block chains.

93
00:07:43,360 --> 00:07:49,280
But the fascinating thing is that the university was extremely exposed to institutional and

94
00:07:49,280 --> 00:07:50,440
military sectors.

95
00:07:50,440 --> 00:07:55,280
So I had a research from a theoretical point of view for block chains and disability assistance,

96
00:07:55,280 --> 00:08:00,240
but then I had the chance to put things in practice with practical projects.

97
00:08:00,240 --> 00:08:01,240
I see.

98
00:08:01,240 --> 00:08:02,240
That's very nice.

99
00:08:02,240 --> 00:08:05,840
Oh, sorry.

100
00:08:05,840 --> 00:08:09,560
I thought there's a continuation there.

101
00:08:09,560 --> 00:08:12,520
But that's a very interesting background.

102
00:08:12,520 --> 00:08:22,480
And you're mentioning the field of cyber security, which on one glance, it could be a rather

103
00:08:22,480 --> 00:08:23,960
broad topic to cover.

104
00:08:23,960 --> 00:08:30,440
So let's slowly diverge a bit towards the blockchain space because I'm also curious

105
00:08:30,440 --> 00:08:37,640
on why that particular domain in the field of cyber security was of the most interest

106
00:08:37,640 --> 00:08:38,640
to you.

107
00:08:38,640 --> 00:08:43,000
So perhaps if you could also mention one of the first projects you worked on specifically

108
00:08:43,000 --> 00:08:49,040
in the blockchain space and perhaps aside from cyber security, what was the big fascination

109
00:08:49,040 --> 00:08:51,960
that got you involved with the blockchain specifically?

110
00:08:51,960 --> 00:08:56,520
Yeah, yeah, I definitely agree with you.

111
00:08:56,520 --> 00:09:03,000
I focused on the blockchain space and, you know, as I told you, I moved to Southampton

112
00:09:03,000 --> 00:09:09,120
for the final project of my master thesis, which was related with the University of Rome.

113
00:09:09,120 --> 00:09:15,720
And at that time, I was studying security of consensus protocols proposed for private

114
00:09:15,720 --> 00:09:16,720
block chains.

115
00:09:16,720 --> 00:09:22,160
That was my final project of the master thesis.

116
00:09:22,160 --> 00:09:27,440
So I moved to Southampton because the Southampton group was quite experienced on block chains.

117
00:09:27,440 --> 00:09:36,200
And when I moved, at that time, the research group was working on an European project,

118
00:09:36,200 --> 00:09:38,400
H2020.

119
00:09:38,400 --> 00:09:44,880
They were trying to investigate the adoption of a blockchain technology for implementing

120
00:09:44,880 --> 00:09:49,560
a federated Pan-European cloud infrastructure for the public administration.

121
00:09:49,560 --> 00:09:57,800
So it was a big cloud system composed by all European countries.

122
00:09:57,800 --> 00:10:06,080
And this cloud system was using the blockchain as a shared source of trust for the information.

123
00:10:06,080 --> 00:10:12,440
So in that context, I worked as a research fellow for the design of the blockchain infrastructure

124
00:10:12,440 --> 00:10:14,760
and the application use cases.

125
00:10:14,760 --> 00:10:20,280
So it was the first time I was coming from my background in investigating from the theoretical

126
00:10:20,280 --> 00:10:23,920
point of view the security of interest protocol from blockchain.

127
00:10:23,920 --> 00:10:29,000
But then with this project, I had to put in practice things starting to implement the

128
00:10:29,000 --> 00:10:32,440
infrastructure and smart contracts.

129
00:10:32,440 --> 00:10:33,440
I see.

130
00:10:33,440 --> 00:10:34,440
I see.

131
00:10:34,440 --> 00:10:35,440
Interesting.

132
00:10:35,440 --> 00:10:44,440
And going a little bit, going a little bit further to the topic of Algorand specifically,

133
00:10:44,440 --> 00:10:51,600
because, well, going back to some of the insights from the previous episodes, we know that it's

134
00:10:51,600 --> 00:10:57,800
a rather small space if we look at the total amount of engineering force in blockchain

135
00:10:57,800 --> 00:10:58,800
development in general.

136
00:10:58,800 --> 00:11:02,680
We have up to 300,000 people total.

137
00:11:02,680 --> 00:11:04,480
We have multiple companies.

138
00:11:04,480 --> 00:11:08,720
So there's a vast number of projects to pick.

139
00:11:08,720 --> 00:11:10,920
So why specifically Algorand?

140
00:11:10,920 --> 00:11:12,800
How did you first discover that?

141
00:11:12,800 --> 00:11:17,280
And what about its technology picked your interest?

142
00:11:17,280 --> 00:11:20,960
And, well, made you want to get involved?

143
00:11:20,960 --> 00:11:23,840
Yeah, sure.

144
00:11:23,840 --> 00:11:31,120
So when I moved to Southampton, there was this big European project.

145
00:11:31,120 --> 00:11:36,400
The guys were, to be honest, they were working with Ethereum at the time.

146
00:11:36,400 --> 00:11:43,240
It was late 2017, and they were trying to investigate Ethereum as a platform for the

147
00:11:43,240 --> 00:11:44,240
project.

148
00:11:44,240 --> 00:11:50,680
But Ethereum was too slow, and there was extremely a fit in terms of cost for the infrastructure.

149
00:11:50,680 --> 00:11:53,720
I'm sorry, which year was this?

150
00:11:53,720 --> 00:11:58,640
It was 2017, late 2017.

151
00:11:58,640 --> 00:12:02,640
So at that time, Ethereum wasn't feasible for the project.

152
00:12:02,640 --> 00:12:09,360
So we decided to switch to another platform, which was a private blockchain, Hyperledger

153
00:12:09,360 --> 00:12:14,320
Fabric, because it was more efficient for the project and also more feasible, because

154
00:12:14,320 --> 00:12:20,920
as I told you, we were building a pan-European cloud federation.

155
00:12:20,920 --> 00:12:28,040
So when you build the federation, in that time, the private blockchains were the best

156
00:12:28,040 --> 00:12:29,040
solution.

157
00:12:29,040 --> 00:12:36,000
But after the project, I started my PhD, and I'm talking about early 2018.

158
00:12:36,000 --> 00:12:39,040
The project ended, and I started my PhD.

159
00:12:39,040 --> 00:12:46,240
So once the project finished, I started asking myself the following question.

160
00:12:46,240 --> 00:12:53,120
So why was so difficult to achieve scalability and security for that project?

161
00:12:53,120 --> 00:12:58,560
Why was so difficult to use blockchain with a certain level of performance and security

162
00:12:58,560 --> 00:13:00,440
in that system?

163
00:13:00,440 --> 00:13:06,440
So this was my first research question, and it was quite fascinating at the time.

164
00:13:06,440 --> 00:13:11,440
And then I started comparing the solutions the market was proposing.

165
00:13:11,440 --> 00:13:17,520
So early 2018, there was a plenty of solutions, both in the private blockchains and public

166
00:13:17,520 --> 00:13:21,480
blockchains, but it wasn't well documented either.

167
00:13:21,480 --> 00:13:26,840
So I decided to start studying these platforms and their consensus protocols.

168
00:13:26,840 --> 00:13:34,280
And eventually, I came up with a paper from Silvio Michali that was proposing a solution

169
00:13:34,280 --> 00:13:40,400
on how to solve consensus in large scale decentralized networks.

170
00:13:40,400 --> 00:13:47,000
That paper impressed me a lot, because Silvio proposed an extremely elegant way to solve

171
00:13:47,000 --> 00:13:51,280
consensus in such a large scale network.

172
00:13:51,280 --> 00:13:58,520
So the paper impressed me a lot, and I started playing around with Algorand during my PhD.

173
00:13:58,520 --> 00:14:05,280
Both from a theoretical point of view by investigating the consensus protocol, but also from a practical

174
00:14:05,280 --> 00:14:06,280
point of view.

175
00:14:06,280 --> 00:14:15,120
So Algorand was launched in 2019, and at the time, there was some future that you were

176
00:14:15,120 --> 00:14:19,920
able to experiment with, like smart contracts.

177
00:14:19,920 --> 00:14:26,760
So during my research, I started using Algorand and proposing Algorand also for the projects

178
00:14:26,760 --> 00:14:29,800
my research group were working on.

179
00:14:29,800 --> 00:14:38,400
For example, in early 2020, I proposed Algorand together with my professor at St. Hampton

180
00:14:38,400 --> 00:14:43,480
for a very interesting project with the European Public Administration.

181
00:14:43,480 --> 00:14:49,800
So I was involved with the Ministry of Finance, the Defense and the Social Security Institute.

182
00:14:49,800 --> 00:14:58,200
We were trying to achieve a blockchain system for the public administration in Italy.

183
00:14:58,200 --> 00:15:01,160
And we designed a prototype with Algorand.

184
00:15:01,160 --> 00:15:06,320
And we used Algorand for the infrastructure, but also for the application use cases, like

185
00:15:06,320 --> 00:15:12,520
how can we issue a loan on Algorand's smart contracts, or how can we issue a social security

186
00:15:12,520 --> 00:15:15,080
for citizens on Algorand's smart contracts.

187
00:15:15,080 --> 00:15:23,760
So it was quite in the middle between investigation and application, real application use cases.

188
00:15:23,760 --> 00:15:26,360
Interesting.

189
00:15:26,360 --> 00:15:37,120
And I suppose just to recap for the listeners out here, the interesting highlight from Stefano's

190
00:15:37,120 --> 00:15:44,920
background here is the fact that there was indeed a research that has been done in regards

191
00:15:44,920 --> 00:15:53,280
to finding a good chain and the good consensus that can satisfy a very specific problem, which

192
00:15:53,280 --> 00:15:57,120
in this case was the project that you outlined.

193
00:15:57,120 --> 00:16:04,320
And well, going back to the dates, I suppose this was 2017, there was a big boom in the

194
00:16:04,320 --> 00:16:05,600
crypto market back then.

195
00:16:05,600 --> 00:16:12,200
And lots of chains also specifically showed the limits of their scalability.

196
00:16:12,200 --> 00:16:21,960
But it's interesting to hear that Algorand came in as a theoretically good option for

197
00:16:21,960 --> 00:16:26,800
the implementation, and then it showed to be the case after the practical application

198
00:16:26,800 --> 00:16:29,200
of it.

199
00:16:29,200 --> 00:16:36,800
Any other, perhaps minor things, or like some minor open source project or other projects

200
00:16:36,800 --> 00:16:44,680
within Algorand space that you want to highlight, perhaps you had the chance to work with while

201
00:16:44,680 --> 00:16:46,880
exploring the chain or learning it.

202
00:16:46,880 --> 00:16:53,560
Otherwise, we can essentially dive into the main topic of the talk and talk about the

203
00:16:53,560 --> 00:16:57,640
consensus.

204
00:16:57,640 --> 00:17:03,280
So do you mean before starting Algorand?

205
00:17:03,280 --> 00:17:11,880
No, I mean, aside from the project that you worked on during the research, just if there's

206
00:17:11,880 --> 00:17:17,520
anything else within the Algorand space or Algorand ecosystem that you worked on or contributed

207
00:17:17,520 --> 00:17:22,800
to that you wanted to highlight, otherwise we can proceed.

208
00:17:22,800 --> 00:17:29,040
I'm sure there's a lot of things that you got involved with in the ecosystem as well.

209
00:17:29,040 --> 00:17:36,600
But during my PhD, when I was proposing Algorand as a solution for a project, a research project,

210
00:17:36,600 --> 00:17:43,600
I also jumped in as a developer ambassador with Algorand Foundation.

211
00:17:43,600 --> 00:17:51,000
And I took time for proposing a couple of solutions on the Algorand developer portals.

212
00:17:51,000 --> 00:17:54,600
You can, I think they are still available online.

213
00:17:54,600 --> 00:17:59,720
One solution was quite interesting because it was a tutorial on how to build a private

214
00:17:59,720 --> 00:18:09,760
instance of Algorand, like building a side chain of Algorand in your own on-premise or

215
00:18:09,760 --> 00:18:10,760
on-cloud.

216
00:18:10,760 --> 00:18:11,760
I see.

217
00:18:11,760 --> 00:18:18,680
It was like a big fork of Algorand, although it's quite complex.

218
00:18:18,680 --> 00:18:19,680
Yeah.

219
00:18:19,680 --> 00:18:20,680
Exactly.

220
00:18:20,680 --> 00:18:28,920
It was like a parallel and I built it just for research matters.

221
00:18:28,920 --> 00:18:35,480
And there is a tutorial that explains how to configure the notes, the relay notes, participating

222
00:18:35,480 --> 00:18:39,160
notes for doing that.

223
00:18:39,160 --> 00:18:46,320
So this was quite interesting because I had the chance to really understand both the code

224
00:18:46,320 --> 00:18:53,560
base of Algorand but at the infrastructure level, how you can handle commands, configurations

225
00:18:53,560 --> 00:18:57,400
required for notes communications.

226
00:18:57,400 --> 00:18:58,400
Yeah.

227
00:18:58,400 --> 00:19:06,720
I must agree that that's one of the best practical ways to learn how the Algorand network functions

228
00:19:06,720 --> 00:19:15,280
and also to recall the guests from the previous episode, Patrick Bennett, the CEO of NFD, of

229
00:19:15,280 --> 00:19:17,080
TX and Labs.

230
00:19:17,080 --> 00:19:23,640
One of his ways to learn it was also essentially cloning the Algorand Go implementation and

231
00:19:23,640 --> 00:19:32,840
going over the intricate details of the implementation to learn about the network itself.

232
00:19:32,840 --> 00:19:38,520
So once again, Stefano, thanks for this amazing introduction.

233
00:19:38,520 --> 00:19:42,520
This is certainly setting the stage for your background and experience.

234
00:19:42,520 --> 00:19:50,040
And what I'm mostly interested in during this episode is to talk a little bit about the

235
00:19:50,040 --> 00:19:54,080
brief history of the consensus mechanism.

236
00:19:54,080 --> 00:20:00,280
But before we dive into it, I feel like it might be important to firstly set the stage

237
00:20:00,280 --> 00:20:05,720
for the main definitions for the terminology that we will be using.

238
00:20:05,720 --> 00:20:12,480
So perhaps let's start with the very fundamental questions such as, can you, for example, think

239
00:20:12,480 --> 00:20:22,120
about what consensus protocols are and essentially what role do they play in distributed systems?

240
00:20:22,120 --> 00:20:25,000
Yeah, sure.

241
00:20:25,000 --> 00:20:27,120
That's a good point.

242
00:20:27,120 --> 00:20:34,480
And I think before trying to define a consensus protocol, we should define what a distributed

243
00:20:34,480 --> 00:20:37,760
system is in general.

244
00:20:37,760 --> 00:20:46,440
So a distributed system, you can think about a set of connected processes that cooperate

245
00:20:46,440 --> 00:20:54,520
together for a certain goal or to accomplish a common task.

246
00:20:54,520 --> 00:21:02,640
But from a user standpoint, we are used to have in computer systems a server and a client.

247
00:21:02,640 --> 00:21:07,960
So from the client side, the distributed system is transparent.

248
00:21:07,960 --> 00:21:13,320
So you just send your request to a server and you receive an answer.

249
00:21:13,320 --> 00:21:18,880
And you don't care that under the hood, there is a more complex system composed by several

250
00:21:18,880 --> 00:21:26,000
processes that altogether they coordinate to serve your request and produce an output.

251
00:21:26,000 --> 00:21:30,920
So this is what a distributed system is.

252
00:21:30,920 --> 00:21:32,120
Why we need consensus?

253
00:21:32,120 --> 00:21:34,640
Because we have all these consensus in this last case.

254
00:21:34,640 --> 00:21:41,520
So these processes together need to agree on a consistent state.

255
00:21:41,520 --> 00:21:47,080
And to do so, they need a protocol, which is the consensus protocol.

256
00:21:47,080 --> 00:21:50,240
This protocol is executed in a distributed fashion.

257
00:21:50,240 --> 00:21:53,480
So everyone has the code of the protocol.

258
00:21:53,480 --> 00:22:00,520
And they all know, when I say they are in the nodes of the network, they all know how

259
00:22:00,520 --> 00:22:04,520
to process to execute that protocol.

260
00:22:04,520 --> 00:22:10,280
So the first primitive of a consensus protocol is the atomic broadcast.

261
00:22:10,280 --> 00:22:11,280
What is that?

262
00:22:11,280 --> 00:22:17,680
Well, when you have a consensus protocol, you have these processes that need to communicate

263
00:22:17,680 --> 00:22:18,680
each other.

264
00:22:18,680 --> 00:22:22,760
And to do so, they need an algorithm, a protocol.

265
00:22:22,760 --> 00:22:29,000
And the protocol for doing these communications is the atomic broadcast that guarantees

266
00:22:29,000 --> 00:22:33,080
for all replicas of the network that they are on the same page.

267
00:22:33,080 --> 00:22:35,800
So they have the same set of messages.

268
00:22:35,800 --> 00:22:39,280
And this set of messages is in the same order.

269
00:22:39,280 --> 00:22:44,320
So consensus is quite crucial for complex systems.

270
00:22:44,320 --> 00:22:50,160
We can imagine the importance of consensus when we talk about cloud computing, when we

271
00:22:50,160 --> 00:22:57,440
talk about distributed databases, where consistency can be very important for replicas, and then

272
00:22:57,440 --> 00:23:01,720
blockchains, of course.

273
00:23:01,720 --> 00:23:12,480
When describing distributed systems, there is this theoretical mathematical model, I would

274
00:23:12,480 --> 00:23:22,240
say, in this case, that is often used to build the fundamental notion on how to basically

275
00:23:22,240 --> 00:23:24,080
approach consensus mechanism design.

276
00:23:24,080 --> 00:23:30,320
And what I'm referring to is the state machine replication problem, which I think could be

277
00:23:30,320 --> 00:23:36,000
another good terminology to set up before we dive a bit further.

278
00:23:36,000 --> 00:23:41,640
Just if you could also briefly describe a little bit the state machine replication.

279
00:23:41,640 --> 00:23:49,520
And another important aspect to cover is where is the system that is using the consensus

280
00:23:49,520 --> 00:23:51,040
protocol is operating, right?

281
00:23:51,040 --> 00:23:55,760
When we're talking about networking, in this particular case, there is a set of different

282
00:23:55,760 --> 00:23:58,040
assumptions, right?

283
00:23:58,040 --> 00:24:03,080
There is an ideal theoretical assumptions that could be relaxing a lot of conditions

284
00:24:03,080 --> 00:24:06,000
where we have a fully synchronous network.

285
00:24:06,000 --> 00:24:09,240
Everything is we have some sort of centralized clock.

286
00:24:09,240 --> 00:24:16,360
Then we have more realistic real world application, and they reach most of the modern consensus

287
00:24:16,360 --> 00:24:21,480
operating, which is partially synchronous model, and then we have the asynchronous, which is

288
00:24:21,480 --> 00:24:30,440
something that is proven to be impossible to design a consensus mechanism for.

289
00:24:30,440 --> 00:24:32,880
And well, there are assumptions in the protocol design.

290
00:24:32,880 --> 00:24:39,120
Let's perhaps go over those terms as well.

291
00:24:39,120 --> 00:24:45,880
And there would be one more before we will be able to dive into the proper history.

292
00:24:45,880 --> 00:24:48,880
So yeah, let's start with the state machine replication.

293
00:24:48,880 --> 00:24:54,680
If you could just briefly define the definition of this problem.

294
00:24:54,680 --> 00:24:56,520
Right.

295
00:24:56,520 --> 00:25:00,120
So it's a big question.

296
00:25:00,120 --> 00:25:03,600
Let's try to be as quick as possible.

297
00:25:03,600 --> 00:25:04,600
Sure.

298
00:25:04,600 --> 00:25:10,320
We need to sanitize like 30 years or more of distributed system theory.

299
00:25:10,320 --> 00:25:17,400
So first of all, you mentioned about state machine replication, but what is replication?

300
00:25:17,400 --> 00:25:22,680
So I briefly give you an introduction of what a distributed system is.

301
00:25:22,680 --> 00:25:25,440
So replication is quite a similar concept.

302
00:25:25,440 --> 00:25:32,520
So in traditional computers, client server architecture, we have the server that represents

303
00:25:32,520 --> 00:25:35,520
like the single point of failure.

304
00:25:35,520 --> 00:25:40,880
This means that if the server goes down or there are some outages, your service provided

305
00:25:40,880 --> 00:25:45,360
by the server is not available anymore to the client.

306
00:25:45,360 --> 00:25:53,120
So that's why if we want to provide robustness to our service, we need replication.

307
00:25:53,120 --> 00:25:54,760
So what is replication?

308
00:25:54,760 --> 00:25:58,840
You take the server and you replicate your service across different servers.

309
00:25:58,840 --> 00:26:05,360
In this way, if the server goes down because of foods or because whatever reason, the

310
00:26:05,360 --> 00:26:13,080
system can keep operating correctly and the client can be served from the server side.

311
00:26:13,080 --> 00:26:19,720
So this concept was firstly theorized by Leslie Lamport in the 70s with the concept of state

312
00:26:19,720 --> 00:26:22,520
machine replication.

313
00:26:22,520 --> 00:26:24,280
So what is the state machine replication?

314
00:26:24,280 --> 00:26:31,640
So imagine again our distributed network of processes and each product process is running

315
00:26:31,640 --> 00:26:38,560
a deterministic state machine replicated for overall processes.

316
00:26:38,560 --> 00:26:47,040
In that sense, every process starts in the same state and then executes the request coming

317
00:26:47,040 --> 00:26:52,040
from clients in the same order because of atomic broadcast.

318
00:26:52,040 --> 00:27:00,040
So basically given an input to the state machine replication, all the processes do the computation

319
00:27:00,040 --> 00:27:05,920
and then they produce a single output which is equal for any replica of the state machine.

320
00:27:05,920 --> 00:27:08,600
How can we achieve that?

321
00:27:08,600 --> 00:27:15,240
Well, by running a distributed consensus protocol across replicas.

322
00:27:15,240 --> 00:27:22,560
So state machine replication is quite important in distributed system and it is as much important

323
00:27:22,560 --> 00:27:28,520
as full tolerance because you can have a state machine replicated of the nodes, but if there

324
00:27:28,520 --> 00:27:37,040
is not full tolerance, your operation can be faulty if something goes wrong.

325
00:27:37,040 --> 00:27:44,000
So the full tolerance is the concept of guaranteeing that a process, that operations from the system

326
00:27:44,000 --> 00:27:48,880
are served and are correct even if some process goes down.

327
00:27:48,880 --> 00:27:56,480
And we call this kind of system F fault tolerant in the presence of F faulty nodes.

328
00:27:56,480 --> 00:28:01,520
So if we have F faulty nodes and the system can't tolerate those nodes, we call the system

329
00:28:01,520 --> 00:28:06,080
F fault tolerant.

330
00:28:06,080 --> 00:28:10,520
Of course, you will need a certain amount of harness node to keep the system processing

331
00:28:10,520 --> 00:28:13,760
operations.

332
00:28:13,760 --> 00:28:19,480
So in this background, then we have several considerations as you were mentioning about

333
00:28:19,480 --> 00:28:23,600
the execution of your fault tolerant system.

334
00:28:23,600 --> 00:28:25,360
What we have to consider.

335
00:28:25,360 --> 00:28:30,800
So first of all, the network model because we are in a distributed system and our processes

336
00:28:30,800 --> 00:28:34,280
needs to communicate with each other.

337
00:28:34,280 --> 00:28:38,520
The communication happen across a network.

338
00:28:38,520 --> 00:28:44,800
So in an optimal situation, in a perfect world, the communications are synchronous.

339
00:28:44,800 --> 00:28:51,120
So every time process A sends a message to process B, the message is instantly a synchronously

340
00:28:51,120 --> 00:28:53,880
delivered.

341
00:28:53,880 --> 00:28:56,720
And also the processes can rely on the synchronous clocks.

342
00:28:56,720 --> 00:28:58,720
So everyone has the same view of the time.

343
00:28:58,720 --> 00:29:05,320
And this is great for consensus because it's very easy to achieve consensus in such an

344
00:29:05,320 --> 00:29:06,320
environment.

345
00:29:06,320 --> 00:29:13,040
Sadly, this is unfreezeable in the real world because real networks are asynchronous.

346
00:29:13,040 --> 00:29:18,400
So every replica may have a different perception of time.

347
00:29:18,400 --> 00:29:22,880
At each time, they have a different perception of time.

348
00:29:22,880 --> 00:29:30,360
And indeed, I don't know if you heard about it, but there is this very nice study from

349
00:29:30,360 --> 00:29:39,000
a Fisher, Michael Fisher, stating that it's a theorem called FLP impossibility, which

350
00:29:39,000 --> 00:29:44,840
states that it's impossible in a distributed system to achieve consensus during a certain

351
00:29:44,840 --> 00:29:49,880
time bound if our replicas can make crash.

352
00:29:49,880 --> 00:29:54,080
So in presence of crash faults.

353
00:29:54,080 --> 00:29:59,080
So that's why we cannot model consensus in an asynchronous network.

354
00:29:59,080 --> 00:30:03,840
And that's why the most, as you were mentioning, the most accepted model for network is partially

355
00:30:03,840 --> 00:30:05,520
synchronous network.

356
00:30:05,520 --> 00:30:11,560
A partially synchronous network is basically an asynchronous network that eventually behave

357
00:30:11,560 --> 00:30:12,800
as a synchronous network.

358
00:30:12,800 --> 00:30:17,720
So messages eventually will be delivered to hold the honest note.

359
00:30:17,720 --> 00:30:25,040
And under this assumption, a lot of protocols have been developed.

360
00:30:25,040 --> 00:30:31,080
And for sorry, I just wanted to add some additional information here for the listeners and for

361
00:30:31,080 --> 00:30:37,080
folks who are interested in the practicality of what it means to have this assumption in

362
00:30:37,080 --> 00:30:38,840
a partially synchronous model.

363
00:30:38,840 --> 00:30:44,120
In other words, you could also imagine that the consensus protocol essentially has some

364
00:30:44,120 --> 00:30:50,600
sort of assumption on the time it takes for the message to be delivered.

365
00:30:50,600 --> 00:30:57,280
In other words, setting in certain cases in the simplest example we could pick is, let's

366
00:30:57,280 --> 00:31:03,120
say, picking some constant T that would outline the time in the network under which, let's

367
00:31:03,120 --> 00:31:09,320
say, the network is being under attack or the network is trying to achieve the state

368
00:31:09,320 --> 00:31:12,760
under which it's going to reach a synchronicity.

369
00:31:12,760 --> 00:31:20,440
So this assumption that we're atlining here is mostly referring to the theoretical design

370
00:31:20,440 --> 00:31:25,440
of the consensus protocol because it's very important to outline the environment for which

371
00:31:25,440 --> 00:31:28,520
you are designing this protocol.

372
00:31:28,520 --> 00:31:30,120
But yeah, please go ahead.

373
00:31:30,120 --> 00:31:32,440
Yeah, that's correct.

374
00:31:32,440 --> 00:31:37,680
And it's very important because from there, you can read your protocol.

375
00:31:37,680 --> 00:31:41,920
And the second pillar of the analysis, it's fault.

376
00:31:41,920 --> 00:31:46,480
So your process may fall somehow.

377
00:31:46,480 --> 00:31:50,720
You may be subject to faulty processes.

378
00:31:50,720 --> 00:31:57,680
Or maybe your process can be subverted by adversary and act maliciously against the protocol

379
00:31:57,680 --> 00:32:04,160
or even being disconnected for a certain period, like partitioning of the network.

380
00:32:04,160 --> 00:32:08,520
So all these faults are called by-zantine faults.

381
00:32:08,520 --> 00:32:14,320
And there is this term by-zantine that comes from the by-zantine general problem stipulated

382
00:32:14,320 --> 00:32:20,160
again by the genius of Leslie Lampo.

383
00:32:20,160 --> 00:32:26,520
So the by-zantine general problem, it's an allegory to a distributed system represented

384
00:32:26,520 --> 00:32:33,440
as a set of generals around a village that they want to attack this village.

385
00:32:33,440 --> 00:32:35,040
And they need to decide a strategy.

386
00:32:35,040 --> 00:32:40,720
Either to attack the village instantly or retreat for some reason.

387
00:32:40,720 --> 00:32:51,040
So Leslie Lampo explained to us that you need enough generals to achieve a consensus to either

388
00:32:51,040 --> 00:32:53,520
attack or retreat the village.

389
00:32:53,520 --> 00:32:54,520
And why you need that?

390
00:32:54,520 --> 00:32:59,400
Because this general, to communicate each other, must rely on messengers.

391
00:32:59,400 --> 00:33:04,200
And these messengers, again, can be disconnected or delayed.

392
00:33:04,200 --> 00:33:07,800
And again, the generals itself can be traitors.

393
00:33:07,800 --> 00:33:10,000
So they can cheat again.

394
00:33:10,000 --> 00:33:11,000
Exactly.

395
00:33:11,000 --> 00:33:18,360
So to achieve a consensus, Lampo demonstrated that you cannot solve this problem if you

396
00:33:18,360 --> 00:33:19,880
just have three generals.

397
00:33:19,880 --> 00:33:26,960
Or more in general, sorry for the wording, but for more in general, you need one-third

398
00:33:26,960 --> 00:33:29,240
of generals to be by-zantine.

399
00:33:29,240 --> 00:33:34,640
You cannot have more than one-third of generals to be by-zantine.

400
00:33:34,640 --> 00:33:42,840
And when we say that, let's say, a node experience, they Byzantine fault, essentially, in other

401
00:33:42,840 --> 00:33:49,120
words, we could say that this is an aggregated description for a variety of different errors

402
00:33:49,120 --> 00:33:50,120
that could happen.

403
00:33:50,120 --> 00:33:55,600
Either it's a software failure, hardware failure, and this also includes the possibility that

404
00:33:55,600 --> 00:33:59,880
there is some malicious intent behind the node itself.

405
00:33:59,880 --> 00:34:09,600
So when you hear the term, a Byzantine fault, it is essentially a way for highlighting that

406
00:34:09,600 --> 00:34:14,520
there is this array of very unpredictable behavior that can happen within the node.

407
00:34:14,520 --> 00:34:24,320
So a Byzantine fault or a consensus is usually something that can guarantee a very strong

408
00:34:24,320 --> 00:34:30,920
assumptions on security if it is able to withstand Byzantine faults.

409
00:34:30,920 --> 00:34:38,120
And with this little recap for the listener, so we just covered the essentially main, I

410
00:34:38,120 --> 00:34:43,200
would say, terminology that we would be touching in the next parts.

411
00:34:43,200 --> 00:34:49,360
And Stefano was kind enough to provide the fundamentals here in regards to what consensus

412
00:34:49,360 --> 00:34:51,600
is, what distributed systems are.

413
00:34:51,600 --> 00:34:53,480
We covered the state machine replication.

414
00:34:53,480 --> 00:34:58,040
There was very nice mention of the FLP impossibility as well.

415
00:34:58,040 --> 00:35:03,160
And all of the references to what Stefano is mentioning are going to be provided at

416
00:35:03,160 --> 00:35:05,360
the description of the episode.

417
00:35:05,360 --> 00:35:10,640
FLP stands for official lynch and patterns and disenames of the academicians who came

418
00:35:10,640 --> 00:35:16,280
up with this theorem and perhaps for folks from other software engineering backgrounds.

419
00:35:16,280 --> 00:35:20,920
If you're familiar with CAP theorem, this is something that I would say mirrors the

420
00:35:20,920 --> 00:35:28,320
CAP theorem a little bit, although it comes from a slightly different domain of engineering.

421
00:35:28,320 --> 00:35:33,800
And we also covered the networking models, which is synchronous, partially synchronous,

422
00:35:33,800 --> 00:35:39,840
and asynchronous, as well as the Byzantine generals problem and Byzantine faults.

423
00:35:39,840 --> 00:35:45,720
So I think with this in our baggage, we should be able to appreciate the history of the consensus

424
00:35:45,720 --> 00:35:46,720
now.

425
00:35:46,720 --> 00:35:52,960
So diving into one of the first well-known consensus protocol designs, and you mentioned

426
00:35:52,960 --> 00:36:02,200
the work of Leslie Lamport, who certainly gave a tremendous contribution to this field.

427
00:36:02,200 --> 00:36:13,720
So any specific protocols that you could highlight in the early era of the rise of this domain?

428
00:36:13,720 --> 00:36:22,720
Yeah, so of course, well, the contribution of Lamport was great, but we first need to

429
00:36:22,720 --> 00:36:31,600
divide, let's say, our story of consensus and two big families of consensus protocol.

430
00:36:31,600 --> 00:36:39,160
One is the consensus protocols in the first stage of development, and I call them, well,

431
00:36:39,160 --> 00:36:44,960
they are called crash tolerant consensus protocols, because this protocol are just able to tolerate

432
00:36:44,960 --> 00:36:47,200
crash faults.

433
00:36:47,200 --> 00:36:53,600
So when your replica goes down for any reason.

434
00:36:53,600 --> 00:36:58,200
And then there is the second family of consensus protocol, more complex, which is the Byzantine

435
00:36:58,200 --> 00:37:01,320
fault tolerant protocol that we will see in a minute.

436
00:37:01,320 --> 00:37:08,160
Let's take on crash fault tolerant consensus protocol.

437
00:37:08,160 --> 00:37:14,840
So in that stage, there is this big trust assumption.

438
00:37:14,840 --> 00:37:20,120
I was mentioning about a system of n participants.

439
00:37:20,120 --> 00:37:26,800
With n participants, the assumption is that we can tolerate up to f, which is a theoretical

440
00:37:26,800 --> 00:37:29,880
number, faulty processes.

441
00:37:29,880 --> 00:37:38,240
In crash tolerant protocol, this f is represented as the minority of the whole participants in

442
00:37:38,240 --> 00:37:39,960
the distributed system.

443
00:37:39,960 --> 00:37:48,120
So f must be lower than the half of the participants.

444
00:37:48,120 --> 00:37:55,600
In this context, the first implementation of a consensus protocol, guess who, was proposed

445
00:37:55,600 --> 00:38:01,120
by Leslie Lampert in the 80s.

446
00:38:01,120 --> 00:38:04,640
The protocol was called Paxos.

447
00:38:04,640 --> 00:38:09,000
So Paxos is a very complex protocol, actually.

448
00:38:09,000 --> 00:38:12,280
Let me try to give you just the intuition.

449
00:38:12,280 --> 00:38:21,440
Paxos is the, implements the protocol, defining three main actors, which are proposers, acceptors,

450
00:38:21,440 --> 00:38:22,440
and listeners.

451
00:38:22,440 --> 00:38:29,600
So the proposers are in charge in the system to broadcast the new value, to be agreed by

452
00:38:29,600 --> 00:38:32,240
the network.

453
00:38:32,240 --> 00:38:38,200
So they broadcast the message to acceptors and listeners.

454
00:38:38,200 --> 00:38:46,240
The acceptors collect the information and then send back an acknowledgement to the proposer.

455
00:38:46,240 --> 00:38:52,960
If the proposer recedes a majority of our acknowledgement, so a forum in the network,

456
00:38:52,960 --> 00:38:59,440
he's happy that everyone is on the same page because everyone has the same value to update

457
00:38:59,440 --> 00:39:01,240
it on their system.

458
00:39:01,240 --> 00:39:07,680
And then they can consider altogether the value committed on the storage or whatever

459
00:39:07,680 --> 00:39:11,160
shared memory is.

460
00:39:11,160 --> 00:39:16,560
Presents are just silent nodes that observe the communications.

461
00:39:16,560 --> 00:39:23,320
And eventually, if something is approved by proposer and acceptors, they agree with the

462
00:39:23,320 --> 00:39:24,320
system.

463
00:39:24,320 --> 00:39:31,040
So they decide to update their system to the latest proposed value.

464
00:39:31,040 --> 00:39:39,040
In case we have something like concurrent proposers, the acceptors may decide to update

465
00:39:39,040 --> 00:39:46,600
the storage to the latest proposed version of a value in a certain round.

466
00:39:46,600 --> 00:39:50,240
And this ensures that everyone is on the same page.

467
00:39:50,240 --> 00:39:55,480
Of course, this is quite a strange protocol, well, not strange, but quite complex and was

468
00:39:55,480 --> 00:40:01,680
very difficult for developers to implement it and also for scientists to digest it.

469
00:40:01,680 --> 00:40:09,000
So more recently has been proposed a new version, a simplified version of Paxos, which is called

470
00:40:09,000 --> 00:40:10,000
Raft.

471
00:40:10,000 --> 00:40:17,280
So Raft is a consensus protocol that is great because it's one of the first consensus protocols

472
00:40:17,280 --> 00:40:24,240
introducing the concept of leader in a distributed system.

473
00:40:24,240 --> 00:40:26,240
So how does it work?

474
00:40:26,240 --> 00:40:30,240
So Raft has this leader election.

475
00:40:30,240 --> 00:40:33,120
So the system elects one leader.

476
00:40:33,120 --> 00:40:40,320
And this leader is a trusted authority responsible of saying to the others, what's the next value

477
00:40:40,320 --> 00:40:43,640
to be updated on the shared memory?

478
00:40:43,640 --> 00:40:47,640
And the leader is the main entity exposed to clients as well.

479
00:40:47,640 --> 00:40:53,880
So all the requests coming from clients are concentrated to the leader.

480
00:40:53,880 --> 00:40:59,160
Then the leader decides the order of this request and then decides also which request

481
00:40:59,160 --> 00:41:01,880
to broadcast to the rest of the network.

482
00:41:01,880 --> 00:41:07,680
These replicas are listening for messages from the leader.

483
00:41:07,680 --> 00:41:11,240
Once they arrive, they send back the acknowledgement.

484
00:41:11,240 --> 00:41:17,560
So for each round of the protocol, the leader proposes a new value until the majority of

485
00:41:17,560 --> 00:41:21,600
replicas agree on that value.

486
00:41:21,600 --> 00:41:26,280
So once the leader collects the majority, he can consider the value committed.

487
00:41:26,280 --> 00:41:30,440
And why you always heard about majority, majority.

488
00:41:30,440 --> 00:41:36,760
And the point here is that in crush-tolerant systems, we have this bigger assumption that

489
00:41:36,760 --> 00:41:43,800
we don't have more than a half of replicas faulty.

490
00:41:43,800 --> 00:41:49,360
So in that system, with this assumption, you can be sure that if you receive the majority

491
00:41:49,360 --> 00:41:58,040
of votes, you are probably on the same page on the other honest replicas.

492
00:41:58,040 --> 00:42:02,560
And this is almost the two main implementations for crush-tolerance protocols.

493
00:42:02,560 --> 00:42:09,840
But now we can dive into the more interesting part, like the Byzantine-full-tolerant protocol.

494
00:42:09,840 --> 00:42:15,360
Let me say, spend a few words on why it's very important to have a protocol which is

495
00:42:15,360 --> 00:42:22,800
Byzantine-full-tolerant and why this is very important in a blockchain system.

496
00:42:22,800 --> 00:42:29,280
So in a blockchain system, there is no trust, right, in the network.

497
00:42:29,280 --> 00:42:34,400
Ideally, trustless should be permissionless.

498
00:42:34,400 --> 00:42:35,400
Exactly.

499
00:42:35,400 --> 00:42:39,400
Ideally, it's trustless and permissionless.

500
00:42:39,400 --> 00:42:45,440
Even in permissionless networks, you need a bit of more trust, but still you are in

501
00:42:45,440 --> 00:42:46,600
a decentralized system.

502
00:42:46,600 --> 00:42:50,160
But let's stick on the worst case, right?

503
00:42:50,160 --> 00:42:55,000
So the permissionless network where everyone can join, jump in, jump out.

504
00:42:55,000 --> 00:43:01,920
And also everyone can just download the source code of the protocol, which is open source,

505
00:43:01,920 --> 00:43:04,040
usually.

506
00:43:04,040 --> 00:43:08,120
Change it a bit and connect to the main network.

507
00:43:08,120 --> 00:43:15,000
And when I say change it a bit, it could be like corrupting the code and making things

508
00:43:15,000 --> 00:43:24,080
that can be against the protocol, like producing bad blocks or producing double messages.

509
00:43:24,080 --> 00:43:25,080
Double-spending.

510
00:43:25,080 --> 00:43:29,560
There's an entire array of different attacks that you can perform, yeah.

511
00:43:29,560 --> 00:43:30,960
Exactly.

512
00:43:30,960 --> 00:43:35,120
So in that scenario, it's quite important that the protocol, the underlying protocol

513
00:43:35,120 --> 00:43:40,200
of your network is resilient to possible Byzantine behavior.

514
00:43:40,200 --> 00:43:46,640
And that's why it's very important to adopt something like Byzantine for Torrent protocols.

515
00:43:46,640 --> 00:43:52,680
So going back to the history of the distributed systems, after the work on crush Torrent

516
00:43:52,680 --> 00:44:00,680
protocols, there is this theoretical study showing that there is the possibility of implementing

517
00:44:00,680 --> 00:44:06,440
the general problem of Byzantine for Torrent proposed by Leslie Lamport.

518
00:44:06,440 --> 00:44:14,560
And the way we implemented it is through a protocol called Practical Byzantine for Torrent,

519
00:44:14,560 --> 00:44:21,720
PBFT, proposed in 2001 by two academics, Castro and Lisco.

520
00:44:21,720 --> 00:44:28,640
And this was great because they were able to implement a protocol from the assumptions

521
00:44:28,640 --> 00:44:37,240
proposed by Lamport that in a Byzantine environment, no more than one-third of your network can

522
00:44:37,240 --> 00:44:41,200
be faulty.

523
00:44:41,200 --> 00:44:46,280
And from this assumption, they demonstrated that in a partially synchronous network,

524
00:44:46,280 --> 00:44:48,920
they were able to achieve consensus.

525
00:44:48,920 --> 00:44:54,320
So the PBFT protocol, just for giving you the intuition, works in the following way.

526
00:44:54,320 --> 00:45:03,120
So the idea is that we still have a leader, and then the replicas exchange a lot of messages

527
00:45:03,120 --> 00:45:07,840
trying to achieve a certain degree on a certain value.

528
00:45:07,840 --> 00:45:15,000
So in crush Torrent protocols, we have a few message rounds because we don't have

529
00:45:15,000 --> 00:45:16,000
Byzantine behavior.

530
00:45:16,000 --> 00:45:20,720
So we trust that when we receive the message, this message is trustable.

531
00:45:20,720 --> 00:45:23,400
In this scenario, this is not true.

532
00:45:23,400 --> 00:45:30,640
So once as a replica, I receive a message, I also want to know you all what message received

533
00:45:30,640 --> 00:45:35,920
and also my colleagues what message received before committing that message on my shared

534
00:45:35,920 --> 00:45:36,920
memory.

535
00:45:36,920 --> 00:45:41,280
So that's why we need more rounds of communication.

536
00:45:41,280 --> 00:45:46,680
And to be precise, there are three rounds of communication in which, first of all, the

537
00:45:46,680 --> 00:45:50,000
leader proposed a block to all replicas.

538
00:45:50,000 --> 00:45:57,800
And then all the replicas exchange messages in a certain way to achieve a certain consensus

539
00:45:57,800 --> 00:46:01,080
over the message received.

540
00:46:01,080 --> 00:46:06,200
So you might question me, what happens if there are delays?

541
00:46:06,200 --> 00:46:12,800
So if you are connected and receive some message with a certain delay, you may propose, well,

542
00:46:12,800 --> 00:46:14,680
this leader is not honest anymore.

543
00:46:14,680 --> 00:46:16,360
So I want to change leader.

544
00:46:16,360 --> 00:46:20,920
And that's why I can start a new process of voting, try to change the leader.

545
00:46:20,920 --> 00:46:27,160
And here again, there is a voting process composed by three phases where replicas can

546
00:46:27,160 --> 00:46:31,640
ask the network to change the leader.

547
00:46:31,640 --> 00:46:36,160
But even if there is a partition, nobody can communicate with each other.

548
00:46:36,160 --> 00:46:37,160
What happened in that case?

549
00:46:37,160 --> 00:46:42,200
Well, the protocol has been demonstrated to be resilient to that scenario.

550
00:46:42,200 --> 00:46:47,960
And when I say resilient, it means that the protocol is able to ensure consistency across

551
00:46:47,960 --> 00:46:51,440
replicas, despite blindness.

552
00:46:51,440 --> 00:46:54,080
So your protocol cannot proceed.

553
00:46:54,080 --> 00:47:00,640
We are stuck at a certain round of the protocol, but nobody does anything.

554
00:47:00,640 --> 00:47:02,680
So we've done update our storage.

555
00:47:02,680 --> 00:47:06,560
There is no change on the shared memory or whatever.

556
00:47:06,560 --> 00:47:07,920
So we are safe.

557
00:47:07,920 --> 00:47:11,480
We are all on the same page, and we are stuck here.

558
00:47:11,480 --> 00:47:18,280
Once the connection returns and we have enough honest notes to talk with, then we can start

559
00:47:18,280 --> 00:47:23,640
proceeding the protocol again.

560
00:47:23,640 --> 00:47:29,680
And this is quite a quick introduction of crash-tolerant protocols and the Byzantine

561
00:47:29,680 --> 00:47:31,160
fault-tolerant protocols.

562
00:47:31,160 --> 00:47:32,160
Interesting.

563
00:47:32,160 --> 00:47:41,440
So I think we could essentially, or I could eventually, sketch a little something.

564
00:47:41,440 --> 00:47:43,280
A supplement to this episode.

565
00:47:43,280 --> 00:47:51,160
But in terms of the hierarchy that we've just outlined, so once again, we have the crash-tolerant

566
00:47:51,160 --> 00:47:52,660
protocols.

567
00:47:52,660 --> 00:47:59,120
Those systems, such as Paxos and Praf.net, was later introduced.

568
00:47:59,120 --> 00:48:06,280
A lot of the implementation details there is something that all the big tech companies

569
00:48:06,280 --> 00:48:07,840
are relying on.

570
00:48:07,840 --> 00:48:12,680
Because when we are saying crash-tolerant, this is not necessarily the domain of blockchain

571
00:48:12,680 --> 00:48:17,880
applications, because distributed systems are not necessarily blockchain only.

572
00:48:17,880 --> 00:48:26,400
So a lot of big corporations, such as Google, for example, do rely on things like Raft and

573
00:48:26,400 --> 00:48:28,520
Paxos and some of their offerings.

574
00:48:28,520 --> 00:48:33,520
And essentially, this is something that I would say also powered a lot of infrastructure

575
00:48:33,520 --> 00:48:36,000
in Web2 space in general.

576
00:48:36,000 --> 00:48:44,520
But talking about the Byzantine fault-tolerance, this is an entire different branch of different

577
00:48:44,520 --> 00:48:50,280
problems that specifically deal with, well, Byzantine faults.

578
00:48:50,280 --> 00:48:54,360
Some of the folks who are familiar with a lot of terminology in Web3 and blockchain might

579
00:48:54,360 --> 00:48:59,080
be curious when they are going to hear proof of stake and proof of work.

580
00:48:59,080 --> 00:49:05,400
But I believe this is something that is going to be covered as well.

581
00:49:05,400 --> 00:49:15,160
But aside from Byzantine fault-tolerance, I think if we were to look into that research

582
00:49:15,160 --> 00:49:25,160
area, what do you think about the rise of the blockchain technology introducing perhaps

583
00:49:25,160 --> 00:49:29,960
new ways to look at consensus design in general?

584
00:49:29,960 --> 00:49:35,600
We have, for the longest time, there has been a lot of innovation on consensus protocols

585
00:49:35,600 --> 00:49:38,160
that were based on the Byzantine fault-tolerance.

586
00:49:38,160 --> 00:49:44,760
But talking about things such as proof of work, which in this particular case has this

587
00:49:44,760 --> 00:49:52,880
way of embracing this longest chain rule, which is something that sacrifices on consistency

588
00:49:52,880 --> 00:49:56,680
a little bit, but never sacrifices on availability in this case.

589
00:49:56,680 --> 00:50:06,600
What do you think about the way embracing of the longest chain rules based consensus

590
00:50:06,600 --> 00:50:11,320
protocols, how they affected the blockchain and research space in general?

591
00:50:11,320 --> 00:50:19,360
And do you think this is something that is still a significant contender to more classic

592
00:50:19,360 --> 00:50:23,200
Byzantine fault-tolerant consensus protocols?

593
00:50:23,200 --> 00:50:25,200
Great question.

594
00:50:25,200 --> 00:50:36,040
Let's try to think about why we arrived to proof of work, why we needed proof of work.

595
00:50:36,040 --> 00:50:44,080
Proof of work has been proposed with the paper of Bitcoin, trying to solve a specific issue

596
00:50:44,080 --> 00:50:45,840
in distributed systems.

597
00:50:45,840 --> 00:50:52,160
So we had this theorization of blockchain, which is a decentralized network.

598
00:50:52,160 --> 00:50:58,240
The decentralized network is a special class of distributed system.

599
00:50:58,240 --> 00:51:03,280
Because if you think about distributed system as a cloud infrastructure, you still have

600
00:51:03,280 --> 00:51:09,000
a central authority in control of your distributed system.

601
00:51:09,000 --> 00:51:14,320
So the authority can decide, the number of replicas can decide who is joining.

602
00:51:14,320 --> 00:51:20,040
And so in that situation, a protocol like Paxos, as you were mentioning, is perfect.

603
00:51:20,040 --> 00:51:23,960
It's very efficient and you are happy with that.

604
00:51:23,960 --> 00:51:27,400
But if we think about the decentralized network, it's quite different, right?

605
00:51:27,400 --> 00:51:35,240
Because the scale of your system is very large.

606
00:51:35,240 --> 00:51:41,200
And also everyone can join in or can leave the network as they want.

607
00:51:41,200 --> 00:51:44,160
So in that situation, how do we achieve consensus?

608
00:51:44,160 --> 00:51:46,280
So basically, how do we elect the leader?

609
00:51:46,280 --> 00:51:51,000
So we don't have control because maybe I elect you as a leader, but after I elect you

610
00:51:51,000 --> 00:51:53,320
as a leader, you leave the network.

611
00:51:53,320 --> 00:51:55,720
So there is a problem there.

612
00:51:55,720 --> 00:52:02,040
So in a decentralized system and in blockchain space, the proof of work was great.

613
00:52:02,040 --> 00:52:08,800
Was great because it introduced the concept of randomly select leaders to propose new

614
00:52:08,800 --> 00:52:11,720
blocks, propose new messages.

615
00:52:11,720 --> 00:52:18,640
And the randomness was given by the solving of a mathematical puzzle, which was quite

616
00:52:18,640 --> 00:52:22,680
hard to solve and required a lot of computation, mining.

617
00:52:22,680 --> 00:52:26,000
We all know about mining.

618
00:52:26,000 --> 00:52:33,280
When the solution was found, then the node that found the solution was the proposer of

619
00:52:33,280 --> 00:52:34,280
the block.

620
00:52:34,280 --> 00:52:42,400
So such a puzzle is so difficult to solve that is quite improbable having two nodes,

621
00:52:42,400 --> 00:52:47,520
two miners solving the problem at the same time.

622
00:52:47,520 --> 00:52:52,360
And that's why, imagine we are in a network, I solve the problem with a certain probability,

623
00:52:52,360 --> 00:52:57,520
then I can spread the block I solved throughout the network.

624
00:52:57,520 --> 00:53:06,840
What happened if me and you are lucky guys, and we find the solution at the same time?

625
00:53:06,840 --> 00:53:14,040
So I start proposing the block I solved to my neighborhood and you start proposing the

626
00:53:14,040 --> 00:53:16,840
block you solved to your neighborhood.

627
00:53:16,840 --> 00:53:21,320
And in that case, there is a big issue because the network start to be partitioned, not in

628
00:53:21,320 --> 00:53:27,360
terms of communication, but in terms of consistency of their storage.

629
00:53:27,360 --> 00:53:30,840
And now that we've reached the blockchain, when I talked about storage, I just talked

630
00:53:30,840 --> 00:53:34,520
about the ledger.

631
00:53:34,520 --> 00:53:40,560
So in that case, we need a certain rule that governs our system and then that says to your

632
00:53:40,560 --> 00:53:44,640
neighbors or to my neighbors which block they need to adopt.

633
00:53:44,640 --> 00:53:50,520
In Bitcoin, we have this rule of the longest chain, which is let's keep pushing on the

634
00:53:50,520 --> 00:53:56,360
same ledger until you receive messages.

635
00:53:56,360 --> 00:54:02,760
You can consider a block finalized if there is six blocks in front of him that have been

636
00:54:02,760 --> 00:54:04,600
committed.

637
00:54:04,600 --> 00:54:09,640
Because in that case, it will be quite hard to revert the chain and remove that block

638
00:54:09,640 --> 00:54:14,000
from the chain of the network.

639
00:54:14,000 --> 00:54:16,320
So proof of work is great.

640
00:54:16,320 --> 00:54:21,040
It's great for random election of leaders in this entire system.

641
00:54:21,040 --> 00:54:23,040
The problem is that it was inefficient.

642
00:54:23,040 --> 00:54:26,480
Why is it inefficient?

643
00:54:26,480 --> 00:54:32,560
Because the larger is your network, the more difficult is the puzzle to be solved.

644
00:54:32,560 --> 00:54:41,720
Because if the puzzle is easy to solve, miners will be easier for miners to solve the problem.

645
00:54:41,720 --> 00:54:47,520
And then you might have a lot of concurrency in blockchain.

646
00:54:47,520 --> 00:54:53,080
But if it's so difficult, it will be very slow as well.

647
00:54:53,080 --> 00:54:57,880
I'm not talking about the energy demanding of course of mining process.

648
00:54:57,880 --> 00:55:04,720
So proof of work is great, but it wasn't like a solution for, it's a good solution probably

649
00:55:04,720 --> 00:55:10,840
for Bitcoin, but maybe it's not a good solution for smart contract platforms or for decentralized

650
00:55:10,840 --> 00:55:12,720
computation.

651
00:55:12,720 --> 00:55:19,120
And also there is big issue we were discussing before about that, which is the problem of

652
00:55:19,120 --> 00:55:23,040
proof of work tend to be centralized.

653
00:55:23,040 --> 00:55:24,040
What's the point here?

654
00:55:24,040 --> 00:55:34,760
Well, since solving the mathematical puzzle is quite hard, nodes tend to converge to put

655
00:55:34,760 --> 00:55:40,800
the force together in order to improve the computational power.

656
00:55:40,800 --> 00:55:47,280
And this is the phenomenon of mining pools where people trying to put their forces together

657
00:55:47,280 --> 00:55:54,440
in terms of computation, trying to achieve the proof of work.

658
00:55:54,440 --> 00:56:02,680
And if you have like three mining pools that converge all the mining power of the world,

659
00:56:02,680 --> 00:56:05,200
probably these mining pools are controlling your network.

660
00:56:05,200 --> 00:56:09,680
And you might not be happy to do that for that.

661
00:56:09,680 --> 00:56:17,840
So this is the point of proof of work and that's why proof of stake arrived.

662
00:56:17,840 --> 00:56:23,960
Let's talk a little bit more about proof of stake, but just a quick recap of what you

663
00:56:23,960 --> 00:56:27,080
just said in regards to long machine rule.

664
00:56:27,080 --> 00:56:33,280
So as Stefano just outlined, despite being great innovation in regards to introducing

665
00:56:33,280 --> 00:56:39,400
randomness for the selection of the nodes and having this idea to have an economic

666
00:56:39,400 --> 00:56:45,720
incentive, was certainly an innovative idea for a chain that didn't have a need to also

667
00:56:45,720 --> 00:56:52,040
run during complete computation and smart contracts on it.

668
00:56:52,040 --> 00:56:59,920
But in other words, in real world, after Bitcoin essentially started running for many years,

669
00:56:59,920 --> 00:57:06,840
what a lot of people saw over the years is the centralization of control over different

670
00:57:06,840 --> 00:57:07,840
mining pools.

671
00:57:07,840 --> 00:57:15,400
So essentially, if you put an economic incentive on compute, eventually you will see your network

672
00:57:15,400 --> 00:57:22,920
slowly becoming centralized because there's only so much players that can allocate so

673
00:57:22,920 --> 00:57:26,040
much resources to pay for this compute.

674
00:57:26,040 --> 00:57:30,640
And which leads us to proof of stake, which is an interesting alternative on that.

675
00:57:30,640 --> 00:57:35,520
We covered a lot of the terminology for proof of stake in the previous episode.

676
00:57:35,520 --> 00:57:41,280
So don't be constrained setting up the stage for it.

677
00:57:41,280 --> 00:57:46,960
But yeah, let's talk a bit about proof of stake now and what was the main innovation

678
00:57:46,960 --> 00:57:47,960
there.

679
00:57:47,960 --> 00:57:55,520
And I suppose there has been some things from the proof of work that certainly served as

680
00:57:55,520 --> 00:57:57,520
an inspiration.

681
00:57:57,520 --> 00:57:59,520
Yeah.

682
00:57:59,520 --> 00:58:07,520
So proof of stake replaced the computing power required for leader election in proof of work

683
00:58:07,520 --> 00:58:09,020
with stake.

684
00:58:09,020 --> 00:58:10,020
What does it mean?

685
00:58:10,020 --> 00:58:14,360
Well, we don't have miners anymore here, but we have validators.

686
00:58:14,360 --> 00:58:19,720
So we have these nodes that are in charge of participating in the consensus protocol.

687
00:58:19,720 --> 00:58:25,600
How do we select these nodes is the big problem of proof of stake network.

688
00:58:25,600 --> 00:58:31,320
So first of all, the nodes entitled to be validators are the ones that are willing to

689
00:58:31,320 --> 00:58:40,320
commit some state, some or the worth in terms of stake to the network.

690
00:58:40,320 --> 00:58:41,960
There are some limitations.

691
00:58:41,960 --> 00:58:45,360
The first limitation, I will split the problem in two parts.

692
00:58:45,360 --> 00:58:47,320
The first part is leader election.

693
00:58:47,320 --> 00:58:51,640
So once you have these validators, you need to elect one leader.

694
00:58:51,640 --> 00:58:56,120
And once you elect the leader, you need to achieve consensus on the message the leader

695
00:58:56,120 --> 00:59:00,920
is proposing.

696
00:59:00,920 --> 00:59:05,800
These are two big problems because let's bear in mind that we are in a decentralized network

697
00:59:05,800 --> 00:59:08,200
which is a very large network.

698
00:59:08,200 --> 00:59:17,840
So if we go back to the PBFT, so the Byzantine Fault Tolerant Protocol and implementation,

699
00:59:17,840 --> 00:59:22,040
we know that we can achieve consensus in such an environment.

700
00:59:22,040 --> 00:59:26,640
The problem of Byzantine Fault Tolerant Protocol is that they require a lot of messages to

701
00:59:26,640 --> 00:59:31,240
be exchanged by processes.

702
00:59:31,240 --> 00:59:36,960
So if we are in a decentralized network with thousands, hundreds of thousands of nodes,

703
00:59:36,960 --> 00:59:40,920
probably it will take a lot of time.

704
00:59:40,920 --> 00:59:46,520
So although your validators are committing their stake and they can join the network,

705
00:59:46,520 --> 00:59:52,040
probably it will take ages to achieve consensus protocols.

706
00:59:52,040 --> 00:59:55,040
There are some solutions that have been proposed.

707
00:59:55,040 --> 00:59:57,520
So there are different variants of proof of stake.

708
00:59:57,520 --> 01:00:00,760
The first variant is bounded proof of stake.

709
01:00:00,760 --> 01:00:07,560
In that case, the stakeholders lock their funds and for each fund, they got voting rights.

710
01:00:07,560 --> 01:00:13,800
With these voting rights, they can select validator nodes that can be elected as a leader.

711
01:00:13,800 --> 01:00:19,760
The node with more votes is the leader and this leader will just propose the block to

712
01:00:19,760 --> 01:00:22,280
the network.

713
01:00:22,280 --> 01:00:27,560
An alternative to that, just to reduce the complexity of leader election, is the delegated

714
01:00:27,560 --> 01:00:28,560
proof of stake.

715
01:00:28,560 --> 01:00:32,760
In the delegated proof of stake, you don't have a lot of validators.

716
01:00:32,760 --> 01:00:37,480
You just have a small set of validators, like hundreds of validators, 200 of validators,

717
01:00:37,480 --> 01:00:38,800
or 2,000 validators.

718
01:00:38,800 --> 01:00:40,600
I don't know.

719
01:00:40,600 --> 01:00:44,880
And these validators are known by the network.

720
01:00:44,880 --> 01:00:51,760
Stakeholders may decide to either build its own validator or they decide to commit their

721
01:00:51,760 --> 01:00:55,880
stake to the validator, delegate some stake to a validator.

722
01:00:55,880 --> 01:00:56,880
Why do I want to do that?

723
01:00:56,880 --> 01:01:03,480
Well, because validators with more stake have the possibility to be elected as leader.

724
01:01:03,480 --> 01:01:08,560
And the validator with large stake can propose the blocks and dominate the network.

725
01:01:08,560 --> 01:01:14,240
But maybe it sounds to you something like centralization again, because you are centralized

726
01:01:14,240 --> 01:01:20,840
to fuel the power of many.

727
01:01:20,840 --> 01:01:23,440
So this is probably a trade-off, right?

728
01:01:23,440 --> 01:01:29,920
We need to do that because BFT protocols are so complex in terms of mixed-suggest changes

729
01:01:29,920 --> 01:01:37,200
that we need to reduce the problem in a smaller scale.

730
01:01:37,200 --> 01:01:41,640
So if we think about proof of work, there is the concept of randomness.

731
01:01:41,640 --> 01:01:46,120
And both in bounded proof of stake or delegated proof of stake, I haven't mentioned about

732
01:01:46,120 --> 01:01:52,240
randomness, although some solution are trying to adopt randomness to select leaders.

733
01:01:52,240 --> 01:01:54,320
But I haven't mentioned about it.

734
01:01:54,320 --> 01:01:58,280
In Algorand, there is the pure proof of stake algorithm.

735
01:01:58,280 --> 01:02:00,000
What does it mean, pure proof of stake?

736
01:02:00,000 --> 01:02:07,160
So the idea behind that, so the paper proposed by Silvio, is that solving the problem, the

737
01:02:07,160 --> 01:02:12,520
consensus requires a leader election fair.

738
01:02:12,520 --> 01:02:18,080
A fair leader election means that everyone can join the network and can be elected as

739
01:02:18,080 --> 01:02:19,240
a leader.

740
01:02:19,240 --> 01:02:25,200
And to do so, there is not special validator's committee or something like that.

741
01:02:25,200 --> 01:02:33,720
There is just everyone that have some stake can be elected as a leader.

742
01:02:33,720 --> 01:02:38,560
The likelihood to be elected as leader is proportional to the stake you have in your

743
01:02:38,560 --> 01:02:39,560
wallet.

744
01:02:39,560 --> 01:02:42,920
But you are not committing that stake to any validator.

745
01:02:42,920 --> 01:02:47,720
You are not delegating the stake to others.

746
01:02:47,720 --> 01:02:54,200
And the idea of randomness is that these leaders are elected randomly from the old

747
01:02:54,200 --> 01:02:56,560
stake holders in the network.

748
01:02:56,560 --> 01:03:04,800
The random election is governed by a cryptographic function called the verifiable random function.

749
01:03:04,800 --> 01:03:09,000
And the idea behind that is the same idea of proof of work with mining.

750
01:03:09,000 --> 01:03:13,160
You do some work, some computation, you obtain a result.

751
01:03:13,160 --> 01:03:18,000
This result eventually gives you the winning ticket of a lottery.

752
01:03:18,000 --> 01:03:20,040
With proof of work, we do that with mining.

753
01:03:20,040 --> 01:03:24,400
In pure proof of stake, you do that with verifiable random functions.

754
01:03:24,400 --> 01:03:31,200
You can run this verifiable random function with your stake.

755
01:03:31,200 --> 01:03:34,200
And eventually, you are elected as a leader.

756
01:03:34,200 --> 01:03:39,440
When you are elected as a leader, as in proof of work, you can spread the block you want

757
01:03:39,440 --> 01:03:42,440
to propose to the network.

758
01:03:42,440 --> 01:03:47,720
Another is the second phase of proof of work consensus protocols, which is the agreement

759
01:03:47,720 --> 01:03:49,880
on the message you receive.

760
01:03:49,880 --> 01:03:54,240
So in bounded proof of stake and delegated proof of stake, you just receive the message

761
01:03:54,240 --> 01:03:57,160
from the leader and you commit the message.

762
01:03:57,160 --> 01:04:00,920
Like I trust the leader because I've been elected by the stakeholders.

763
01:04:00,920 --> 01:04:02,760
I will commit the message.

764
01:04:02,760 --> 01:04:08,920
Then if there are two leaders at the same time, probably there will be a fork in your

765
01:04:08,920 --> 01:04:10,120
network.

766
01:04:10,120 --> 01:04:17,720
And we can use the longest chain rule or something like that to eventually agree on a consistent

767
01:04:17,720 --> 01:04:18,720
state.

768
01:04:18,720 --> 01:04:25,320
But in pure proof of stake, what happens is that once the leader is elected randomly with

769
01:04:25,320 --> 01:04:32,440
the verifiable random function, then there is a committee of nodes, again, selected randomly,

770
01:04:32,440 --> 01:04:38,280
that run a traditional Byzantine fault tolerant protocol, as I explained you before.

771
01:04:38,280 --> 01:04:42,000
So with three stages, message exchanges.

772
01:04:42,000 --> 01:04:43,000
What does this?

773
01:04:43,000 --> 01:04:47,600
This committee, what does is just validating and verifying the block received by the leader.

774
01:04:47,600 --> 01:04:53,040
If everyone agrees on that on the block, then the block can be considered committed on the

775
01:04:53,040 --> 01:04:54,200
chain.

776
01:04:54,200 --> 01:04:56,520
And if that's the case, there is no fork, right?

777
01:04:56,520 --> 01:05:04,640
So if the committee accept the block, the block is committed on the blockchain.

778
01:05:04,640 --> 01:05:10,600
And this is probably the best way to achieve consensus in a very large scale network because

779
01:05:10,600 --> 01:05:16,200
you are still reducing the problem, having a large number of nodes, but reducing to a

780
01:05:16,200 --> 01:05:20,840
small committee, but the committee is randomly selected as the leader is.

781
01:05:20,840 --> 01:05:24,680
And we use randomness for that reason.

782
01:05:24,680 --> 01:05:31,760
And I really like the fact that one of the interesting innovations that the pure proof

783
01:05:31,760 --> 01:05:38,200
of stake is also introducing is the fact that the selection of the proposers and the selection

784
01:05:38,200 --> 01:05:46,160
of the members of that committee are happening through the VRF that is actually existing

785
01:05:46,160 --> 01:05:47,160
in the security of the line.

786
01:05:47,160 --> 01:05:49,280
It's a very cheap cryptographic operation.

787
01:05:49,280 --> 01:05:59,920
If you compare it with, for example, well, running your mining pool to massive racks

788
01:05:59,920 --> 01:06:01,760
of GPUs and Isis.

789
01:06:01,760 --> 01:06:07,560
But the beauty of cryptography in this particular case is that it's a O1 constant operation.

790
01:06:07,560 --> 01:06:09,560
You can run it locally.

791
01:06:09,560 --> 01:06:16,880
But then once you actually have the result at this point as outlined in different key

792
01:06:16,880 --> 01:06:22,440
notes that Professor Mikali was also doing, at this point, it's pretty much equivalent

793
01:06:22,440 --> 01:06:25,240
of putting something on WikiLeaks.

794
01:06:25,240 --> 01:06:31,320
We have the Gossip Protocol nodes are using the Gossip Protocol to spread the message.

795
01:06:31,320 --> 01:06:36,840
And once you actually have the result, there's no point to attack it because it's already

796
01:06:36,840 --> 01:06:40,080
spreading over the network.

797
01:06:40,080 --> 01:06:46,560
So I think that aspect of having the ability to cheaply execute it without actually revealing

798
01:06:46,560 --> 01:06:52,120
the result until you have it was also a very interesting application of the pure proof

799
01:06:52,120 --> 01:06:55,240
of stake.

800
01:06:55,240 --> 01:07:00,880
And to continue with that, and once again, thanks for the great coverage here.

801
01:07:00,880 --> 01:07:07,320
I think we indeed did cover a very big branch of different consensus mechanisms.

802
01:07:07,320 --> 01:07:13,640
And I think it's important for the listeners out there to understand the importance of

803
01:07:13,640 --> 01:07:21,520
the terminology and the foundational primitives that are required for you to understand the

804
01:07:21,520 --> 01:07:22,520
consensus.

805
01:07:22,520 --> 01:07:24,800
And I think it's a hard of any blockchain.

806
01:07:24,800 --> 01:07:28,920
So if you're studying, trying to deal with some particular blockchain project, it's

807
01:07:28,920 --> 01:07:29,920
very important.

808
01:07:29,920 --> 01:07:34,160
First, to understand the consensus.

809
01:07:34,160 --> 01:07:40,920
And if we are to continue with it, so we did cover a lot of different innovations that

810
01:07:40,920 --> 01:07:49,120
have been happening ever since the works of Leslie Lampert and Paxos Protocol.

811
01:07:49,120 --> 01:07:58,000
Given your experience in the research, what do you think is the current state in the field

812
01:07:58,000 --> 01:08:00,840
in general, the field of consensus protocol design?

813
01:08:00,840 --> 01:08:05,760
And where do you think the field is headed in the future?

814
01:08:05,760 --> 01:08:10,000
What do you think is the biggest challenge, essentially, for the current state of the

815
01:08:10,000 --> 01:08:13,480
science?

816
01:08:13,480 --> 01:08:22,280
So I think before blockchain, we were in a sort of plateau, so we reached the optimal,

817
01:08:22,280 --> 01:08:29,040
actually, the acceptable level of efficiency in a distributed system.

818
01:08:29,040 --> 01:08:32,720
But then, blockchains arrived with decentralization.

819
01:08:32,720 --> 01:08:35,200
And the problems were a lot.

820
01:08:35,200 --> 01:08:42,120
In terms of efficiency and security, there are a lot of three dots there.

821
01:08:42,120 --> 01:08:45,560
I think with the pure proof of stake, we are in a good stage.

822
01:08:45,560 --> 01:08:51,960
Because if you think about the concept of randomness for electing leaders, that we adopted

823
01:08:51,960 --> 01:08:52,960
a lot of algorithms.

824
01:08:52,960 --> 01:08:56,000
Now, a lot of blockchains are switching to that concept as well.

825
01:08:56,000 --> 01:09:00,600
For example, in Cardano, there is the concept of randomness, but also Polkadot, they are

826
01:09:00,600 --> 01:09:04,880
adopting VRIF for electing leaders.

827
01:09:04,880 --> 01:09:13,200
So we are like in a standard, at least for the blockchain space.

828
01:09:13,200 --> 01:09:19,120
In my opinion, but there is a lot of work to do for the improvement of scalability.

829
01:09:19,120 --> 01:09:22,560
Because networks will be larger and larger.

830
01:09:22,560 --> 01:09:26,400
And it's always more complex to achieve consensus.

831
01:09:26,400 --> 01:09:35,880
Even if we create small committees and we try to simulate the whole network with a random

832
01:09:35,880 --> 01:09:43,640
elected committee, still it will be complex to proceed with the consensus there.

833
01:09:43,640 --> 01:09:52,760
So I think there is a lot of improvement in terms of performance and security tradeoffs.

834
01:09:52,760 --> 01:09:57,960
Of course, the consensus protocols in the future, maybe it will be more performant because

835
01:09:57,960 --> 01:10:05,080
of Algon, for example, we can produce finalized block in 3.7 seconds on average.

836
01:10:05,080 --> 01:10:09,840
So I think there is a lot of improvement there as well.

837
01:10:09,840 --> 01:10:15,760
And another field that I think is very interesting is the cross-chain communication.

838
01:10:15,760 --> 01:10:22,280
So I imagine that in the future, we might have an ecosystem of blockchains.

839
01:10:22,280 --> 01:10:27,680
There is some work on that, but something more smooth, more flexible, something that

840
01:10:27,680 --> 01:10:33,880
can be very easy to use in terms of, I don't know, I like the toolkit proposed by Halgron,

841
01:10:33,880 --> 01:10:36,640
but I also want to use something on Cardano.

842
01:10:36,640 --> 01:10:38,760
And these are completely two different ecosystems.

843
01:10:38,760 --> 01:10:44,160
Now we are trying to implement bridges and in Algon, we are doing well with state groups.

844
01:10:44,160 --> 01:10:47,160
But I think there is a lot of improvement from the consensus part.

845
01:10:47,160 --> 01:10:54,880
So what if I have a protocol that enables direct communications, enabling consensus

846
01:10:54,880 --> 01:10:58,000
between different worlds, let me say.

847
01:10:58,000 --> 01:11:02,080
So the word of Algon and the word of another chain.

848
01:11:02,080 --> 01:11:04,080
What if I do have a consensus there?

849
01:11:04,080 --> 01:11:05,720
How can I achieve the consensus there?

850
01:11:05,720 --> 01:11:08,400
What's the complexity of the consensus there?

851
01:11:08,400 --> 01:11:14,520
I think this is a good point for the research field in the future.

852
01:11:14,520 --> 01:11:21,560
Yeah, I would definitely agree that I also personally think that there is going to be

853
01:11:21,560 --> 01:11:28,840
a lot more research and of course practical applications in collaboration in this space.

854
01:11:28,840 --> 01:11:36,560
Because I think that's the big factor for succeeding and for the blockchains who essentially

855
01:11:36,560 --> 01:11:41,640
reach the adoption levels that can provide a lot of value.

856
01:11:41,640 --> 01:11:44,800
And it's not necessarily just competing with each other.

857
01:11:44,800 --> 01:11:52,800
It's setting the good standard for the competition by actually collaborating on standards, collaborating

858
01:11:52,800 --> 01:11:57,600
on protocols that will unblock cross-chain collaboration.

859
01:11:57,600 --> 01:12:03,720
And certainly great efforts being done by Algrand with the announcements this year done

860
01:12:03,720 --> 01:12:08,760
in regards to the state proofs as well as state proofs allowing for the post-quantum

861
01:12:08,760 --> 01:12:11,640
security.

862
01:12:11,640 --> 01:12:20,480
So with that, I think this has been pretty much a great coverage of a, hopefully we didn't

863
01:12:20,480 --> 01:12:24,760
board some of our listeners by diving too deep in some of the aspects, but I can assure

864
01:12:24,760 --> 01:12:27,600
you this is as high level as we can go.

865
01:12:27,600 --> 01:12:35,680
And essentially this should ideally spark some interest in regards to the variety of

866
01:12:35,680 --> 01:12:41,120
different things that are happening in this particular space at the moment.

867
01:12:41,120 --> 01:12:47,240
Before we proceed to the final question, I wanted to highlight one question that was

868
01:12:47,240 --> 01:12:49,600
asked on the Ask Osomalgo.

869
01:12:49,600 --> 01:12:56,040
This is a platform where listeners can essentially ask questions by submitting questions on the

870
01:12:56,040 --> 01:12:57,400
Algrand blockchain.

871
01:12:57,400 --> 01:13:05,160
So there was one particular question I wanted to ask, which is, let me read it right now

872
01:13:05,160 --> 01:13:06,160
actually.

873
01:13:06,160 --> 01:13:12,480
So what are other prominent alternatives to BFT and longest chain rule based consensus

874
01:13:12,480 --> 01:13:16,000
protocols that are popular these days in your opinion?

875
01:13:16,000 --> 01:13:22,040
So we mentioned BFT, we mentioned chains that based on longest chain rule, but I believe

876
01:13:22,040 --> 01:13:29,520
there are some particular exotic consensus protocols being designed and researched as

877
01:13:29,520 --> 01:13:31,680
well.

878
01:13:31,680 --> 01:13:36,840
Any particular protocols that you can highlight that are, you know, could be in the minority,

879
01:13:36,840 --> 01:13:44,240
but in fact are complete, you know, alternatives to what we've just outlined in the discussion.

880
01:13:44,240 --> 01:13:52,680
Well, something that is, we discussed about different approaches to consensus in blockchain

881
01:13:52,680 --> 01:13:59,720
when it was about proof of work and proof of stake.

882
01:13:59,720 --> 01:14:07,000
In particular, for proof of stake, we have this concept of leader election with the run

883
01:14:07,000 --> 01:14:16,160
on SRO with other kind of models, and in that part, we do have protocols like Polkadot

884
01:14:16,160 --> 01:14:21,280
that are Ethereum as well with the merge that they are adopting proof of stake.

885
01:14:21,280 --> 01:14:24,640
They have a sort of BFT protocol to achieve consensus.

886
01:14:24,640 --> 01:14:30,160
And then they also adopt the longest chain rule because they might have forks in the network.

887
01:14:30,160 --> 01:14:36,080
And so they need a way to reconfigure, to converge to the same state.

888
01:14:36,080 --> 01:14:42,600
And there is this protocol called GOST protocol that enable, let's say the model, the longest

889
01:14:42,600 --> 01:14:48,640
chain rule adopted in Bitcoin, but in a proof of stake environment.

890
01:14:48,640 --> 01:14:53,320
But still we are talking about longest chain, we are talking about BFT, something that was

891
01:14:53,320 --> 01:14:57,960
very different from my experience.

892
01:14:57,960 --> 01:15:02,920
And it's fascinating, I admit, it's the protocol behind Avalance.

893
01:15:02,920 --> 01:15:07,960
Avalance is a layer one blockchain like others I mentioned here.

894
01:15:07,960 --> 01:15:13,040
And they are trying to propose something new, very efficient as well, because they propose

895
01:15:13,040 --> 01:15:21,120
a protocol that probabilistically accept a consistent state across the replicas just

896
01:15:21,120 --> 01:15:27,880
by observing a local view of transactions.

897
01:15:27,880 --> 01:15:39,880
So just the intuition, intuition is that if I'm a node and I receive transactions, then

898
01:15:39,880 --> 01:15:45,840
with a certain probability, there is a threshold that tells me that after the threshold, I

899
01:15:45,840 --> 01:15:51,480
can consider the transaction to be accepted by the rest of the network and finalized.

900
01:15:51,480 --> 01:15:56,280
It's a sort of Byzantine-Tofoltarian protocol and it's a sort of longest chain rule.

901
01:15:56,280 --> 01:16:01,000
But with a different approach based on direct acyclic graphs.

902
01:16:01,000 --> 01:16:04,960
So it's quite an interesting protocol as well.

903
01:16:04,960 --> 01:16:06,640
I see.

904
01:16:06,640 --> 01:16:13,440
Well, I will certainly make sure to highlight that particular answer to the person who asked

905
01:16:13,440 --> 01:16:15,360
this question.

906
01:16:15,360 --> 01:16:21,960
To continue on the final part, this is something that is being asked to all of the guests of

907
01:16:21,960 --> 01:16:30,280
this podcast, but I usually tend to ask on some listeners of this podcast are aspiring

908
01:16:30,280 --> 01:16:33,880
engineers who may not be familiar with the blockchain development but would like to get

909
01:16:33,880 --> 01:16:36,560
into.

910
01:16:36,560 --> 01:16:42,560
Is there any advice that you can give for aspiring engineers who want to try their hands-on blockchain

911
01:16:42,560 --> 01:16:50,040
development on Algorand or get into, let's say, Web3 space in general?

912
01:16:50,040 --> 01:16:57,760
So, I hope I will differentiate my answer from the others.

913
01:16:57,760 --> 01:17:04,560
So, what I really like about blockchain, it's a fascinating industry, right?

914
01:17:04,560 --> 01:17:09,640
And what I really like is that the technological foundation of blockchain is very, there is

915
01:17:09,640 --> 01:17:12,280
a lot of topics that involve blockchain.

916
01:17:12,280 --> 01:17:18,320
So we talk about cryptography, we talk about distributed systems, we talk about telecommunications,

917
01:17:18,320 --> 01:17:21,360
the economy, a lot of things.

918
01:17:21,360 --> 01:17:29,400
So almost everyone can jump in and start being involved in the ecosystem.

919
01:17:29,400 --> 01:17:35,040
Apart from that, from a technological point of view for engineers, there is much more

920
01:17:35,040 --> 01:17:36,200
we can do, right?

921
01:17:36,200 --> 01:17:41,560
Because there are toolkits, there are SDKs, you can start developing smart contracts very

922
01:17:41,560 --> 01:17:44,440
easily because the tools are improving a lot.

923
01:17:44,440 --> 01:17:48,880
For example, in Algorand, if you want to be involved, it's very easy because there is

924
01:17:48,880 --> 01:17:53,840
plenty of resources on the Algorand developer portal.

925
01:17:53,840 --> 01:17:55,040
Get to start the tutorial.

926
01:17:55,040 --> 01:18:00,520
I mentioned about the contributions from the developer ambassadors there with a lot of

927
01:18:00,520 --> 01:18:06,240
examples of real use cases that you can build with Algorand using Algorand smart contracts,

928
01:18:06,240 --> 01:18:11,320
using standard assets, and all the technology that Algorand proposed.

929
01:18:11,320 --> 01:18:16,920
And it's quite straightforward because if you're an engineer and you are a software engineer,

930
01:18:16,920 --> 01:18:19,160
you go there, you start playing a bit.

931
01:18:19,160 --> 01:18:25,040
So my suggestion is try to understand the fundamentals of the technology, try to understand what you

932
01:18:25,040 --> 01:18:31,080
are doing because we are changing the paradigm here from traditional client-server applications

933
01:18:31,080 --> 01:18:33,360
to decentralized applications.

934
01:18:33,360 --> 01:18:38,480
And then try to develop things using the resources you can find online.

935
01:18:38,480 --> 01:18:45,000
And get in touch if you want because I'm more than happy to help everyone.

936
01:18:45,000 --> 01:18:46,000
Awesome.

937
01:18:46,000 --> 01:18:52,440
Well, Stefano, I think this has been a great episode.

938
01:18:52,440 --> 01:18:54,600
We did cover a lot of topics.

939
01:18:54,600 --> 01:19:02,480
I hope that this episode is going to highlight some of the aspects from the, I would say,

940
01:19:02,480 --> 01:19:09,240
theoretical angle that is not something that is brought up too often in the conversations

941
01:19:09,240 --> 01:19:18,480
because most of the time it's a lot of industrial, I would say, angles in regards to explaining

942
01:19:18,480 --> 01:19:25,600
what blockchain is as well as more, I would say, practical or development-specific angles.

943
01:19:25,600 --> 01:19:36,720
But this was essentially to show that a lot of work that has been done is built on very

944
01:19:36,720 --> 01:19:42,400
strong theoretical background that started in the 80s essentially since the branching

945
01:19:42,400 --> 01:19:47,080
out of the distributed system protocol such as PAXOS.

946
01:19:47,080 --> 01:19:53,400
And with that, thank you for listening and I will see you in the next episode.

947
01:19:53,400 --> 01:20:04,600
Thank you.

