An implementation of the `consensus` microservice for CITA-Cloud, based on raft-rs.
```
docker build -t citacloud/consensus_raft .
```
```
$ consensus -h
consensus 6.7.0
Rivtower Technologies <contact@rivtower.com>

Usage: consensus [COMMAND]

Commands:
  run   run the service
  help  Print this message or the help of the given subcommand(s)

Options:
  -h, --help     Print help
  -V, --version  Print version
```
Run the `consensus` service:
```
$ consensus run -h
consensus-run
run the service

USAGE:
    consensus run [OPTIONS]

OPTIONS:
    -c, --config <config>                  the consensus config [default: config.toml]
    -d, --log-dir <log-dir>                the log dir. Overrides the config
    -f, --log-file-name <log-file-name>    the log file name. Overrides the config
    -h, --help                             Print help information
        --stdout                           if specified, log to stdout. Overrides the config
```
Parameters:

- `config`: the microservice configuration file, defaulting to `config.toml`. See the example `example/config.toml`, where:
  - `controller_port` is the listen port of the gRPC service of the `controller` microservice this service depends on.
  - `grpc_listen_port` is the listen port of this microservice's own gRPC service.
  - `network_port` is the listen port of the gRPC service of the `network` microservice this service depends on.
  - `node_addr` is the path of this node's address file.
- `log-dir`: the log output directory.
- `log-file-name`: the log file name.
- `--stdout`: without this flag, logs are written to the log file; with it, logs go to standard output.
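As a purely hypothetical illustration of these fields (the `[consensus_raft]` table name and the `grpc_listen_port` / `node_addr` values are assumptions here; consult `example/config.toml` for the authoritative layout):

```toml
# Hypothetical illustration only -- see example/config.toml for the real file.
[consensus_raft]
controller_port  = 50004          # gRPC port of the local controller microservice
grpc_listen_port = 50001          # gRPC listen port of this consensus microservice
network_port     = 50000          # gRPC port of the local network microservice
node_addr        = "node_address" # path to this node's address file
```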
Logging to a file:
```
$ consensus run -c example/config.toml -d . -f consensus.log
$ cat consensus.log
Mar 14 08:32:55.131 INFO controller grpc addr: http://localhost:50004, tag: controller, module: consensus::client:45
Mar 14 08:32:55.131 INFO network grpc addr: http://localhost:50000, tag: network, module: consensus::client:167
Mar 14 08:32:55.131 INFO registering network msg handler..., tag: network, module: consensus::client:191
```
Logging to standard output:
```
$ consensus run -c example/config.toml --stdout
Mar 14 08:34:00.124 INFO controller grpc addr: http://localhost:50004, tag: controller, module: consensus::client:45
Mar 14 08:34:00.125 INFO network grpc addr: http://localhost:50000, tag: network, module: consensus::client:167
Mar 14 08:34:00.125 INFO registering network msg handler..., tag: network, module: consensus::client:191
```
Please check the `ConsensusService` and `Consensus2ControllerService` in `cita_cloud_proto`, which define the interfaces that `consensus` should implement.
The main workflow of the consensus service is as follows:

- Get a proposal, either from the local controller or from a remote consensus peer.
- If the proposal comes from a peer, ask the local controller to check it first.
- Achieve consensus over the given proposal.
- Commit the proposal with its proof to the local controller.

The proof, for example, is the nonce for PoW consensus; it is empty for a non-byzantine consensus like this raft implementation. Peers' controllers will use it later to validate the corresponding block when they sync missing blocks from other nodes.
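The steps above can be sketched with toy types. All names here are hypothetical stand-ins, not the actual `cita_cloud_proto` definitions:

```rust
// Illustrative sketch of the proposal workflow; the real service implements
// gRPC interfaces generated from cita_cloud_proto.

#[derive(Clone, PartialEq, Debug)]
pub struct Proposal(pub Vec<u8>);

pub enum Source {
    LocalController,
    Peer,
}

/// Hypothetical stand-in for the local controller's interface.
pub trait Controller {
    fn check_proposal(&self, p: &Proposal) -> bool;
    fn commit_block(&mut self, p: Proposal, proof: Vec<u8>);
}

/// Drive one proposal through the workflow described above.
pub fn handle_proposal<C: Controller>(
    ctl: &mut C,
    proposal: Proposal,
    source: Source,
) -> bool {
    // Proposals coming from peers must pass the local controller's check.
    if matches!(source, Source::Peer) && !ctl.check_proposal(&proposal) {
        return false;
    }
    // Consensus over the proposal happens here (elided: raft replication).
    // Commit with its proof; raft is non-byzantine, so the proof is empty.
    ctl.commit_block(proposal, Vec::new());
    true
}
```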
To communicate with other peers, you need to:

- Implement the `NetworkMsgHandlerService`, which handles the messages from peers.
- Register your service to the network by `RegisterNetworkMsgHandler`, which tells the network to forward the messages you are concerned about.

After all of that, you can send your messages to others by `SendMsg` or `Broadcast`, provided by the network service.
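The registration-and-forwarding contract can be modeled with toy in-process types (these are illustrative only, not the real gRPC API):

```rust
use std::collections::HashMap;

// Toy stand-in for the network service: handlers register under a message
// type, and the network forwards only the types that were registered for.
type Handler = Box<dyn Fn(&[u8])>;

#[derive(Default)]
struct Network {
    handlers: HashMap<String, Vec<Handler>>,
}

impl Network {
    // Analogue of RegisterNetworkMsgHandler: declare which message type
    // this service is concerned about.
    fn register(&mut self, msg_type: &str, handler: Handler) {
        self.handlers
            .entry(msg_type.to_string())
            .or_default()
            .push(handler);
    }

    // Analogue of Broadcast: deliver the payload to every registered
    // handler; returns how many handlers received it.
    fn broadcast(&self, msg_type: &str, payload: &[u8]) -> usize {
        match self.handlers.get(msg_type) {
            Some(hs) => {
                for h in hs {
                    h(payload);
                }
                hs.len()
            }
            None => 0, // messages of unregistered types are dropped
        }
    }
}
```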
raft-rs provides the core Consensus Module; the other components, including the Log, the State Machine, and the Transport, must be implemented by the application.
- Storage

  `RaftStorage` is implemented based on the `Storage` trait:

  ```rust
  impl Storage for RaftStorage {
      fn initial_state(&self) -> raft::Result<RaftState> {
          Ok(self.initial_state())
      }

      fn first_index(&self) -> raft::Result<u64> {
          Ok(self.first_index())
      }

      fn last_index(&self) -> raft::Result<u64> {
          Ok(self.last_index())
      }

      fn term(&self, idx: u64) -> raft::Result<u64> {
          self.term(idx)
      }

      fn entries(
          &self,
          low: u64,
          high: u64,
          max_size: impl Into<Option<u64>>,
      ) -> raft::Result<Vec<Entry>> {
          self.entries(low, high, max_size)
      }

      fn snapshot(&self, request_index: u64) -> raft::Result<Snapshot> {
          self.snapshot(request_index)
      }
  }
  ```
- Log and State Machine

  Raft works as shown in the figure below. Raft's model is a state machine replicated via a Log. A client sends a write to the Leader; the Leader appends the operation to its Log and replicates it to all Followers; once more than half of the nodes have acknowledged it, the operation can be applied to the State Machine. Replicating the Log keeps its order identical on all nodes, which in turn keeps the State Machine's data state consistent everywhere.

  As data accumulates, the Log keeps growing, so practical applications compact it at appropriate times: take a snapshot of the current State Machine state, use it as the new baseline for the application data, and start logging afresh. In a typical Raft application the Log entries are light while the State Machine data is heavy, so snapshots are expensive and should not be taken frequently.

  This implementation, however, is the consensus module of a blockchain system, and its focus is on using Raft's Consensus Module. The State Machine data is the `ConsensusConfig`, which is not the real blockchain state but only serves the operation of the Consensus Module, while the Log data is the Proposals, which are by comparison quite heavy. Exploiting this characteristic together with the principle of log compaction, the approach here is: after each Proposal is applied, snapshot the State Machine state and save it locally, and keep pruning the already-applied Proposals. Both data-state consistency (a peer that cannot find an entry in the Log is synced via snapshot) and state recovery after a restart (from the locally saved snapshot) are thus achieved through snapshots.
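The snapshot-on-every-apply strategy described above can be sketched with toy types (in the real service the state is the `ConsensusConfig` and the snapshot is persisted to disk; everything here is illustrative):

```rust
// Toy model of the strategy: after each applied proposal the state machine
// is snapshotted and the applied log entries are discarded, so both restart
// recovery and lagging peers rely on the snapshot rather than the log.

#[derive(Clone, Default, PartialEq, Debug)]
struct State {
    // stand-in for the light ConsensusConfig data
    applied_index: u64,
}

#[derive(Default)]
struct Node {
    log: Vec<Vec<u8>>, // pending proposals (the heavy data)
    state: State,      // in-memory state machine
    snapshot: State,   // last locally saved snapshot
}

impl Node {
    fn propose(&mut self, proposal: Vec<u8>) {
        self.log.push(proposal);
    }

    // Apply all pending proposals; snapshot after every apply (cheap,
    // because State is light) and clear the applied entries.
    fn apply_pending(&mut self) {
        for _proposal in self.log.drain(..) {
            self.state.applied_index += 1;
            self.snapshot = self.state.clone();
        }
    }

    // Restart recovery: restore the state machine from the snapshot.
    fn recover(&mut self) {
        self.state = self.snapshot.clone();
    }
}
```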
- Transport

  This capability is provided by the `network` microservice.
- The `handle ready` step of the run flow is implemented following the raft-rs documentation.
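The ready-handling steps from the raft-rs documentation can be sketched with toy types (the real loop uses `RawNode::has_ready`, `ready`, and `advance` from the raft crate; none of the names below are the actual API):

```rust
// Toy model of the raft-rs ready-handling loop. The steps mirror the
// documented order: send messages, persist entries, apply committed
// entries, then advance.

#[derive(Default)]
struct Ready {
    messages: Vec<String>,          // outgoing messages to peers
    entries: Vec<String>,           // new log entries to persist
    committed_entries: Vec<String>, // entries to apply to the state machine
}

#[derive(Default)]
struct ToyNode {
    storage: Vec<String>, // stable storage for log entries
    applied: Vec<String>, // entries applied to the state machine
    outbox: Vec<String>,  // messages handed to the transport
    pending: Option<Ready>,
}

impl ToyNode {
    fn has_ready(&self) -> bool {
        self.pending.is_some()
    }

    fn handle_ready(&mut self) {
        if let Some(ready) = self.pending.take() {
            // 1. Send outgoing messages to peers (via the network service).
            self.outbox.extend(ready.messages);
            // 2. Persist new log entries to stable storage.
            self.storage.extend(ready.entries);
            // 3. Apply committed entries to the state machine.
            self.applied.extend(ready.committed_entries);
            // 4. Advance: mark this Ready as processed (in raft-rs this is
            //    RawNode::advance); taking `pending` above already did so.
        }
    }
}
```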

