-
Notifications
You must be signed in to change notification settings - Fork 440
RATIS-1860. Add ratis-shell cmd to generate a new raft-meta.conf. #901
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RATIS-1860. Add ratis-shell cmd to generate a new raft-meta.conf. #901
Conversation
e1f47d5 to
672d5f9
Compare
|
@codings-dan Can you help me review this PR? Thanks! |
codings-dan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for working on this! Left some comment
| List<RaftProtos.RaftPeerProto> raftPeerProtos = new ArrayList<>(); | ||
| for (String peer : peersStr.split(",")) { | ||
| String[] peerInfos = peer.split(":"); | ||
| String peerId = peerInfos[0] + "_" + peerInfos[1]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
final InetSocketAddress serverAddress = parseInetSocketAddress(address);
final String peerId = RaftUtils.getPeerId(serverAddress);
| @Override | ||
| public int run(CommandLine cl) throws IOException { | ||
| String peersStr = cl.getOptionValue(PEER_OPTION_NAME); | ||
| String path = cl.getOptionValue(PATH_OPTION_NAME); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
peersStr and path need check not null.
| import java.util.ArrayList; | ||
| import java.util.List; | ||
|
|
||
| public class RaftMetaConfCommand extends AbstractRatisCommand { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about use GroupCommand as the parent command? BTW, please add some necessary notes.
codings-dan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update! Could you add some doc about this command? see https://github.com/apache/ratis/blob/master/ratis-docs/src/site/markdown/cli.md
@codings-dan Thanks for your review, i have updated the cli.md. |
codings-dan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the change looks good, +1
|
@szetszwo Could you help take a look at the pull request? Thanks in advance |
|
@codings-dan , sure, will review this. |
szetszwo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chunxiaozheng , thanks a lot for working on this!
The new command here is very different from the other existing commands since it does not connect the servers. It just reads and writes the raft-meta file.
We have a raft log tool similar to this in the ratis-tools module. How about moving this to ratis-tools?
|
@szetszwo Thanks for your review! But I have a question, if we put this command in the ratis-tool module, how should we use it? |
|
@chunxiaozheng , do you mean how to run it? We may run it directly with the class (or using maven). Another option is to add a new parent command, say |
@szetszwo Thanks! Adding a local command is a good idea, which makes it very easy to use this tool, and I have already followed this approach to modify PR. |
| * Command for generate a new raft-meta.conf file based on original raft-meta.conf and new peers, | ||
| * which is used to move a raft node to a new node. | ||
| */ | ||
| public class RaftMetaConfCommand extends AbstractRatisCommand { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chunxiaozheng , thanks a lot for the update. If we implement a local command, we should not extend AbstractRatisCommand since it will connect to remote servers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@szetszwo Thanks for your review! I have updated this PR.
In addition, AbstractParentCommand should not extends AbstractRatisCommand too, since the commands that extends AbstractParentCommand means that it is a parent command, it does not need to connect to remote servers too.
szetszwo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update. Please see the comments inlined and also https://issues.apache.org/jira/secure/attachment/13062287/901_review.patch
| public abstract class AbstractCommand implements Command { | ||
|
|
||
| public static InetSocketAddress parseInetSocketAddress(String address) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's also move the Context and PrintStream related code from AbstractRatisCommand to AbstractCommand.
| public abstract class AbstractParentCommand implements Command { | ||
| private final Map<String, Command> subs; | ||
|
|
||
| public AbstractParentCommand(Context context, List<Function<Context, AbstractRatisCommand>> subCommandConstructors) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's change the Function to Function<Context, Command>. Then, the new LocalCommand can also use it.
| String peersStr = cl.getOptionValue(PEER_OPTION_NAME); | ||
| String path = cl.getOptionValue(PATH_OPTION_NAME); | ||
| if (peersStr == null || path == null || peersStr.isEmpty() || path.isEmpty()) { | ||
| System.out.println("peers or path can't be empty."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should use the PrintStream from Context instead of System.out.
| import org.apache.commons.cli.CommandLine; | ||
| import org.apache.commons.cli.Option; | ||
| import org.apache.commons.cli.Options; | ||
| import org.apache.ratis.proto.RaftProtos; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's import the inner classes. Then, the code is easier to read.
import org.apache.ratis.proto.RaftProtos.LogEntryProto;
import org.apache.ratis.proto.RaftProtos.RaftConfigurationProto;
import org.apache.ratis.proto.RaftProtos.RaftPeerProto;
import org.apache.ratis.proto.RaftProtos.RaftPeerRole;| try (InputStream in = new FileInputStream(new File(path, RAFT_META_CONF)); | ||
| OutputStream out = new FileOutputStream(new File(path, NEW_RAFT_META_CONF))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use the Files api instead.
|
@szetszwo I have updated the PR according to the comments,thanks for your review! |
szetszwo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 change looks good.
|
@chunxiaozheng , thanks again for working on this! Could you post a comment on RATIS-1860 so that we can assign the JIRA to you? |
@szetszwo of course, but the page display |
|
@chunxiaozheng , I see. What username have you submitted? |
@szetszwo I'm very sorry, I just saw the email, my registered username is |
|
@chunxiaozheng , don't worry. Thank you! |
…ache#901) * Add ratis-shell cmd to generate a new raft-meta.conf. * Remove index option. * Style fix. * Use GroupCommand as the parent of RaftMetaConfCommand. * Modified by review comments. * Update cli.md * Use LocalCommand as the parent of RaftMetaConfCommand. * ReConstruct some abstract command. * Modified by review comments. * Use println instead of printf. * Checkstyle error fix(delete some unuse import). --------- Co-authored-by: idellzheng <idellzheng@tencent.com>
### What changes were proposed in this pull request? Bump Ratis version from 2.5.1 to 3.0.1. Address incompatible changes: - RATIS-589. Eliminate buffer copying in SegmentedRaftLogOutputStream.(apache/ratis#964) - RATIS-1677. Do not auto format RaftStorage in RECOVER.(apache/ratis#718) - RATIS-1710. Refactor metrics api and implementation to separated modules. (apache/ratis#749) ### Why are the changes needed? Bump Ratis version from 2.5.1 to 3.0.1. Ratis has released v3.0.0, v3.0.1, which release note refers to [3.0.0](https://ratis.apache.org/post/3.0.0.html), [3.0.1](https://ratis.apache.org/post/3.0.1.html). The 3.0.x version include new features like pluggable metrics and lease read, etc, some improvements and bugfixes including: - 3.0.0: Change list of ratis 3.0.0 In total, there are roughly 100 commits diffing from 2.5.1 including: - Incompatible Changes - RaftStorage Auto-Format - RATIS-1677. Do not auto format RaftStorage in RECOVER. (apache/ratis#718) - RATIS-1694. Fix the compatibility issue of RATIS-1677. (apache/ratis#731) - RATIS-1871. Auto format RaftStorage when there is only one directory configured. (apache/ratis#903) - Pluggable Ratis-Metrics (RATIS-1688) - RATIS-1689. Remove the use of the thirdparty Gauge. (apache/ratis#728) - RATIS-1692. Remove the use of the thirdparty Counter. (apache/ratis#732) - RATIS-1693. Remove the use of the thirdparty Timer. (apache/ratis#734) - RATIS-1703. Move MetricsReporting and JvmMetrics to impl. (apache/ratis#741) - RATIS-1704. Fix SuppressWarnings(“VisibilityModifier”) in RatisMetrics. (apache/ratis#742) - RATIS-1710. Refactor metrics api and implementation to separated modules. (apache/ratis#749) - RATIS-1712. Add a dropwizard 3 implementation of ratis-metrics-api. (apache/ratis#751) - RATIS-1391. Update library dropwizard.metrics version to 4.x (apache/ratis#632) - RATIS-1601. Use the shaded dropwizard metrics and remove the dependency (apache/ratis#671) - Streaming Protocol Change - RATIS-1569. Move the asyncRpcApi.sendForward(..) call to the client side. (apache/ratis#635) - New Features - Leader Lease (RATIS-1864) - RATIS-1865. Add leader lease bound ratio configuration (apache/ratis#897) - RATIS-1866. Maintain leader lease after AppendEntries (apache/ratis#898) - RATIS-1894. Implement ReadOnly based on leader lease (apache/ratis#925) - RATIS-1882. Support read-after-write consistency (apache/ratis#913) - StateMachine API - RATIS-1874. Add notifyLeaderReady function in IStateMachine (apache/ratis#906) - RATIS-1897. Make TransactionContext available in DataApi.write(..). (apache/ratis#930) - New Configuration Properties - RATIS-1862. Add the parameter whether to take Snapshot when stopping to adapt to different services (apache/ratis#896) - RATIS-1930. Add a conf for enable/disable majority-add. (apache/ratis#961) - RATIS-1918. Introduces parameters that separately control the shutdown of RaftServerProxy by JVMPauseMonitor. (apache/ratis#950) - RATIS-1636. Support re-config ratis properties (apache/ratis#800) - RATIS-1860. Add ratis-shell cmd to generate a new raft-meta.conf. (apache/ratis#901) - Improvements & Bug Fixes - Netty - RATIS-1898. Netty should use EpollEventLoopGroup by default (apache/ratis#931) - RATIS-1899. Use EpollEventLoopGroup for Netty Proxies (apache/ratis#932) - RATIS-1921. Shared worker group in WorkerGroupGetter should be closed. (apache/ratis#955) - RATIS-1923. Netty: atomic operations require side-effect-free functions. (apache/ratis#956) - RaftServer - RATIS-1924. Increase the default of raft.server.log.segment.size.max. (apache/ratis#957) - RATIS-1892. Unify the lifetime of the RaftServerProxy thread pool (apache/ratis#923) - RATIS-1889. NoSuchMethodError: RaftServerMetricsImpl.addNumPendingRequestsGauge apache/ratis#922 (apache/ratis#922) - RATIS-761. Handle writeStateMachineData failure in leader. (apache/ratis#927) - RATIS-1902. The snapshot index is set incorrectly in InstallSnapshotReplyProto. (apache/ratis#933) - RATIS-1912. Fix infinity election when perform membership change. (apache/ratis#954) - RATIS-1858. Follower keeps logging first election timeout. (apache/ratis#894) - 3.0.1:This is a bugfix release. See the [changes between 3.0.0 and 3.0.1](apache/ratis@ratis-3.0.0...ratis-3.0.1) releases. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Cluster manual test. Closes #2480 from SteNicholas/CELEBORN-1400. Authored-by: SteNicholas <programgeek@163.com> Signed-off-by: Shuang <lvshuang.xjs@alibaba-inc.com>
### What changes were proposed in this pull request? Add `local` command in `celeborn_ratis_shell.md` to sync document [cli.md](https://github.com/apache/ratis/blob/ratis-3.0.1/ratis-docs/src/site/markdown/cli.md). ### Why are the changes needed? Celeborn has already bumped Ratis version from 2.5.1 to 3.0.1. Ratis v3.0.1 supports `local` command to process local operation, which no need to connect to ratis server. `celeborn_ratis_shell.md` should add local command to guide users to process local operation. Backport: apache/ratis#901 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? No. Closes #2575 from SteNicholas/CELEBORN-1466. Authored-by: SteNicholas <programgeek@163.com> Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
### What changes were proposed in this pull request? Add `local` command in `celeborn_ratis_shell.md` to sync document [cli.md](https://github.com/apache/ratis/blob/ratis-3.0.1/ratis-docs/src/site/markdown/cli.md). ### Why are the changes needed? Celeborn has already bumped Ratis version from 2.5.1 to 3.0.1. Ratis v3.0.1 supports `local` command to process local operation, which no need to connect to ratis server. `celeborn_ratis_shell.md` should add local command to guide users to process local operation. Backport: apache/ratis#901 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? No. Closes #2575 from SteNicholas/CELEBORN-1466. Authored-by: SteNicholas <programgeek@163.com> Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com> (cherry picked from commit e4d3b80) Signed-off-by: mingji <fengmingxiao.fmx@alibaba-inc.com>
What changes were proposed in this pull request?
By reading the code related to updating peers, I found that the currentConf in the ConfigurationManager class maintains the current peers information, and a TreeMap object maintains the historical peers information, the TreeMap object uses the log entry index as the key. Normally, if the index of the newly added Configuration is the largest, the currentConf will be set to the latest added configuration information. Therefore, as long as modify the raft-meta.conf, follow the above rules, set the appropriate index and the latest peering information, we can solve the problem raised in RATIS-1860.
Through analysis, reading the index in old raft-meta.conf and add 1 as the new index is a suitable choice. If we directly use the index in the old raft-meta.conf, an exception will be throw during Preconditions.assertTrue(found.equals(conf)).If we directly set a larger value, such as Long.MAX_Value, an exception will be throw when switch leader.
In this PR, a ratis-shell cmd has been added, which will read raft-meta.conf in the specified path, get the log entry index and construct a new index(old index + 1), then use the new index and the newly passed peers to construct a new raft-meta.conf file to support normal startup when moving log entries and checkpoints to other nodes.
In addition, it is also supported to directly pass an new index, which can be used directly when generating a new raft-meta.conf file. However, this method requires observing the old index. If the input index is inappropriate, it may also cause the new node to not start properly.
What is the link to the Apache JIRA
(Please create an issue in ASF JIRA before opening a pull request,
and you need to set the title of the pull request which starts with
the corresponding JIRA issue number. (e.g. RATIS-XXXX. Fix a typo in YYY.)
Please replace this section with the link to the Apache JIRA)
How was this patch tested?
(Please explain how this patch was tested. Ex: unit tests, manual tests)
(If this patch involves UI changes, please attach a screen-shot; otherwise, remove this)