prevent PANIC when leader is None #450
Happened in tests with TiFS:
thread 'tokio-runtime-worker' panicked at /home/uli/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tikv-client-0.3.0/src/pd/cluster.rs:270:47: called `Option::unwrap()` on a `None` value
Signed-off-by: Ulrich Hornung <hornunguli@gmx.de>
pingyu
left a comment
The panic seems to happen only when PD returns a GetMembersResponse without the leader field. But I think that should not happen, as only the PD leader can return a successful response to a get members request.
So I think the root cause is that we actually get a GetMembersResponse with an error (see GrpcServer.GetMembers). We need to check header.error first.
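To illustrate the order of checks suggested above, here is a minimal sketch. It does not use the real tikv-client or kvproto types: `PdError`, `ResponseHeader`, `Member`, `GetMembersResponse`, `ClientError` and `leader_of` are all hypothetical stand-ins, assumed only to have an optional header error and an optional leader.

```rust
// Hypothetical stand-ins for the PD response types; the real generated
// protobuf types used by tikv-client are different.
#[derive(Debug, Clone, PartialEq)]
struct PdError {
    message: String,
}

#[derive(Debug, Clone, Default)]
struct ResponseHeader {
    error: Option<PdError>,
}

#[derive(Debug, Clone, PartialEq)]
struct Member {
    name: String,
}

#[derive(Debug, Clone, Default)]
struct GetMembersResponse {
    header: Option<ResponseHeader>,
    leader: Option<Member>,
}

// Hypothetical client-side error type for this sketch.
#[derive(Debug, PartialEq)]
enum ClientError {
    Internal(String),
}

/// Returns the leader of a get-members response without panicking.
fn leader_of(resp: &GetMembersResponse) -> Result<&Member, ClientError> {
    // 1. Check the header first: a response that carries an error is not
    //    expected to contain a usable leader field.
    if let Some(err) = resp.header.as_ref().and_then(|h| h.error.as_ref()) {
        return Err(ClientError::Internal(format!(
            "get members failed: {}",
            err.message
        )));
    }
    // 2. Only then look at the leader, and turn `None` into an error
    //    instead of calling `unwrap()`.
    resp.leader
        .as_ref()
        .ok_or_else(|| ClientError::Internal("no leader in GetMembersResponse".to_owned()))
}
```

With this shape, a response whose header carries an error is reported as that error, and a missing leader becomes an `Err` that the caller can handle or retry on.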
/// No leader is found for the given id.
#[error("Leader of region {} is not found", region_id)]
LeaderNotFound { region_id: u64 },
LeaderOfRegionNotFound { region_id: u64 },
I suggest not changing this error, as existing user code may already be using it.
There is no concept of a "leader of cluster", only "region leader" or "PD leader", so the name would be a little confusing.
Also, it does not seem necessary to add a new error variant unless you want to handle this specific error. I think using InternalError, as the existing PD code does, would be enough.
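As a sketch of that suggestion: the existing public variant stays untouched and the missing PD leader is reported through a generic internal error. The enum below is a simplified stand-in, not the crate's actual definition; the shape of `InternalError`, its message format, and the helper `require_pd_leader` are assumptions made for illustration.

```rust
use std::fmt;

// Simplified, hypothetical error enum. Only the `LeaderNotFound` variant
// and its message are taken from the diff under review.
#[derive(Debug, PartialEq)]
enum Error {
    /// No leader is found for the given id (kept unchanged for existing users).
    LeaderNotFound { region_id: u64 },
    /// Catch-all for unexpected internal conditions (assumed shape).
    InternalError { message: String },
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::LeaderNotFound { region_id } => {
                write!(f, "Leader of region {} is not found", region_id)
            }
            Error::InternalError { message } => write!(f, "internal error: {}", message),
        }
    }
}

impl std::error::Error for Error {}

/// Hypothetical helper: converts a missing PD leader into `InternalError`
/// instead of panicking or introducing a new public variant.
fn require_pd_leader<T>(leader: Option<T>) -> Result<T, Error> {
    leader.ok_or_else(|| Error::InternalError {
        message: "PD leader not found".to_owned(),
    })
}
```

Callers that match on `LeaderNotFound` keep compiling unchanged, which is the compatibility concern raised in the review.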
I will test and report back if I still see it. Thanks for the notification. I assume this PR is obsolete then? Or should I still prepare some parts of it for merging?
It is obsolete now. Unless it's not the reason of
Hello,
I'm currently experimenting with TiFS (https://github.com/Hexilee/tifs).
When copying large files, my local TiKV cluster is under constant load for a longer period of time (~1-2 hours).
During these tests I came across a panic caused by the Rust client. I located the code responsible for it and implemented a fix.
Please have a look and tell me what you think and what I should change to get it into the main branch.
Thanks and best regards,
cre4ture
log entry of the panic: