@sandeepvinayak
Contributor

@sandeepvinayak sandeepvinayak commented Mar 3, 2021

Please refer to the Jira issue for the description.
https://issues.apache.org/jira/browse/HBASE-25627

@Apache-HBase

💔 -1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 6m 49s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ branch-1 Compile Tests _
+0 🆗 mvndep 2m 26s Maven dependency ordering for branch
+1 💚 mvninstall 8m 1s branch-1 passed
+1 💚 compile 1m 13s branch-1 passed with JDK Azul Systems, Inc.-1.8.0_262-b19
+1 💚 compile 1m 20s branch-1 passed with JDK Azul Systems, Inc.-1.7.0_272-b10
+1 💚 checkstyle 2m 10s branch-1 passed
+1 💚 shadedjars 3m 2s branch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 1m 10s branch-1 passed with JDK Azul Systems, Inc.-1.8.0_262-b19
+1 💚 javadoc 1m 15s branch-1 passed with JDK Azul Systems, Inc.-1.7.0_272-b10
+0 🆗 spotbugs 2m 45s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 4m 7s branch-1 passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 16s Maven dependency ordering for patch
-1 ❌ mvninstall 1m 13s root in the patch failed.
-1 ❌ compile 0m 35s hbase-server in the patch failed with JDK Azul Systems, Inc.-1.8.0_262-b19.
-1 ❌ javac 0m 35s hbase-server in the patch failed with JDK Azul Systems, Inc.-1.8.0_262-b19.
-1 ❌ compile 0m 42s hbase-server in the patch failed with JDK Azul Systems, Inc.-1.7.0_272-b10.
-1 ❌ javac 0m 42s hbase-server in the patch failed with JDK Azul Systems, Inc.-1.7.0_272-b10.
+1 💚 checkstyle 0m 13s The patch passed checkstyle in hbase-hadoop-compat
+1 💚 checkstyle 0m 15s hbase-hadoop2-compat: The patch generated 0 new + 1 unchanged - 8 fixed = 1 total (was 9)
+1 💚 checkstyle 1m 27s hbase-server: The patch generated 0 new + 1 unchanged - 1 fixed = 1 total (was 2)
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
-1 ❌ shadedjars 1m 54s patch has 14 errors when building our shaded downstream artifacts.
-1 ❌ hadoopcheck 1m 12s The patch causes 14 errors with Hadoop v2.8.5.
-1 ❌ hadoopcheck 3m 7s The patch causes 14 errors with Hadoop v2.9.2.
+1 💚 javadoc 0m 59s the patch passed with JDK Azul Systems, Inc.-1.8.0_262-b19
+1 💚 javadoc 1m 13s the patch passed with JDK Azul Systems, Inc.-1.7.0_272-b10
-1 ❌ findbugs 0m 46s hbase-server in the patch failed.
_ Other Tests _
+1 💚 unit 0m 28s hbase-hadoop-compat in the patch passed.
+1 💚 unit 0m 40s hbase-hadoop2-compat in the patch passed.
-1 ❌ unit 0m 46s hbase-server in the patch failed.
+1 💚 asflicense 0m 37s The patch does not generate ASF License warnings.
51m 33s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3009/1/artifact/out/Dockerfile
GITHUB PR #3009
JIRA Issue HBASE-25627
Optional Tests dupname asflicense javac javadoc unit spotbugs findbugs shadedjars hadoopcheck hbaseanti checkstyle compile
uname Linux 4e33ab635cbe 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality /home/jenkins/jenkins-home/workspace/Base-PreCommit-GitHub-PR_PR-3009/out/precommit/personality/provided.sh
git revision branch-1 / 4cfbf19
Default Java Azul Systems, Inc.-1.7.0_272-b10
Multi-JDK versions /usr/lib/jvm/zulu-8-amd64:Azul Systems, Inc.-1.8.0_262-b19 /usr/lib/jvm/zulu-7-amd64:Azul Systems, Inc.-1.7.0_272-b10
mvninstall https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3009/1/artifact/out/patch-mvninstall-root.txt
compile https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3009/1/artifact/out/patch-compile-hbase-server-jdkAzulSystems,Inc.-1.8.0_262-b19.txt
javac https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3009/1/artifact/out/patch-compile-hbase-server-jdkAzulSystems,Inc.-1.8.0_262-b19.txt
compile https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3009/1/artifact/out/patch-compile-hbase-server-jdkAzulSystems,Inc.-1.7.0_272-b10.txt
javac https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3009/1/artifact/out/patch-compile-hbase-server-jdkAzulSystems,Inc.-1.7.0_272-b10.txt
shadedjars https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3009/1/artifact/out/patch-shadedjars.txt
hadoopcheck https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3009/1/artifact/out/patch-javac-2.8.5.txt
hadoopcheck https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3009/1/artifact/out/patch-javac-2.9.2.txt
findbugs https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3009/1/artifact/out/patch-findbugs-hbase-server.txt
unit https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3009/1/artifact/out/patch-unit-hbase-server.txt
Test Results https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3009/1/testReport/
Max. process+thread count 181 (vs. ulimit of 10000)
modules C: hbase-hadoop-compat hbase-hadoop2-compat hbase-server U: .
Console output https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3009/1/console
versions git=1.9.1 maven=3.0.5 findbugs=3.0.1
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

Contributor

@wchevreuil wchevreuil left a comment

Does this affect branch-1 only? Shouldn't we target master as the primary branch for the change, then backport it to all relevant lower branches?

}

@Override
public void setPeerZkConnectionFailures(boolean success) {
Contributor

Since this is a numeric metric, we should comply with the other numeric metrics and provide an incrPeerZkConnectionFailures method, rather than a setter.

Contributor Author

@wchevreuil This was intentional, since we want to reset it to zero once we get a successful connection.
Do you suggest having a separate method for the reset?

Contributor

This logic in a global source doesn't give the intended result here because peerZkConnectionFailures is reset to 0 if any source succeeds. That doesn't seem right.
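
To make the concern concrete, here is a minimal sketch of how a shared counter can hide a stuck peer. It assumes the boolean setter increments on `false` and resets on `true`, which is how the thread describes it; `SharedFailureMetric` and `SharedResetDemo` are hypothetical names, not HBase classes.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a metric shared by all replication sources.
final class SharedFailureMetric {
  private final AtomicLong failures = new AtomicLong();

  // Assumed semantics of the setter under review: success resets, failure increments.
  void setPeerZkConnectionFailures(boolean success) {
    if (success) {
      failures.set(0);
    } else {
      failures.incrementAndGet();
    }
  }

  long value() {
    return failures.get();
  }
}

final class SharedResetDemo {
  // Simulates the source for peer A failing three times, then peer B connecting once.
  static long run() {
    SharedFailureMetric global = new SharedFailureMetric();
    for (int i = 0; i < 3; i++) {
      global.setPeerZkConnectionFailures(false); // peer A cannot be reached
    }
    global.setPeerZkConnectionFailures(true);    // peer B connects and wipes A's failures
    return global.value();                       // peer A is still stuck at this point
  }
}
```

In this sketch the shared value ends at zero even though peer A never connected, which is the masking effect described above.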

Contributor

@apurtell apurtell Mar 3, 2021

Yes, add a separate method for set, but set methods on counter metrics accept the integer value as the new value for the metric, not a weird boolean. Set and Increment is fine. Set(0) to reset.
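
A rough sketch of the shape suggested here, assuming a plain `AtomicLong` behind the metric. The class and method names are illustrative only and are not taken from the patch; `current()` exists purely so the sketch can be exercised.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical counter following the "increment plus numeric set" suggestion.
final class PeerConnectionFailureCounter {
  private final AtomicLong failures = new AtomicLong();

  // Called on each failed connection attempt.
  void incrPeerConnectionFailures() {
    failures.incrementAndGet();
  }

  // Takes the new value directly; set(0) is the reset.
  void setPeerConnectionFailures(long value) {
    failures.set(value);
  }

  // Demo-only accessor.
  long current() {
    return failures.get();
  }
}
```

With this shape the caller increments on each failed attempt and calls `setPeerConnectionFailures(0)` after a successful connection, so no boolean flag is needed.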

while (this.isSourceActive() && this.peerClusterId == null) {
this.peerClusterId = replicationEndpoint.getPeerUUID();
if (this.isSourceActive() && this.peerClusterId == null) {
metrics.setPeerZkConnectionFailures(false);
Contributor

One of the motivations described in the Jira is that there's no evidence of this condition in the logs. Since log files are normally the first place operators look, is it possible to add a WARN reporting that the peerId info is missing, possibly due to ZK connectivity issues?

Contributor Author

Yes, I agree that a WARN should be there as well, but the metric helps with monitoring and alerting in this case. Since a failed connection to the peer's ZK blocks the whole replication, it is worth monitoring as a metric.

Contributor

Had a brief offline discussion with @sandeepvinayak on how to generalize this. I think the intent here is to capture and flag any sources that are stuck during initialization; a ZK connection failure is just one symptom. So capturing the ZK failure count may not add much value. Tracking the number of uninitialized sources (as a gauge) instead would be much more helpful.

So one way forward is to track the number of such uninitialized sources at a global scope (which the monitoring tooling can flag if it's > 0 for a time window) and then backport https://issues.apache.org/jira/browse/HBASE-22731 to branch-1. Together, these two should help us narrow it down to the right RS and root cause.

Thoughts?
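
One way to picture the gauge idea is the sketch below. It is only an illustration under assumptions: `UninitializedSourcesGauge`, its method names, and the window check are hypothetical and do not come from HBase or from the eventual patch.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical global gauge counting sources that have not finished initializing.
final class UninitializedSourcesGauge {
  private final AtomicInteger uninitialized = new AtomicInteger();

  // Call when a source begins initialization.
  void sourceStarted() {
    uninitialized.incrementAndGet();
  }

  // Call once the source has finished initializing.
  void sourceInitialized() {
    uninitialized.decrementAndGet();
  }

  int value() {
    return uninitialized.get();
  }

  // Monitoring-side check: flag only if every sample in the window is above zero.
  static boolean stuckForWholeWindow(int[] samples) {
    if (samples.length == 0) {
      return false;
    }
    for (int sample : samples) {
      if (sample <= 0) {
        return false;
      }
    }
    return true;
  }
}
```

Unlike a failure counter, the gauge goes up and down, so a success on one source cannot erase the signal from another source that is still stuck.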

Contributor

Agree with @wchevreuil: metrics and logs are consumed differently by operators, and a reasonable request to add a log is not solved by emitting a metric.

Contributor Author

Yeah, I have added a metric as well as a log.
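
For readers following the thread, a minimal sketch of what a WARN in the retry loop could look like. It uses `java.util.logging` and a `Supplier<UUID>` as stand-ins so it is self-contained; the class name, the message text, and the bounded attempt count are assumptions for illustration, whereas the loop under review keeps retrying while the source is active.

```java
import java.util.UUID;
import java.util.function.Supplier;
import java.util.logging.Logger;

// Hypothetical helper; not the code in the patch.
final class PeerIdLookup {
  private static final Logger LOG = Logger.getLogger(PeerIdLookup.class.getName());

  // Asks the endpoint for the peer cluster UUID up to maxAttempts times.
  // Returns null if the UUID was never obtained.
  static UUID awaitPeerUuid(Supplier<UUID> endpoint, String peerId, int maxAttempts) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      UUID uuid = endpoint.get();
      if (uuid != null) {
        return uuid;
      }
      // Leave evidence in the log on every failed attempt.
      LOG.warning("Could not get cluster id for peer " + peerId + " (attempt " + attempt
          + "), possibly due to ZK connectivity issues; will retry");
    }
    return null;
  }
}
```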

public static final String SOURCE_FAILED_RECOVERY_QUEUES = "source.failedRecoverQueues";
/* Used to track the age of oldest wal in ms since its creation time */
String OLDEST_WAL_AGE = "source.oldestWalAge";
public static final String SOURCE_PEER_ZK_CONNECTION_FAILURE = "source.peerZkConnectionFailure";
Contributor

Can we just call this peerConnectionFailure? ZK may or may not be the reason, if not now, then in the future. What we want to count is connection failures, so let the naming reflect that. (We need to care about metric names because they become part of operational compatibility.)

void incrFailedRecoveryQueue();
void setOldestWalAge(long age);
long getOldestWalAge();
void setPeerZkConnectionFailures(boolean success);
Contributor

This is weird. It should be an increment function. See incrFailedRecoveryQueue above as an example.

void setOldestWalAge(long age);
long getOldestWalAge();
void setPeerZkConnectionFailures(boolean success);
long getPeerZkConnectionFailures();
Contributor

Getters are not needed by convention here; please remove.

@apurtell
Contributor

apurtell commented Mar 3, 2021

I also agree with @wchevreuil that starting with branch-1 for an issue that affects all branches is upside down. We should have a master patch and merge it before merging this. Sure, a separate PR may be needed for branch-1 because of code differences, and that is fine, but patch application should proceed in the normal order, which is master -> branch-2 -> releasing branch-2s -> branch-1.

@bharathv
Contributor

bharathv commented Mar 3, 2021

@apurtell I believe our reviews overlapped. I proposed an alternative metric to track: the number of uninitialized sources (stuck during initialization), which we can flag via monitoring right away (along with additional logging from the backport of HBASE-22731).

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 12m 29s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ branch-1 Compile Tests _
+0 🆗 mvndep 2m 25s Maven dependency ordering for branch
+1 💚 mvninstall 8m 4s branch-1 passed
+1 💚 compile 1m 8s branch-1 passed with JDK Azul Systems, Inc.-1.8.0_262-b19
+1 💚 compile 1m 20s branch-1 passed with JDK Azul Systems, Inc.-1.7.0_272-b10
+1 💚 checkstyle 2m 16s branch-1 passed
+1 💚 shadedjars 3m 16s branch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 1m 5s branch-1 passed with JDK Azul Systems, Inc.-1.8.0_262-b19
+1 💚 javadoc 1m 14s branch-1 passed with JDK Azul Systems, Inc.-1.7.0_272-b10
+0 🆗 spotbugs 2m 54s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 4m 17s branch-1 passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 17s Maven dependency ordering for patch
+1 💚 mvninstall 2m 4s the patch passed
+1 💚 compile 1m 8s the patch passed with JDK Azul Systems, Inc.-1.8.0_262-b19
+1 💚 javac 1m 8s the patch passed
+1 💚 compile 1m 20s the patch passed with JDK Azul Systems, Inc.-1.7.0_272-b10
+1 💚 javac 1m 20s the patch passed
+1 💚 checkstyle 0m 13s The patch passed checkstyle in hbase-hadoop-compat
+1 💚 checkstyle 0m 14s hbase-hadoop2-compat: The patch generated 0 new + 1 unchanged - 8 fixed = 1 total (was 9)
+1 💚 checkstyle 1m 44s hbase-server: The patch generated 0 new + 1 unchanged - 1 fixed = 1 total (was 2)
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 shadedjars 3m 5s patch has no errors when building our shaded downstream artifacts.
+1 💚 hadoopcheck 4m 50s Patch does not cause any errors with Hadoop 2.8.5 2.9.2.
+1 💚 javadoc 0m 57s the patch passed with JDK Azul Systems, Inc.-1.8.0_262-b19
+1 💚 javadoc 1m 15s the patch passed with JDK Azul Systems, Inc.-1.7.0_272-b10
+1 💚 findbugs 4m 40s the patch passed
_ Other Tests _
+1 💚 unit 0m 27s hbase-hadoop-compat in the patch passed.
+1 💚 unit 0m 37s hbase-hadoop2-compat in the patch passed.
+1 💚 unit 118m 40s hbase-server in the patch passed.
+1 💚 asflicense 0m 59s The patch does not generate ASF License warnings.
184m 21s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3009/2/artifact/out/Dockerfile
GITHUB PR #3009
JIRA Issue HBASE-25627
Optional Tests dupname asflicense javac javadoc unit spotbugs findbugs shadedjars hadoopcheck hbaseanti checkstyle compile
uname Linux 988d48e1a065 4.15.0-136-generic #140-Ubuntu SMP Thu Jan 28 05:20:47 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality /home/jenkins/jenkins-home/workspace/Base-PreCommit-GitHub-PR_PR-3009/out/precommit/personality/provided.sh
git revision branch-1 / 4cfbf19
Default Java Azul Systems, Inc.-1.7.0_272-b10
Multi-JDK versions /usr/lib/jvm/zulu-8-amd64:Azul Systems, Inc.-1.8.0_262-b19 /usr/lib/jvm/zulu-7-amd64:Azul Systems, Inc.-1.7.0_272-b10
Test Results https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3009/2/testReport/
Max. process+thread count 3694 (vs. ulimit of 10000)
modules C: hbase-hadoop-compat hbase-hadoop2-compat hbase-server U: .
Console output https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3009/2/console
versions git=1.9.1 maven=3.0.5 findbugs=3.0.1
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 11m 49s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ branch-1 Compile Tests _
+0 🆗 mvndep 2m 27s Maven dependency ordering for branch
+1 💚 mvninstall 8m 7s branch-1 passed
+1 💚 compile 1m 8s branch-1 passed with JDK Azul Systems, Inc.-1.8.0_262-b19
+1 💚 compile 1m 18s branch-1 passed with JDK Azul Systems, Inc.-1.7.0_272-b10
+1 💚 checkstyle 2m 7s branch-1 passed
+1 💚 shadedjars 3m 0s branch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 1m 8s branch-1 passed with JDK Azul Systems, Inc.-1.8.0_262-b19
+1 💚 javadoc 1m 17s branch-1 passed with JDK Azul Systems, Inc.-1.7.0_272-b10
+0 🆗 spotbugs 2m 43s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 4m 7s branch-1 passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 17s Maven dependency ordering for patch
+1 💚 mvninstall 1m 58s the patch passed
+1 💚 compile 1m 11s the patch passed with JDK Azul Systems, Inc.-1.8.0_262-b19
+1 💚 javac 1m 11s the patch passed
+1 💚 compile 1m 20s the patch passed with JDK Azul Systems, Inc.-1.7.0_272-b10
+1 💚 javac 1m 20s the patch passed
+1 💚 checkstyle 0m 14s The patch passed checkstyle in hbase-hadoop-compat
+1 💚 checkstyle 0m 16s hbase-hadoop2-compat: The patch generated 0 new + 1 unchanged - 8 fixed = 1 total (was 9)
+1 💚 checkstyle 1m 30s hbase-server: The patch generated 0 new + 1 unchanged - 1 fixed = 1 total (was 2)
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 shadedjars 2m 50s patch has no errors when building our shaded downstream artifacts.
+1 💚 hadoopcheck 4m 39s Patch does not cause any errors with Hadoop 2.8.5 2.9.2.
+1 💚 javadoc 1m 0s the patch passed with JDK Azul Systems, Inc.-1.8.0_262-b19
+1 💚 javadoc 1m 16s the patch passed with JDK Azul Systems, Inc.-1.7.0_272-b10
+1 💚 findbugs 4m 23s the patch passed
_ Other Tests _
+1 💚 unit 0m 28s hbase-hadoop-compat in the patch passed.
+1 💚 unit 0m 40s hbase-hadoop2-compat in the patch passed.
+1 💚 unit 120m 9s hbase-server in the patch passed.
+1 💚 asflicense 1m 19s The patch does not generate ASF License warnings.
184m 10s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3009/3/artifact/out/Dockerfile
GITHUB PR #3009
JIRA Issue HBASE-25627
Optional Tests dupname asflicense javac javadoc unit spotbugs findbugs shadedjars hadoopcheck hbaseanti checkstyle compile
uname Linux efad72cacb2b 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality /home/jenkins/jenkins-home/workspace/Base-PreCommit-GitHub-PR_PR-3009/out/precommit/personality/provided.sh
git revision branch-1 / 4cfbf19
Default Java Azul Systems, Inc.-1.7.0_272-b10
Multi-JDK versions /usr/lib/jvm/zulu-8-amd64:Azul Systems, Inc.-1.8.0_262-b19 /usr/lib/jvm/zulu-7-amd64:Azul Systems, Inc.-1.7.0_272-b10
Test Results https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3009/3/testReport/
Max. process+thread count 4208 (vs. ulimit of 10000)
modules C: hbase-hadoop-compat hbase-hadoop2-compat hbase-server U: .
Console output https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3009/3/console
versions git=1.9.1 maven=3.0.5 findbugs=3.0.1
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@sandeepvinayak
Contributor Author

sandeepvinayak commented Mar 5, 2021

@bharathv @apurtell @wchevreuil @bharathv's suggestion to have a metric at the source level instead makes sense to me. That metric will eventually catch the peer connection failure situation as well.
I have raised a PR for the master branch here. Can you please review it?

Once that is committed, I can change this one on branch-1.

@bharathv
Contributor

@sandeepvinayak Mind refreshing this PR with the latest patch? Thanks.

@sandeepvinayak sandeepvinayak changed the title from "HBASE-25627: HBase replication should have a metric to represent if it cannot talk to peer's zk" to "HBASE-25627: HBase replication should have a metric to represent if the source is stuck getting initialized" on Mar 20, 2021
@Apache-HBase

💔 -1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 1m 12s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ branch-1 Compile Tests _
+0 🆗 mvndep 2m 27s Maven dependency ordering for branch
+1 💚 mvninstall 8m 0s branch-1 passed
+1 💚 compile 1m 9s branch-1 passed with JDK Azul Systems, Inc.-1.8.0_262-b19
+1 💚 compile 1m 21s branch-1 passed with JDK Azul Systems, Inc.-1.7.0_272-b10
+1 💚 checkstyle 2m 6s branch-1 passed
+1 💚 shadedjars 3m 2s branch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 1m 9s branch-1 passed with JDK Azul Systems, Inc.-1.8.0_262-b19
+1 💚 javadoc 1m 16s branch-1 passed with JDK Azul Systems, Inc.-1.7.0_272-b10
+0 🆗 spotbugs 2m 41s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 4m 5s branch-1 passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 18s Maven dependency ordering for patch
+1 💚 mvninstall 1m 55s the patch passed
+1 💚 compile 1m 11s the patch passed with JDK Azul Systems, Inc.-1.8.0_262-b19
+1 💚 javac 1m 11s the patch passed
+1 💚 compile 1m 18s the patch passed with JDK Azul Systems, Inc.-1.7.0_272-b10
+1 💚 javac 1m 18s the patch passed
-1 ❌ checkstyle 1m 31s hbase-server: The patch generated 1 new + 15 unchanged - 2 fixed = 16 total (was 17)
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 shadedjars 2m 47s patch has no errors when building our shaded downstream artifacts.
+1 💚 hadoopcheck 4m 34s Patch does not cause any errors with Hadoop 2.8.5 2.9.2.
+1 💚 javadoc 1m 0s the patch passed with JDK Azul Systems, Inc.-1.8.0_262-b19
+1 💚 javadoc 1m 16s the patch passed with JDK Azul Systems, Inc.-1.7.0_272-b10
+1 💚 findbugs 4m 24s the patch passed
_ Other Tests _
+1 💚 unit 0m 29s hbase-hadoop-compat in the patch passed.
+1 💚 unit 0m 40s hbase-hadoop2-compat in the patch passed.
-1 ❌ unit 100m 13s hbase-server in the patch failed.
+1 💚 asflicense 1m 14s The patch does not generate ASF License warnings.
153m 29s
Reason Tests
Failed junit tests hadoop.hbase.TestCachedClusterId
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3009/4/artifact/out/Dockerfile
GITHUB PR #3009
JIRA Issue HBASE-25627
Optional Tests dupname asflicense javac javadoc unit spotbugs findbugs shadedjars hadoopcheck hbaseanti checkstyle compile
uname Linux b439bb2a9230 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality /home/jenkins/jenkins-home/workspace/Base-PreCommit-GitHub-PR_PR-3009/out/precommit/personality/provided.sh
git revision branch-1 / bea87b3
Default Java Azul Systems, Inc.-1.7.0_272-b10
Multi-JDK versions /usr/lib/jvm/zulu-8-amd64:Azul Systems, Inc.-1.8.0_262-b19 /usr/lib/jvm/zulu-7-amd64:Azul Systems, Inc.-1.7.0_272-b10
checkstyle https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3009/4/artifact/out/diff-checkstyle-hbase-server.txt
unit https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3009/4/artifact/out/patch-unit-hbase-server.txt
Test Results https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3009/4/testReport/
Max. process+thread count 3533 (vs. ulimit of 10000)
modules C: hbase-hadoop-compat hbase-hadoop2-compat hbase-server U: .
Console output https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3009/4/console
versions git=1.9.1 maven=3.0.5 findbugs=3.0.1
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 1m 7s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ branch-1 Compile Tests _
+0 🆗 mvndep 2m 29s Maven dependency ordering for branch
+1 💚 mvninstall 8m 7s branch-1 passed
+1 💚 compile 1m 11s branch-1 passed with JDK Azul Systems, Inc.-1.8.0_262-b19
+1 💚 compile 1m 22s branch-1 passed with JDK Azul Systems, Inc.-1.7.0_272-b10
+1 💚 checkstyle 2m 5s branch-1 passed
+1 💚 shadedjars 2m 59s branch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 1m 9s branch-1 passed with JDK Azul Systems, Inc.-1.8.0_262-b19
+1 💚 javadoc 1m 15s branch-1 passed with JDK Azul Systems, Inc.-1.7.0_272-b10
+0 🆗 spotbugs 2m 46s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 4m 8s branch-1 passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 17s Maven dependency ordering for patch
+1 💚 mvninstall 1m 53s the patch passed
+1 💚 compile 1m 11s the patch passed with JDK Azul Systems, Inc.-1.8.0_262-b19
+1 💚 javac 1m 11s the patch passed
+1 💚 compile 1m 18s the patch passed with JDK Azul Systems, Inc.-1.7.0_272-b10
+1 💚 javac 1m 18s the patch passed
+1 💚 checkstyle 0m 14s The patch passed checkstyle in hbase-hadoop-compat
+1 💚 checkstyle 0m 15s The patch passed checkstyle in hbase-hadoop2-compat
+1 💚 checkstyle 1m 28s hbase-server: The patch generated 0 new + 15 unchanged - 2 fixed = 15 total (was 17)
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 shadedjars 2m 48s patch has no errors when building our shaded downstream artifacts.
+1 💚 hadoopcheck 4m 38s Patch does not cause any errors with Hadoop 2.8.5 2.9.2.
+1 💚 javadoc 1m 1s the patch passed with JDK Azul Systems, Inc.-1.8.0_262-b19
+1 💚 javadoc 1m 17s the patch passed with JDK Azul Systems, Inc.-1.7.0_272-b10
+1 💚 findbugs 4m 24s the patch passed
_ Other Tests _
+1 💚 unit 0m 28s hbase-hadoop-compat in the patch passed.
+1 💚 unit 0m 40s hbase-hadoop2-compat in the patch passed.
+1 💚 unit 104m 0s hbase-server in the patch passed.
+1 💚 asflicense 1m 19s The patch does not generate ASF License warnings.
157m 27s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3009/5/artifact/out/Dockerfile
GITHUB PR #3009
JIRA Issue HBASE-25627
Optional Tests dupname asflicense javac javadoc unit spotbugs findbugs shadedjars hadoopcheck hbaseanti checkstyle compile
uname Linux 386c8a914803 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality /home/jenkins/jenkins-home/workspace/Base-PreCommit-GitHub-PR_PR-3009/out/precommit/personality/provided.sh
git revision branch-1 / f807800
Default Java Azul Systems, Inc.-1.7.0_272-b10
Multi-JDK versions /usr/lib/jvm/zulu-8-amd64:Azul Systems, Inc.-1.8.0_262-b19 /usr/lib/jvm/zulu-7-amd64:Azul Systems, Inc.-1.7.0_272-b10
Test Results https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3009/5/testReport/
Max. process+thread count 3703 (vs. ulimit of 10000)
modules C: hbase-hadoop-compat hbase-hadoop2-compat hbase-server U: .
Console output https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3009/5/console
versions git=1.9.1 maven=3.0.5 findbugs=3.0.1
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@sandeepvinayak sandeepvinayak changed the title from "HBASE-25627: HBase replication should have a metric to represent if the source is stuck getting initialized" to "HBASE-25627: [Backport]HBase replication should have a metric to represent if the source is stuck getting initialized" on Mar 21, 2021
@sandeepvinayak
Contributor Author

fyi @bharathv

Contributor

@bharathv bharathv left a comment

I'll let other reviewers take a look or merge by EOD.

@bharathv bharathv merged commit 97c152e into apache:branch-1 Mar 23, 2021