metrics: enhance diagnostic capabilities for gRPC network issues (#67811)#68462
Conversation
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
|
@zyguan This PR has conflicts, I have hold it. |
|
@ti-chi-bot: ## If you want to know how to resolve it, please read the guide in TiDB Dev Guide. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository. |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (4)
📝 WalkthroughWalkthroughAdds an internal singleton gRPC Channelz Prometheus collector wired into RegisterMetrics, test infrastructure and tests for the collector, BUILD/test dependency updates, and goleak IgnoreTopFunction entries across multiple TestMain files to suppress known gRPC background goroutines. ChangesgRPC Channelz Metrics Collection
Sequence DiagramsequenceDiagram
participant RegisterMetrics
participant setupChannelzCollector
participant initGrpcChannelzCollectorLocked
participant bufconnListener
participant grpcServer
participant grpcClient
participant PrometheusRegistry
RegisterMetrics->>setupChannelzCollector: invoke
setupChannelzCollector->>initGrpcChannelzCollectorLocked: initialize once (non-test)
initGrpcChannelzCollectorLocked->>bufconnListener: create listener
initGrpcChannelzCollectorLocked->>grpcServer: register channelz service & serve
initGrpcChannelzCollectorLocked->>grpcClient: dial via bufconn
initGrpcChannelzCollectorLocked->>PrometheusRegistry: create/register collector with filter
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/metrics/BUILD.bazel`:
- Around line 67-71: The BUILD file contains unresolved git merge markers around
the shard_count setting which breaks Bazel parsing; open the BUILD file and
remove the conflict markers and choose the correct value for shard_count (either
keep shard_count = 4 or shard_count = 8) so the stanza is a single valid
attribute assignment, ensuring the final line uses the chosen shard_count and no
"<<<<<<<", "=======", or ">>>>>>>" tokens remain.
In `@pkg/metrics/metrics_internal_test.go`:
- Around line 22-27: The test file contains unresolved git conflict markers
(<<<<<<<, =======, >>>>>>>) in the import block and later code (around the
import lines that mention "github.com/pingcap/tidb/pkg/util/intest",
"github.com/prometheus/client_golang/prometheus", dto alias) and between lines
~35-173; remove all conflict markers and reconcile the two versions so the
imports and test code form a valid Go file — ensure you keep the correct set of
imports (combine needed ones like intest, prometheus, dto) and the intended test
code paths, delete the marker lines, and run `go test` to verify compilation.
In `@pkg/metrics/metrics.go`:
- Around line 347-375: The file contains leftover git merge markers (<<<<<<<,
=======, >>>>>>>) inside the RegisterMetrics block which prevents compilation;
remove the conflict markers and keep the intended registrations and calls
(ensure prometheus.MustRegister calls for GlobalMemArbitrationDuration,
GlobalMemArbitratorWorkMode, GlobalMemArbitratorQuota,
GlobalMemArbitratorWaitingTask, GlobalMemArbitratorRuntimeMemMagnifi,
GlobalMemArbitratorRootPool, GlobalMemArbitratorEventCounter,
GlobalMemArbitratorTaskExecCounter, TLSVersion, TLSCipher,
IndexLookUpExecutorDuration, IndexLookRowsCounter, IndexLookUpExecutorRowNumber,
IndexLookUpCopTaskCount, StmtSummaryWindowRecordCount,
StmtSummaryWindowEvictedCount are present) and the setupChannelzCollector()
invocation inside RegisterMetrics; remove the merge markers and leave the final
intended code so RegisterMetrics compiles cleanly.
In `@pkg/server/main_test.go`:
- Around line 75-84: The file contains unresolved Git merge conflict markers
(<<<<<<<, =======, >>>>>>>) around the goleak.IgnoreTopFunction entries in
pkg/server/main_test.go; remove the conflict markers and keep the intended set
of ignore entries (the combined/desired goleak.IgnoreTopFunction lines such as
google.golang.org/grpc.(*addrConn).resetTransport,
google.golang.org/grpc.(*ccBalancerWrapper).watcher,
google.golang.org/grpc/internal/transport.(*controlBuffer).get,
google.golang.org/grpc/internal/transport.(*http2Client).keepalive,
google.golang.org/grpc/internal/grpcsync.(*CallbackSerializer).run,
google.golang.org/grpc/test/bufconn.(*Listener).Accept,
github.com/tikv/client-go/v2/txnkv/transaction.keepAlive) by deleting the
"<<<<<<<", "=======", and ">>>>>>>" markers and leaving the correct
IgnoreTopFunction lines; then run gofmt/goimports or go vet to ensure the test
file is valid Go.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: f0adcc76-18bd-483b-969b-32d039572acb
📒 Files selected for processing (14)
br/cmd/br/main_test.gopkg/metrics/BUILD.bazelpkg/metrics/main_test.gopkg/metrics/metrics.gopkg/metrics/metrics_internal_test.gopkg/server/handler/extractorhandler/main_test.gopkg/server/handler/optimizor/main_test.gopkg/server/handler/tests/main_test.gopkg/server/main_test.gopkg/server/tests/commontest/main_test.gopkg/server/tests/cursor/main_test.gopkg/server/tests/main_test.gopkg/server/tests/standby/main_test.gopkg/server/tests/tls/main_test.go
Signed-off-by: zyguan <zhongyangguan@gmail.com>
|
/unhold |
|
/retest |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## release-nextgen-20251011 #68462 +/- ##
=============================================================
Coverage ? 71.8902%
=============================================================
Files ? 1835
Lines ? 493715
Branches ? 0
=============================================================
Hits ? 354933
Misses ? 115435
Partials ? 23347
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: 3pointer, cfzjywxk, nolouch The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
56385fd
into
pingcap:release-nextgen-20251011
This is an automated cherry-pick of #67811
What problem does this PR solve?
Issue Number: close #67810
Problem Summary: ref #67810
What changed and how does it work?
Bump client-go and register channelz collector.
Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.
Summary by CodeRabbit
New Features
Tests