debugging
Pre-Alpha. This page describes behavior that may change.
Debugging Ze is mostly a matter of knowing which tool to reach for. The daemon has a lot of surfaces, and the right tool depends on what you are trying to see. This page lists the tools that pay off most often.
| Tool | What it is for |
|---|---|
| `ze config dump` | Inspect parsed configuration. |
| `ze config validate` | Check a config file without starting the daemon. |
| `ze.log.*` | Per-subsystem debug logging. |
| `ze cli monitor event` | Live event stream, filterable. |
| `ze cli monitor bgp` | Live peer dashboard. |
| `ze show warnings`, `ze show errors` | Operational report bus. |
| `ze bgp decode` | Decode raw BGP wire bytes into structured JSON. |
| `ze bgp encode` | Encode route commands into BGP wire bytes. |
| `ze --pprof <addr:port>` | Start the pprof HTTP profiler. |
| `ze signal quit` | Goroutine dump to stderr (kills the daemon). |
| `ze-test --server N` / `--client N` | Interactive functional test debugging. |
Most Ze debugging starts with more log output. The levels are hierarchical, so you can be specific.
ze -d config.conf # Base level debug
ZE_LOG_BGP_REACTOR=debug ze config.conf # Just the reactor
ZE_LOG_BGP_REACTOR_PEER=debug ze config.conf # Just peer lifecycle

At runtime, without a restart:
ze cli -c "bgp log set bgp.reactor debug"
ze cli -c "bgp log set bgp.fsm debug"
ze cli -c "bgp log levels"

The hierarchical naming means bgp.reactor=debug turns on debug for every bgp.reactor.* child subsystem at once. Use this when you need to see everything a single area is doing.
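The two naming schemes above look mechanically related: ZE_LOG_BGP_REACTOR_PEER lines up with the dotted name bgp.reactor.peer. The sketch below assumes that mapping (upper-case, dots to underscores, a ZE_LOG_ prefix) holds for every subsystem, which this page does not state outright, and models the hierarchical match as a plain dotted-prefix test. Both function names are hypothetical helpers, not part of Ze.

```shell
#!/bin/sh
# Hypothetical helpers; the mapping rule is inferred from the examples above.

# Print the environment variable that would carry a subsystem's log level,
# e.g. bgp.reactor.peer -> ZE_LOG_BGP_REACTOR_PEER (assumed pattern).
ze_log_var() {
    printf 'ZE_LOG_%s\n' \
        "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | tr '.' '_')"
}

# Succeed when a level set on $1 would also apply to subsystem $2,
# i.e. $2 is $1 itself or one of its dotted children.
log_level_covers() {
    case "$2" in
        "$1" | "$1".*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Under these assumptions, `log_level_covers bgp.reactor bgp.reactor.peer` succeeds while `log_level_covers bgp.reactor bgp.fsm` does not, which is the behavior the paragraph above describes.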
When you are chasing a runtime bug that depends on what a peer is sending, the event stream is usually the fastest way to see it.
ze cli monitor event peer upstream include update direction received
ze cli monitor event include state # All peer state changes
ze cli monitor event exclude keepalive # Everything except heartbeats

Pipe through | json for structured output, | match <regex> for a filter, | no-more for no paging.
If the daemon is not doing what the config says, the first question is "did the config parse the way I think it did?" ze config dump reads the file, parses it, and prints what Ze actually sees.
ze config dump config.conf
ze config dump --json config.conf # Structured output

This is where you catch typos that survived validation, misplaced blocks that landed under the wrong container, and values that inherited differently than you expected.
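When a config change is the suspect, comparing the dump before and after the change narrows things down quickly. The following is a minimal sketch that uses only `ze config dump <file>` as shown above; `config_dump_diff` is a hypothetical helper, and it assumes the dump output is stable enough between runs for a line diff to be meaningful.

```shell
#!/bin/sh
# Hypothetical helper: diff what Ze parsed from two config files.
# Returns 0 if the dumps are identical, 1 if they differ, 2 on error.
config_dump_diff() {
    old_dump=$(mktemp) || return 2
    new_dump=$(mktemp) || { rm -f "$old_dump"; return 2; }
    status=0
    if ze config dump "$1" >"$old_dump" && ze config dump "$2" >"$new_dump"; then
        # diff exits 1 when the files differ; keep that as our status.
        diff -u "$old_dump" "$new_dump" || status=$?
    else
        status=2    # one of the dumps failed, so there is nothing to compare
    fi
    rm -f "$old_dump" "$new_dump"
    return "$status"
}

# Usage (file names are placeholders):
#   config_dump_diff config.conf.bak config.conf
```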
When you have hex bytes from a packet capture, a log, or a peer that sent something odd, ze bgp decode turns them into structured JSON.
ze bgp decode --update <hex>
ze bgp decode --open <hex>
ze bgp decode --nlri ipv4/unicast <hex>

Every path attribute that Ze knows how to parse gets decoded. The output format is the same JSON envelope that shows up in the live event stream, which makes comparisons across the two surfaces straightforward.
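Hex copied out of a capture tool or a log often arrives with colons, spaces, or `0x` prefixes. This page does not say which of those forms `ze bgp decode` tolerates, so reducing the input to bare digits first is a conservative step. `normalize_hex` below is a hypothetical helper, not a Ze command.

```shell
#!/bin/sh
# Hypothetical helper: reduce pasted hex to bare lower-case digits.
normalize_hex() {
    hex=$(printf '%s' "$1" \
        | tr '[:upper:]' '[:lower:]' \
        | sed 's/0x//g' \
        | tr -cd '0123456789abcdef')
    # An empty or odd-length string cannot be a whole number of bytes.
    if [ -z "$hex" ] || [ $(( ${#hex} % 2 )) -ne 0 ]; then
        echo "normalize_hex: expected a non-empty, even number of hex digits" >&2
        return 1
    fi
    printf '%s\n' "$hex"
}

# Usage ($pasted holds whatever was copied from the capture):
#   ze bgp decode --update "$(normalize_hex "$pasted")"
```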
Going the other way, ze bgp encode takes a human-readable route command and produces the wire bytes.
ze bgp encode --nlri ipv4/unicast "announce route 10.0.0.0/24 next-hop 192.168.1.1"

Round-tripping (encode, then decode, then compare) is a good sanity check when you suspect a bug in the encoder or the decoder.
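The round trip can be wrapped in a small function. This is a sketch under two assumptions that this page does not confirm: that `ze bgp encode` prints only the hex wire bytes on stdout, and that `ze bgp decode` accepts that output unchanged. `roundtrip_nlri` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: encode a route command, then decode the result.
# Assumes encode writes bare hex to stdout in a form decode accepts.
roundtrip_nlri() {
    family=$1
    route=$2
    wire=$(ze bgp encode --nlri "$family" "$route") || return 1
    [ -n "$wire" ] || return 1
    # Keep the intermediate bytes visible (on stderr) for manual comparison.
    printf 'wire: %s\n' "$wire" >&2
    ze bgp decode --nlri "$family" "$wire"
}

# Usage:
#   roundtrip_nlri ipv4/unicast "announce route 10.0.0.0/24 next-hop 192.168.1.1"
```

The comparison step stays manual here: the decoded JSON should name the same prefix and next hop that went into the route command.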
ze show warnings and ze show errors are not log lines: they are state snapshots from the in-process report bus. If you have any doubt about whether something operational is happening right now, this is the first place to look.
ze show warnings | json
ze show errors | json

Full coverage is in operational reports.
Start the daemon with --pprof <addr:port> to enable the standard Go HTTP profiler.
ze --pprof 127.0.0.1:6060 config.conf

Then use go tool pprof to drive it.
go tool pprof -http :8080 'http://localhost:6060/debug/pprof/profile?seconds=30' # CPU
go tool pprof -http :8080 'http://localhost:6060/debug/pprof/heap' # Heap
go tool pprof -http :8080 'http://localhost:6060/debug/pprof/goroutine' # Goroutines

This is also the fastest way to see what the daemon is doing when it is hung: the goroutine profile shows every stack.
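On a host without the Go toolchain, the same endpoints can be read with plain HTTP. The sketch below relies on the standard Go `net/http/pprof` behavior, where `goroutine?debug=2` returns a plain-text dump of every stack; it assumes Ze exposes that handler unmodified. `pprof_url` and `save_goroutines` are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helpers around the standard Go net/http/pprof endpoints.

# Build a profile URL: pprof_url <addr:port> <profile> [query]
pprof_url() {
    url="http://$1/debug/pprof/$2"
    if [ -n "${3:-}" ]; then
        url="$url?$3"
    fi
    printf '%s\n' "$url"
}

# Save a plain-text dump of every goroutine stack to a file
# (default: goroutines.txt). Requires curl.
save_goroutines() {
    curl -fsS "$(pprof_url "$1" goroutine debug=2)" -o "${2:-goroutines.txt}"
}

# Usage:
#   save_goroutines 127.0.0.1:6060 hung-daemon.txt
```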
When the daemon is wedged and you want the full stack trace but did not start it with --pprof, ze signal quit dumps every goroutine stack to stderr and exits the process. This is the nuclear option: it kills the daemon. Use it when you have already collected the context you need and are willing to restart.
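Because the dump goes to stderr, it is only useful if stderr is being kept somewhere. A minimal sketch, assuming the daemon is started from a shell rather than under a service manager that already captures its output; `run_keep_stderr` is a hypothetical wrapper.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command with stderr appended to a file, so a
# later goroutine dump is not lost with the terminal.
# Usage: run_keep_stderr <logfile> <command> [args...]
run_keep_stderr() {
    log=$1
    shift
    "$@" 2>>"$log"
}

# Usage (log file name is a placeholder):
#   run_keep_stderr ze-stderr.log ze config.conf
```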
Functional tests run by ze-test can be debugged interactively.
ze-test bgp encode --server 0 # Start server test 0, let it wait
ze-test bgp plugin --client 3 # Start client test 3, let it wait

The test pauses at a known point, and you connect to it with a second shell to poke at the state. This is the right tool when a functional test is failing in CI but passing locally: run it interactively and watch the decoded messages as they go past.
The chaos framework produces NDJSON event logs that replay deterministically.
ze-chaos --event-log run.ndjson --seed <N> # Record the failing run
ze-chaos --replay run.ndjson # Reproduce it
ze-chaos --shrink run.ndjson > minimal.ndjson # Reduce to minimal repro

Commit the shrunk scenario as a test. The test stays in the suite even after you fix the bug.
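Before committing a shrunk scenario, it is worth checking that the shrink step produced something and did not grow the log. The sketch below treats the log purely as NDJSON (one JSON object per non-blank line) and makes no assumption about the event schema. Both function names are hypothetical helpers.

```shell
#!/bin/sh
# Hypothetical helpers for comparing an original and a shrunk event log.

# Count events: one per non-blank line of the NDJSON file.
ndjson_events() {
    awk 'NF { n++ } END { print n + 0 }' "$1"
}

# Succeed only if the shrunk log is non-empty and no longer than the original.
shrink_looks_sane() {
    before=$(ndjson_events "$1") || return 2
    after=$(ndjson_events "$2") || return 2
    [ "$after" -gt 0 ] && [ "$after" -le "$before" ]
}

# Usage:
#   shrink_looks_sane run.ndjson minimal.ndjson && echo "ok to commit"
```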
- Testing for the test suites that run every time.
- Logging for the operator view of logging.
- In-tree debugging tools for the full reference.
Adapted from main/docs/debugging-tools.md.
Unreviewed draft. This wiki was authored in bulk and has not been reviewed. File corrections on the issue tracker.