debugging

Thomas Mangin edited this page Apr 8, 2026 · 1 revision

Pre-Alpha. This page describes behavior that may change.

Debugging Ze is mostly a matter of knowing which tool to reach for. The daemon exposes several inspection surfaces (logs, event streams, config dumps, profilers), and the right one depends on what you are trying to see. This page lists the tools that pay off most often.

Quick reference

| Tool | What it is for |
| --- | --- |
| ze config dump | Inspect parsed configuration. |
| ze config validate | Check a config file without starting the daemon. |
| ze.log.* | Per-subsystem debug logging. |
| ze cli monitor event | Live event stream, filterable. |
| ze cli monitor bgp | Live peer dashboard. |
| ze show warnings, ze show errors | Operational report bus. |
| ze bgp decode | Decode raw BGP wire bytes into structured JSON. |
| ze bgp encode | Encode route commands into BGP wire bytes. |
| ze --pprof <addr:port> | Start the pprof HTTP profiler. |
| ze signal quit | Goroutine dump to stderr (kills the daemon). |
| ze-test --server N / --client N | Interactive functional test debugging. |

Turning the log up

Most Ze debugging starts with more log output. Subsystem names are hierarchical, so you can raise the level for just the area you care about.

ze -d config.conf                             # Base level debug
ZE_LOG_BGP_REACTOR=debug ze config.conf       # Just the reactor
ZE_LOG_BGP_REACTOR_PEER=debug ze config.conf  # Just peer lifecycle

At runtime, without a restart:

ze cli -c "bgp log set bgp.reactor debug"
ze cli -c "bgp log set bgp.fsm    debug"
ze cli -c "bgp log levels"

The hierarchical naming means bgp.reactor=debug turns on debug for every bgp.reactor.* child subsystem at once. Use this when you need to see everything a single area is doing.
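The inheritance rule can be modelled in a few lines of shell. This is a sketch of the lookup as described above, not Ze's implementation: the function name effective_level, the name=level settings format, the info fallback, and the rule that a more specific setting wins over its parent are all assumptions made for illustration.

```shell
# effective_level NAME "name=level name=level ..."
# Hypothetical helper: prints the level that applies to subsystem NAME by
# walking from NAME up through its dotted parents until a setting matches.
# Assumptions (not taken from Ze): nearest ancestor wins, default is "info".
effective_level() {
    _name=$1
    _settings=$2
    while :; do
        for _pair in $_settings; do
            if [ "${_pair%%=*}" = "$_name" ]; then
                printf '%s\n' "${_pair#*=}"
                return 0
            fi
        done
        case $_name in
            *.*) _name=${_name%.*} ;;   # drop the last component, try the parent
            *)   break ;;               # no parent left to try
        esac
    done
    printf '%s\n' "info"
}
```

With these assumptions, effective_level bgp.reactor.peer "bgp.reactor=debug" prints debug, while effective_level bgp.fsm "bgp.reactor=debug" falls through to the default.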

The live event stream

When you are chasing a runtime bug that depends on what a peer is sending, the event stream is usually the fastest way to see it.

ze cli monitor event peer upstream include update direction received
ze cli monitor event include state                  # All peer state changes
ze cli monitor event exclude keepalive              # Everything except heartbeats

Append | json for structured output, | match <regex> to filter lines, or | no-more to disable paging.

Inspecting parsed config

If the daemon is not doing what the config says, the first question is "did the config parse the way I think it did?". ze config dump reads the file, parses it, and prints what Ze actually sees.

ze config dump          config.conf
ze config dump --json   config.conf            # Structured output

This is where you catch typos that survived validation, misplaced blocks that landed under the wrong container, and values that inherited differently than you expected.
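One way to spot such differences is to diff the dump of the suspect file against the dump of a config you know to be good. The helper below is a sketch: config_dump_diff is a hypothetical name, and it assumes ze config dump writes a stable, line-oriented rendering to stdout, which this page does not confirm.

```shell
# config_dump_diff KNOWN_GOOD SUSPECT
# Hypothetical helper: compare what Ze parses from two config files.
# Assumes `ze config dump FILE` prints deterministic text on stdout.
config_dump_diff() {
    _old=$(mktemp) || return 2
    _new=$(mktemp) || { rm -f "$_old"; return 2; }
    ze config dump "$1" > "$_old"
    ze config dump "$2" > "$_new"
    diff -u "$_old" "$_new"      # exit status 0 means the parsed views match
    _rc=$?
    rm -f "$_old" "$_new"
    return "$_rc"
}
```

A block that landed under the wrong container then shows up as a moved hunk in the diff, even when both files pass validation.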

Decoding wire bytes

When you have hex bytes from a packet capture, a log, or a peer that sent something odd, ze bgp decode turns them into structured JSON.

ze bgp decode --update <hex>
ze bgp decode --open   <hex>
ze bgp decode --nlri ipv4/unicast <hex>

Every path attribute that Ze knows how to parse gets decoded. The output format is the same JSON envelope that shows up in the live event stream, which makes comparisons across the two surfaces straightforward.

Going the other way, ze bgp encode takes a human-readable route command and produces the wire bytes.

ze bgp encode --nlri ipv4/unicast "announce route 10.0.0.0/24 next-hop 192.168.1.1"

Round-tripping (encode, then decode, then compare) is a good sanity check when you suspect a bug in the encoder or the decoder.
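The round trip can be wrapped in a small shell function. This is a sketch under assumptions that are not confirmed here: that ze bgp encode prints only the hex string on stdout, and that ze bgp decode with the same --nlri family accepts that string unchanged. The name roundtrip_nlri is hypothetical.

```shell
# roundtrip_nlri FAMILY ROUTE
# Hypothetical helper: encode ROUTE, then feed the result back to the decoder.
# Assumes encode prints just the hex on stdout and decode accepts it as-is;
# check both against your Ze build before relying on this.
roundtrip_nlri() {
    _family=$1
    _route=$2
    _hex=$(ze bgp encode --nlri "$_family" "$_route") || return 1
    [ -n "$_hex" ] || return 1
    printf 'encoded: %s\n' "$_hex" >&2    # keep stdout clean for the JSON
    ze bgp decode --nlri "$_family" "$_hex"
}
```

For example, roundtrip_nlri ipv4/unicast "announce route 10.0.0.0/24 next-hop 192.168.1.1" should print decoded JSON that describes the same prefix you put in; a mismatch points at one of the two code paths.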

The operational report bus

ze show warnings and ze show errors are not log lines: they are state snapshots from the in-process report bus. If you are unsure whether an operational problem is active right now, look here first.

ze show warnings | json
ze show errors   | json

Full coverage is in operational reports.

Profiling with pprof

Start the daemon with --pprof <addr:port> to enable the standard Go HTTP profiler.

ze --pprof 127.0.0.1:6060 config.conf

Then use go tool pprof to drive it.

go tool pprof -http :8080 'http://127.0.0.1:6060/debug/pprof/profile?seconds=30'    # CPU
go tool pprof -http :8080 'http://127.0.0.1:6060/debug/pprof/heap'                  # Heap
go tool pprof -http :8080 'http://127.0.0.1:6060/debug/pprof/goroutine'             # Goroutines

If the daemon was started with --pprof, this is also the fastest way to see what a hung daemon is doing: the goroutine profile shows every stack.
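When all you need is the stacks as text, for instance to attach to a bug report, the goroutine endpoint can be fetched directly without go tool pprof. The sketch below relies on the debug=2 query parameter of Go's standard net/http/pprof handler and assumes Ze serves that handler unmodified; save_goroutines is a hypothetical name, and curl must be installed.

```shell
# save_goroutines ADDR OUTFILE
# Hypothetical helper: fetch a plain-text dump of every goroutine stack from
# a daemon started with --pprof ADDR, without stopping the daemon.
# debug=2 asks the standard Go pprof handler for full, human-readable stacks.
save_goroutines() {
    _addr=$1
    _out=$2
    curl -fsS --max-time 10 \
        "http://$_addr/debug/pprof/goroutine?debug=2" -o "$_out"
}
```

Calling save_goroutines 127.0.0.1:6060 stacks.txt leaves the daemon running, which makes it worth trying before anything more destructive.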

Goroutine dumps

When the daemon is wedged and you want the full stack trace without a running pprof, ze signal quit dumps every goroutine stack to stderr and exits the process. This is the nuclear option: it kills the daemon. Use it when you have already collected the context you need and are willing to restart.

Functional test debugging

Functional tests run by ze-test can be debugged interactively.

ze-test bgp encode --server 0         # Start server test 0, let it wait
ze-test bgp plugin --client 3         # Start client test 3, let it wait

The test pauses at a known point, and you connect to it with a second shell to poke at the state. This is the right tool when a functional test is failing in CI but passing locally: run it interactively and watch the decoded messages as they go past.

When the chaos framework catches a bug

The chaos framework produces NDJSON event logs that replay deterministically.

ze-chaos --event-log run.ndjson --seed <N>    # Record the failing run
ze-chaos --replay  run.ndjson                 # Reproduce it
ze-chaos --shrink  run.ndjson > minimal.ndjson  # Reduce to minimal repro

Commit the shrunk scenario as a test. The test stays in the suite even after you fix the bug.
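Before committing, it is worth checking that the shrink actually reduced the scenario. NDJSON stores one JSON document per line, so counting non-empty lines gives a rough event count. The helpers below are a sketch: ndjson_events and shrink_report are hypothetical names, and the assumption that every line of a ze-chaos log is one event (with no header lines) is not confirmed here.

```shell
# ndjson_events FILE
# Prints the number of non-empty lines in FILE, assumed to be one event each.
ndjson_events() {
    grep -c . "$1"
}

# shrink_report ORIGINAL SHRUNK
# Prints both counts and succeeds only if SHRUNK is no larger than ORIGINAL.
shrink_report() {
    _before=$(ndjson_events "$1")
    _after=$(ndjson_events "$2")
    printf '%s: %s events -> %s: %s events\n' "$1" "$_before" "$2" "$_after"
    [ "$_after" -le "$_before" ]
}
```

Running shrink_report run.ndjson minimal.ndjson after the shrink step gives a quick sanity check on the size of the file you are about to add to the suite.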

See also

Adapted from main/docs/debugging-tools.md.
