Summary
Add fault correlation/muting capability to FaultManager to handle "expected cascades" of faults.
Use cases (from this discussion: https://discordapp.com/channels/1451323858547118216/1459576596733104158):
- E-stop cascades: when the E-stop triggers, downstream motor/comm faults are expected noise
- OTA/restart transitions: temporary bursts of timeouts and comm errors during updates
- Degraded modes: intentionally disabled subsystems shouldn't keep surfacing known faults
Currently all faults are treated equally, making it hard to identify the actual root cause when multiple related faults fire simultaneously.
Proposed solution (optional)
Root Cause and Symptoms mapping
Instead of simple muting, define relationships between faults:
# 1. Define fault patterns (regex matching on diagnostic name/message)
fault_patterns:
  motor_low_power:
    name: "Motor.*"
    message: "Low Voltage|Power Loss"
  motor_comm_timeout:
    name: "Motor.*"
    message: "Timeout|No Response"

# 2. Define root causes with expected symptoms
root_causes:
  estop_pressed:
    name: "E-Stop Pressed"
    symptoms:
      - motor_low_power
      - motor_comm_timeout
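To make the intended behavior concrete, here is a rough Python sketch of how the mapping above could be evaluated. It is not FaultManager code: the `Fault` class, the `correlate` function, and the result layout are all hypothetical. It also assumes that a root cause is considered active when a fault with exactly that `name` is present, that `name` patterns must match the whole diagnostic name, and that `message` patterns may match anywhere in the message.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Fault:
    """Hypothetical minimal fault record: diagnostic name plus message."""
    name: str
    message: str


# In-memory form of the proposed config (same keys as the YAML above).
FAULT_PATTERNS = {
    "motor_low_power": {"name": r"Motor.*", "message": r"Low Voltage|Power Loss"},
    "motor_comm_timeout": {"name": r"Motor.*", "message": r"Timeout|No Response"},
}
ROOT_CAUSES = {
    "estop_pressed": {
        "name": "E-Stop Pressed",
        "symptoms": ["motor_low_power", "motor_comm_timeout"],
    },
}


def matches(pattern_id, fault):
    """Assumption: name must match fully, message may match anywhere."""
    pattern = FAULT_PATTERNS[pattern_id]
    return (
        re.fullmatch(pattern["name"], fault.name) is not None
        and re.search(pattern["message"], fault.message) is not None
    )


def correlate(faults):
    """Split active faults into root causes, expected symptoms, and the rest."""
    # Assumption: a root cause is active if a fault carries exactly its name.
    active = {
        cause_id: cause
        for cause_id, cause in ROOT_CAUSES.items()
        if any(f.name == cause["name"] for f in faults)
    }
    result = {"root_causes": [], "symptoms": {}, "unexpected": []}
    for fault in faults:
        if any(fault.name == cause["name"] for cause in active.values()):
            result["root_causes"].append(fault)
            continue
        # First active root cause that lists a pattern matching this fault.
        owner = next(
            (
                cause_id
                for cause_id, cause in active.items()
                if any(matches(p, fault) for p in cause["symptoms"])
            ),
            None,
        )
        if owner is not None:
            result["symptoms"].setdefault(owner, []).append(fault)
        else:
            result["unexpected"].append(fault)
    return result
```

With an active "E-Stop Pressed" fault, motor faults matching either pattern end up grouped under `estop_pressed`, while anything else lands in `unexpected`. Without the E-stop fault, the same motor faults are reported as unexpected, so nothing is silently dropped.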
Benefits over simple muting:
- Faults aren't hidden; they're contextualized as symptoms of a known root cause
- Unexpected faults (not matching any symptom) stand out as potentially real issues
Implementation location: FaultManager (centralized, full system context)
Additional context (optional)