
fix: Matrix transport parity — dedup, message queue, drain-on-stop#103

Open
junlov wants to merge 12 commits into PleasePrompto:main from junlov:fix/matrix-transport-parity


junlov commented Apr 12, 2026

Summary

Brings the Matrix transport closer to parity with Telegram by adding three missing features and fixing a config merge bug:

  • Message dedup: Events replayed during Matrix sync were processed multiple times. Now uses DedupeCache (same as Telegram) keyed by event_id.
  • Pending task tracking: Messages arriving while a CLI session is active are now tracked per chat_id in a new MatrixMessageQueue class.
  • Drain on /stop: Previously, /stop killed the active CLI but queued message tasks immediately acquired the lock and started new sessions (bot appeared to "keep going"). /stop and /stop_all now drain all pending tasks first.
  • Transport merge bug: Sub-agents with transport=matrix were started as Telegram bots because merge_sub_agent_config didn't sync the transports list — a Pydantic validator reset transport back to "telegram" from the main agent's stale list.
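
The first three items can be pictured with a minimal sketch. This is not the PR's `MatrixMessageQueue`; the `DedupeCache` stand-in, the method names (`is_duplicate`, `track`, `drain`), and the room ID are assumptions made for illustration only.

```python
from __future__ import annotations

import asyncio
from collections import OrderedDict


class DedupeCache:
    """Bounded record of recently seen keys (hypothetical stand-in)."""

    def __init__(self, max_size: int = 1024) -> None:
        self._seen: OrderedDict[str, None] = OrderedDict()
        self._max_size = max_size

    def check_and_add(self, key: str) -> bool:
        """Return True if key was already seen; otherwise remember it."""
        if key in self._seen:
            return True
        self._seen[key] = None
        if len(self._seen) > self._max_size:
            self._seen.popitem(last=False)  # evict the oldest key
        return False


class MessageQueue:
    """Tracks pending message tasks per chat so /stop can cancel them."""

    def __init__(self) -> None:
        self._dedup = DedupeCache()
        self._pending: dict[str, set[asyncio.Task]] = {}

    def is_duplicate(self, event_id: str) -> bool:
        return self._dedup.check_and_add(event_id)

    def track(self, chat_id: str, task: asyncio.Task) -> None:
        tasks = self._pending.setdefault(chat_id, set())
        tasks.add(task)
        # Forget the task once it finishes, normally or by cancellation.
        task.add_done_callback(tasks.discard)

    async def drain(self, chat_id: str) -> int:
        """Cancel every pending task for chat_id and wait for them to exit."""
        tasks = list(self._pending.pop(chat_id, ()))
        for task in tasks:
            task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)
        return len(tasks)


async def demo() -> tuple[bool, bool, int]:
    queue = MessageQueue()
    first = queue.is_duplicate("$event1")   # first delivery
    replay = queue.is_duplicate("$event1")  # replayed during sync
    for _ in range(2):
        queue.track("!room:example.org", asyncio.create_task(asyncio.sleep(60)))
    drained = await queue.drain("!room:example.org")
    return first, replay, drained
```

Draining before the active CLI is killed means no queued task is left to acquire the lock afterwards, which is the "keeps going" symptom described above.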

Also adds:

  • Matrix support in ductor agents add (interactive wizard + CLI flags)
  • Provider-aware model selection in ductor agents add (shows actual codex/gemini models from cache, not just claude defaults)
  • Transport column in ductor agents list output
  • i18n strings for all new Matrix prompts (en)
  • 14 unit tests for MatrixMessageQueue

Files changed

| File | Change |
| --- | --- |
| messenger/matrix/message_queue.py | New: dedup, task tracking, drain |
| messenger/matrix/bot.py | Wire queue into event handlers and /stop |
| multiagent/models.py | Fix transports list in sub-agent merge |
| cli_commands/agents.py | Matrix support + provider-aware model select |
| i18n/en/cli.toml | New prompt strings for Matrix agent setup |
| tests/test_matrix_message_queue.py | New: 14 tests |

Test plan

  • MatrixMessageQueue unit tests (14 passing)
  • Manual: ductor agents add <name> interactive with Matrix transport
  • Manual: ductor agents add <name> --transport matrix --homeserver ... CLI mode
  • Manual: Send message in Matrix → /stop → verify no queued re-trigger
  • Manual: Verify Telegram agents still work unchanged

🤖 Generated with Claude Code

Panda and others added 12 commits April 12, 2026 09:59
Matrix bot was missing several features that Telegram has:

1. **Message dedup**: Events replayed during sync were processed multiple
   times. Now uses the same DedupeCache as Telegram keyed by event_id.

2. **Pending task tracking**: Messages arriving while a CLI session is
   active were spawned as independent asyncio tasks with no visibility.
   Now tracked per chat_id in MatrixMessageQueue.

3. **Drain on /stop**: When /stop killed the active CLI, queued message
   tasks immediately acquired the lock and started new sessions, making
   the bot appear to "keep going." /stop and /stop_all now drain all
   pending message tasks before aborting.

4. **Transport merge bug**: Sub-agents with transport=matrix were started
   as Telegram bots because merge_sub_agent_config didn't sync the
   `transports` list, and a Pydantic validator reset `transport` back
   to the first entry in the stale list.
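
The merge bug in item 4 can be reproduced in miniature. The sketch below uses a stdlib dataclass with `__post_init__` as a stand-in for the Pydantic validator; `AgentConfig`, `merge_buggy`, and `merge_fixed` are hypothetical names, not the code in `multiagent/models.py`.

```python
from __future__ import annotations

from dataclasses import dataclass, field, replace


@dataclass
class AgentConfig:
    transport: str = "telegram"
    transports: list[str] = field(default_factory=lambda: ["telegram"])

    def __post_init__(self) -> None:
        # Stand-in for the validator: `transport` always follows the
        # first entry of `transports`.
        if self.transports:
            self.transport = self.transports[0]


def merge_buggy(main: AgentConfig, overrides: dict) -> AgentConfig:
    # Only `transport` is overridden; the inherited `transports` list is
    # stale, so the validator resets `transport` to its first entry.
    return replace(main, **overrides)


def merge_fixed(main: AgentConfig, overrides: dict) -> AgentConfig:
    merged = dict(overrides)
    # Keep the list in sync with the overridden single transport.
    if "transport" in merged and "transports" not in merged:
        merged["transports"] = [merged["transport"]]
    return replace(main, **merged)


main_agent = AgentConfig()
buggy = merge_buggy(main_agent, {"transport": "matrix"})
fixed = merge_fixed(main_agent, {"transport": "matrix"})
```

Here `buggy.transport` ends up as `"telegram"` while `fixed.transport` stays `"matrix"`.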

Also adds:
- Matrix support in `ductor agents add` (interactive + CLI flags)
- Provider-aware model selection (shows codex/gemini models, not just claude)
- Transport column in `ductor agents list`
- 14 unit tests for MatrixMessageQueue

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add pytest-xdist to the test extras and default to `-n auto` so the
test suite runs across all CPU cores: 145s → 49s (~3x faster).

Add a justfile with `fix`, `check`, and `test` recipes using just's
[parallel] attribute to run linters, type checks, and tests concurrently.
This brings another roughly 20% speedup (62s → 52s).

Fix failing tests.

Update README with uv installation and just-based dev workflow.
- Add opencode_provider.py: OpenCodeCLI class implementing BaseCLI
- Add opencode_events.py: NDJSON stream parser for OpenCode output
- Add opencode to factory.py: provider selection branch
- Add opencode_cli_parameters to CLIParametersConfig and CLIServiceConfig
- Add check_opencode_auth to auth.py and _CHECKERS dict
- Add opencode to init_wizard.py: CLI detection and i18n
- Add opencode to orchestrator/core.py: both CLIServiceConfig instances
- Add opencode entry to en/wizard.toml i18n

Allows /model command to switch between claude, codex, gemini, and opencode CLIs.
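
For readers unfamiliar with NDJSON stream parsing, a minimal sketch follows. It is not the contents of `opencode_events.py`; the function name `parse_ndjson` and the event fields (`type`, `text`) are assumptions for illustration.

```python
import json
from collections.abc import Iterable, Iterator


def parse_ndjson(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per well-formed JSON object line."""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank lines between events carry no data
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or non-JSON output instead of aborting
        if isinstance(event, dict):
            yield event


# Hypothetical stream: two events, one blank line, one non-JSON log line.
stream = [
    '{"type": "text", "text": "hi"}\n',
    "\n",
    "not json\n",
    '{"type": "done"}\n',
]
events = list(parse_ndjson(stream))
```

Skipping malformed lines rather than raising keeps a single stray log line from ending the whole session; a stricter parser might prefer to surface the error.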
Gemini CLI installed via Homebrew (macOS) stores gemini-cli-core in
Cellar/<version>/libexec/lib/node_modules/ instead of the usual npm
layout.  Add _gemini_cellar_candidates() that walks up the directory
tree from the package root to find the Homebrew Cellar prefix and
discovers models.js there.

Discovery now finds all 7 models instead of 0:
- gemini-2.5-flash, gemini-2.5-flash-lite, gemini-2.5-pro
- gemini-3-flash-preview, gemini-3-pro-preview
- gemini-3.1-pro-preview, gemini-3.1-pro-preview-customtools
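
The walk-up described above can be sketched roughly as follows. This is an illustration, not the body of `_gemini_cellar_candidates()`; the function name `cellar_candidates`, the example prefix `/opt/homebrew`, and the version `1.2.3` are assumptions.

```python
from pathlib import PurePosixPath


def cellar_candidates(package_root: PurePosixPath) -> list[PurePosixPath]:
    """Return the node_modules dir under a Homebrew Cellar version dir, if any."""
    # Walk up from the package root. A version directory is one whose
    # grandparent is named "Cellar": <prefix>/Cellar/<formula>/<version>.
    for ancestor in (package_root, *package_root.parents):
        if ancestor.parent.parent.name == "Cellar":
            return [ancestor / "libexec" / "lib" / "node_modules"]
    return []


homebrew_root = PurePosixPath(
    "/opt/homebrew/Cellar/gemini-cli/1.2.3/libexec/lib/node_modules/@google/gemini-cli"
)
npm_root = PurePosixPath("/usr/lib/node_modules/@google/gemini-cli")
```

With these inputs, the Homebrew path yields one candidate directory and the plain npm path yields none, so the usual npm lookup would remain the only source there.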
- config.py: add opencode model prefixes (minimax/, kimi-for-coding/, opencode/)
  to provider_for() so model IDs route to opencode provider
- providers.py: add opencode to default_model_for_provider() -> minimax/MiniMax-M2.7
- providers.py: add opencode to resolve_session_directive() for @OpenCode directive
- model_selector.py: add opencode branch in _build_model_step() that
  immediately switches to default MiniMax model without showing a picker
- model_selector.py: add opencode to _handle_model_selected() switch so
  opencode models switch immediately (no reasoning effort picker)
- _build_model_step() now takes orch+key args to support async switch
- Remove redundant double-check in _OpenCodeServer.ensure_running()
  (mypy incorrectly flagged second check as unreachable)
- Add explicit None guards for process.stdout/stderr pipes
- Change send_streaming to yield events directly via async for
  delegation instead of returning a coroutine
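
The `send_streaming` change in the last bullet follows a common pattern, sketched here under assumptions: `_read_events` is a hypothetical event source, and the real method's signature is not shown in this PR text.

```python
import asyncio
from collections.abc import AsyncIterator


async def _read_events() -> AsyncIterator[str]:
    """Hypothetical event source standing in for the CLI's output stream."""
    for name in ("start", "delta", "done"):
        await asyncio.sleep(0)
        yield name


async def send_streaming() -> AsyncIterator[str]:
    # Async generators cannot use `yield from`, so delegation is an
    # explicit `async for` loop that re-yields each event.
    async for event in _read_events():
        yield event


async def collect() -> list[str]:
    return [event async for event in send_streaming()]
```

Because `send_streaming` is itself an async generator, callers can iterate it directly with `async for` instead of first awaiting a coroutine.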
The cherry-picked OpenCode PR added OpenCode to the provider auth
and model selection logic, but the root /model selector button list
was in the commit we intentionally skipped (which also removed Claude).
Add the OpenCode button without removing Claude.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Matrix startup now mirrors Telegram's full startup sequence:

- Restart sentinel handling (explicit /restart notifications)
- Startup kind detection (first_start, system_reboot, service_restart)
  with user-facing notifications for first start and reboot
- Recovery of interrupted work (in-flight turns, named sessions)
- Restart marker still works as before

Previously Matrix only notified on manual restart markers, missing
first-start, reboot, and recovery notifications entirely.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ault

The OpenCode PR hardcoded `minimax/MiniMax-M2.7` as the default model,
but that model doesn't exist (ProviderModelNotFoundError). Now uses the
configured model when provider is already opencode, falling back to
`opencode/kimi-k2.5` otherwise — matching the pattern used by Claude.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
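
A minimal sketch of the fallback described in this commit, combined with the prefix routing from the earlier `provider_for()` commit. The function bodies, the `None` return for unmatched models, and the example model ID `kimi-for-coding/k2` are assumptions; only the prefixes and `opencode/kimi-k2.5` come from the commit messages.

```python
from typing import Optional

# Prefixes listed in the commit that routes model IDs to opencode.
OPENCODE_PREFIXES = ("minimax/", "kimi-for-coding/", "opencode/")
OPENCODE_FALLBACK = "opencode/kimi-k2.5"


def provider_for(model: str) -> Optional[str]:
    """Return "opencode" for matching prefixes (other providers omitted)."""
    return "opencode" if model.startswith(OPENCODE_PREFIXES) else None


def resolve_opencode_model(configured_model: str) -> str:
    """Keep the configured model if it already belongs to opencode."""
    if provider_for(configured_model) == "opencode":
        return configured_model
    return OPENCODE_FALLBACK
```

For example, `resolve_opencode_model("kimi-for-coding/k2")` keeps the configured ID, while a model ID without an opencode prefix resolves to the fallback.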

4 participants