
Document intentionally ignored wait_for_session_map_entry return value #4

Merged
JanusMarko merged 14 commits into main from claude/fix-duplicate-interactive-messages-FpKAU on Mar 2, 2026

Conversation

@JanusMarko
Owner

No description provided.

claude added 14 commits March 1, 2026 20:14
Add timestamp-based deduplication in handle_interactive_ui() to prevent
both JSONL monitor and status poller from sending new interactive messages
in the same short window. The check and the set have no await between
them, so the pair runs atomically within the asyncio event loop.

Also add a defensive check in status_polling.py to skip calling
handle_interactive_ui() when an interactive message is already tracked
for the user/thread (e.g. sent by the JSONL monitor path).
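
A minimal sketch of the timestamp check-and-set described above. The window length, the helper name `should_send_interactive`, and the injectable `now` parameter are assumptions for illustration; only `_last_interactive_send` is named in this PR, and its real shape may differ.

```python
import time
from typing import Dict, Optional, Tuple

# Assumed window length; the commit does not state the real value.
DEDUP_WINDOW_SECONDS = 2.0
_last_interactive_send: Dict[Tuple[int, Optional[int]], float] = {}


def should_send_interactive(
    user_id: int, thread_id: Optional[int], now: Optional[float] = None
) -> bool:
    """Return True and record the time unless a send happened within the window."""
    key = (user_id, thread_id)
    now = time.monotonic() if now is None else now
    last = _last_interactive_send.get(key)
    if last is not None and now - last < DEDUP_WINDOW_SECONDS:
        return False
    # No await sits between the check above and this write, so no other
    # asyncio task can interleave between them.
    _last_interactive_send[key] = now
    return True
```

Because the function never yields to the event loop, two callers racing for the same key cannot both observe "no recent send".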

https://claude.ai/code/session_016c4b8ioybZyscNayeY6Y18
…returning list snapshot

iter_thread_bindings() was a generator yielding from live dicts. Callers
with await between iterations (find_users_for_session, status_poll_loop)
could allow concurrent unbind_thread() calls to mutate the dict mid-iteration,
causing RuntimeError: dictionary changed size during iteration.

Fix: rename to all_thread_bindings() returning a materialized list snapshot.
The list comprehension captures all (user_id, thread_id, window_id) tuples
eagerly, so no live dict reference escapes across await points.

Changes:
- session.py: iter_thread_bindings -> all_thread_bindings, returns list
- bot.py, status_polling.py: update all 4 call sites
- Remove unused Iterator import from collections.abc
- Add tests: snapshot independence, returns list type, empty bindings
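
The snapshot approach can be sketched as follows. The `thread_bindings` table and its nesting are hypothetical stand-ins for the session state; only the function name and the tuple shape come from the commit message.

```python
from typing import Dict, List, Tuple

# Hypothetical binding table: user_id -> {thread_id: window_id}.
thread_bindings: Dict[int, Dict[int, str]] = {
    1: {10: "@1", 11: "@2"},
    2: {20: "@3"},
}


def all_thread_bindings() -> List[Tuple[int, int, str]]:
    """Return a materialized list so no live dict view escapes to the caller."""
    return [
        (user_id, thread_id, window_id)
        for user_id, bindings in thread_bindings.items()
        for thread_id, window_id in bindings.items()
    ]


snapshot = all_thread_bindings()
for user_id, thread_id, _window_id in snapshot:
    # Unbinding mid-loop is safe because the loop walks the snapshot,
    # not the dicts being mutated.
    thread_bindings[user_id].pop(thread_id, None)
```

Running the same loop directly over `bindings.items()` while popping entries would raise the RuntimeError quoted above.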

https://claude.ai/code/session_016c4b8ioybZyscNayeY6Y18
queue.join() in handle_new_message blocked the entire monitor loop while
waiting for one user's queue to drain. If Telegram was rate-limiting, this
could stall all sessions for 30+ seconds.

Fix: use enqueue_callable() to push interactive UI handling as a callable
task into the queue. The worker executes it in FIFO order after all pending
content messages, guaranteeing correct ordering without blocking.

Also fixes:
- Callable tasks silently dropped during flood control (the guard checked
  task_type != "content" which matched "callable" too; changed to explicit
  check for "status_update"/"status_clear" only)
- Updated stale docstring in _merge_content_tasks referencing queue.join()
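
To illustrate why enqueueing a callable preserves ordering without `queue.join()`, here is a simplified sketch. The task representation (plain strings and async callables) and the `None` sentinel are assumptions; the real queue carries richer task objects and is fed through `enqueue_callable()`.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Union

# Assumed simplification: a task is a content string or an async callable.
Task = Union[str, Callable[[], Awaitable[None]]]


async def worker(queue: asyncio.Queue, log: list) -> None:
    """Drain the queue in FIFO order; a None sentinel stops the worker."""
    while True:
        task: Optional[Task] = await queue.get()
        if task is None:
            return
        if callable(task):
            await task()  # runs only after every task enqueued before it
        else:
            log.append(f"content:{task}")


async def demo() -> list:
    log: list = []
    queue: asyncio.Queue = asyncio.Queue()

    async def send_interactive_ui() -> None:
        log.append("interactive_ui")

    for item in ("a", "b", send_interactive_ui, "c", None):
        queue.put_nowait(item)  # the producer returns at once; no join()
    await worker(queue, log)
    return log


order = asyncio.run(demo())
```

The producer never waits on the queue, so a slow consumer for one user cannot stall the loop that feeds other users.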

https://claude.ai/code/session_016c4b8ioybZyscNayeY6Y18
unpin_all_forum_topic_messages was used every 60s to detect deleted topics,
but it destructively removed all user-pinned messages as a side effect.

Replace with send_chat_action(ChatAction.TYPING) which is ephemeral
(5s typing indicator) and raises the same BadRequest("Topic_id_invalid")
for deleted topics. All existing error handling works unchanged.

https://claude.ai/code/session_016c4b8ioybZyscNayeY6Y18
- Add MAX_TASK_RETRIES=3 retry loop for short RetryAfter (sleep and retry)
- Re-queue tasks on long RetryAfter (>10s) with MAX_REQUEUE_COUNT=5 cap
- Convert callable_fn from Coroutine to Callable factory (coroutines are
  single-use; retry requires a fresh coroutine each attempt)
- Catch RetryAfter from _check_and_send_status to prevent cosmetic status
  updates from triggering content message re-sends
- Fix test isolation: clear _last_interactive_send in test fixtures
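
The coroutine-versus-factory point can be sketched like this. The local `RetryAfter` class is a stand-in for the Telegram error, `run_with_retries` is a hypothetical helper, and reading `MAX_TASK_RETRIES` as "three attempts in total" is an assumption.

```python
import asyncio
from typing import Awaitable, Callable

MAX_TASK_RETRIES = 3  # interpreted here as total attempts (assumption)


class RetryAfter(Exception):
    """Local stand-in for the Telegram rate-limit error."""

    def __init__(self, retry_after: float) -> None:
        super().__init__(f"retry after {retry_after}s")
        self.retry_after = retry_after


async def run_with_retries(callable_fn: Callable[[], Awaitable[str]]) -> str:
    """Call the factory on each attempt so every retry awaits a fresh coroutine."""
    for attempt in range(1, MAX_TASK_RETRIES + 1):
        try:
            return await callable_fn()
        except RetryAfter as exc:
            if attempt == MAX_TASK_RETRIES:
                raise
            await asyncio.sleep(exc.retry_after)
    raise AssertionError("unreachable")


calls = {"n": 0}


async def flaky_send() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise RetryAfter(0.01)
    return "sent"


result = asyncio.run(run_with_retries(flaky_send))
```

Passing an already-created coroutine object instead of `flaky_send` would fail on the second attempt, since a coroutine cannot be awaited twice.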

https://claude.ai/code/session_016c4b8ioybZyscNayeY6Y18
The _file_mtimes dict used mtime+size to skip unchanged JSONL files, but
this introduced edge cases (sub-second writes, clock skew, file replacement).
For append-only JSONL files, comparing file size against last_byte_offset is
sufficient and eliminates all mtime-related issues.
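
A sketch of the size-versus-offset check for an append-only file. The function name `read_new_lines` is hypothetical, and stopping at the last newline (so a half-written line is retried later) is an added assumption, not something the commit states.

```python
import os
import tempfile
from typing import List, Tuple


def read_new_lines(path: str, last_byte_offset: int) -> Tuple[List[str], int]:
    """Return complete lines appended since last_byte_offset, plus the new offset."""
    if os.path.getsize(path) <= last_byte_offset:
        return [], last_byte_offset  # nothing appended; skip reading
    with open(path, "rb") as f:
        f.seek(last_byte_offset)
        data = f.read()
    # Consume only up to the last newline; a partial line stays unread.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8").splitlines()
    return lines, last_byte_offset + end


fd, demo_path = tempfile.mkstemp(suffix=".jsonl")
os.close(fd)
with open(demo_path, "wb") as f:
    f.write(b'{"a":1}\n')
first = read_new_lines(demo_path, 0)
unchanged = read_new_lines(demo_path, first[1])
with open(demo_path, "ab") as f:
    f.write(b'{"b":2}\n')
second = read_new_lines(demo_path, first[1])
os.remove(demo_path)
```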

https://claude.ai/code/session_016c4b8ioybZyscNayeY6Y18
Previously byte offsets were persisted to disk BEFORE delivering messages
to Telegram. If the bot crashed after save but before delivery, messages
were silently lost. Now offsets are saved AFTER the delivery loop,
guaranteeing at-least-once delivery: a crash before save means messages
are re-read and re-delivered on restart (safe duplicate) rather than
permanently lost.
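
The ordering change can be sketched as below. `deliver_then_save`, the `deliver` callback, and the `save_offset` callback are hypothetical names; the point is only that the offset is persisted after the delivery loop.

```python
import asyncio
from typing import Awaitable, Callable, Iterable


async def deliver_then_save(
    messages: Iterable[str],
    new_offset: int,
    deliver: Callable[[str], Awaitable[None]],
    save_offset: Callable[[int], None],
) -> None:
    for message in messages:
        await deliver(message)
    # Persist only after every message was handed off. A crash above
    # leaves the old offset in place, so the batch is re-read on restart.
    save_offset(new_offset)


saved: list = []
delivered: list = []


async def flaky_deliver(message: str) -> None:
    if message == "crash":
        raise ConnectionError("simulated crash mid-delivery")
    delivered.append(message)


async def demo() -> None:
    try:
        await deliver_then_save(["m1", "crash"], 42, flaky_deliver, saved.append)
    except ConnectionError:
        pass  # offset was never saved
    # Simulated restart: the same batch is read again and fully delivered.
    await deliver_then_save(["m1", "m2"], 42, flaky_deliver, saved.append)


asyncio.run(demo())
```

In this run "m1" is delivered twice, which is the safe duplicate the commit message accepts in exchange for never losing a message.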

https://claude.ai/code/session_016c4b8ioybZyscNayeY6Y18
Dead sessions were cleaned from persistent state but never from the
in-memory _pending_tools dict, causing a slow memory leak over time.
Add pop() calls in both cleanup paths (startup + runtime).

https://claude.ai/code/session_016c4b8ioybZyscNayeY6Y18
Previously _pending_thread_text was cleared from user_data BEFORE
attempting to send it to the tmux window. If send_to_window() failed,
the message was lost and the user had to retype it. Now the pending
text is only cleared after a successful send.

https://claude.ai/code/session_016c4b8ioybZyscNayeY6Y18
Typing indicators in forum topics were silently failing because
message_thread_id was not passed to send_chat_action calls. Users
in forum topics wouldn't see typing indicators while Claude worked.

https://claude.ai/code/session_016c4b8ioybZyscNayeY6Y18
The except Exception handler was catching RetryAfter (Telegram 429
rate limiting) and BadRequest("message is not modified"), preventing
proper rate limit propagation and causing unnecessary duplicate
message sends.

Changes:
- Re-raise RetryAfter in both edit and send paths so the queue
  worker retry loop can handle rate limiting correctly
- Treat BadRequest "is not modified" as success (content identical)
- For other BadRequest errors (message deleted, too old), delete
  orphan message before falling through to send new
- Log exception details in catch-all handler for debugging
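
The handler ordering listed above might look roughly like this. The exception classes are local stand-ins for the Telegram errors, and `edit_or_send` with its three callbacks is a hypothetical shape, not the project's actual function.

```python
import asyncio
import logging
from typing import Awaitable, Callable

logger = logging.getLogger(__name__)


class RetryAfter(Exception):
    """Stand-in for the Telegram rate-limit error (assumption)."""


class BadRequest(Exception):
    """Stand-in for the Telegram bad-request error (assumption)."""


Action = Callable[[], Awaitable[None]]


async def edit_or_send(edit: Action, delete: Action, send: Action) -> str:
    try:
        await edit()
        return "edited"
    except RetryAfter:
        raise  # let the queue worker's retry loop handle rate limiting
    except BadRequest as exc:
        if "is not modified" in str(exc):
            return "unchanged"  # identical content counts as success
        await delete()  # remove the orphan before sending a new message
    except Exception:
        logger.exception("edit failed; falling back to a new message")
    await send()  # a RetryAfter raised here propagates too
    return "sent"


def raising(exc: Exception) -> Action:
    async def action() -> None:
        raise exc
    return action


async def ok() -> None:
    return None


async def demo() -> list:
    results = [
        await edit_or_send(ok, ok, ok),
        await edit_or_send(raising(BadRequest("Message is not modified")), ok, ok),
        await edit_or_send(raising(BadRequest("Message to edit not found")), ok, ok),
    ]
    try:
        await edit_or_send(raising(RetryAfter("slow down")), ok, ok)
    except RetryAfter:
        results.append("rate_limited")
    return results


outcomes = asyncio.run(demo())
```

Placing the `RetryAfter` clause before the catch-all is what keeps rate-limit errors from being swallowed.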

https://claude.ai/code/session_016c4b8ioybZyscNayeY6Y18
When JSONL monitoring enqueues _send_interactive_ui, the callable may
execute after the interactive UI has been dismissed. Such stale
callables could send duplicate interactive messages.

Fix: introduce a monotonically incrementing generation counter per
(user_id, thread_id) key. Every state transition (set_interactive_mode,
clear_interactive_mode, clear_interactive_msg) increments the counter.
The JSONL monitor captures the generation at enqueue time and passes it
to handle_interactive_ui via expected_generation parameter. If the
generation has changed by execution time, the function bails out.

The status poller is unaffected (passes None, skipping the guard).
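
A minimal sketch of the generation guard. The helper names `bump_generation`, `current_generation`, and `should_handle` are hypothetical, and the real `handle_interactive_ui` is asynchronous; only the per-key counter and the `expected_generation` parameter come from the commit message.

```python
from typing import Dict, Optional, Tuple

Key = Tuple[int, Optional[int]]
_generation: Dict[Key, int] = {}


def bump_generation(user_id: int, thread_id: Optional[int]) -> int:
    """Called on every state transition (set, clear mode, clear message)."""
    key = (user_id, thread_id)
    _generation[key] = _generation.get(key, 0) + 1
    return _generation[key]


def current_generation(user_id: int, thread_id: Optional[int]) -> int:
    return _generation.get((user_id, thread_id), 0)


def should_handle(
    user_id: int,
    thread_id: Optional[int],
    expected_generation: Optional[int] = None,
) -> bool:
    """Return False when a queued callable has gone stale."""
    if expected_generation is None:
        return True  # status poller path: guard skipped
    return current_generation(user_id, thread_id) == expected_generation
```

The monitor would capture `current_generation(...)` at enqueue time and pass it along; any transition in between makes the comparison fail and the callable bails out.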

https://claude.ai/code/session_016c4b8ioybZyscNayeY6Y18
The second all_thread_bindings() call gets a fresh snapshot that
naturally excludes entries unbound by the topic probe loop above.
This is correct behavior, not a bug. Add a comment to clarify
the intent for future readers.

https://claude.ai/code/session_016c4b8ioybZyscNayeY6Y18
The return value was already handled correctly (proceed regardless),
but the ignored bool looked like a bug. Add a comment explaining that
on timeout the monitor's 2s poll cycle picks up the entry, and thread
binding, pending text, and topic rename work without session_map.

https://claude.ai/code/session_016c4b8ioybZyscNayeY6Y18
@JanusMarko merged commit 3d4e0fa into main on Mar 2, 2026