Conversation
WalkthroughFrontend and backend components are enhanced to support PII redaction and anonymization. Frontend markdown rendering now processes and displays redacted tokens, with utility functions exposed for token escaping. Backend propagates anonymization preferences through system prompts and artifact generation, with conditional PII redaction blocks added across multiple locale-specific templates. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment Tip You can generate walkthrough in a markdown collapsible section to save space.Enable the |
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
echo/server/prompt_templates/get_reply_system.en.jinja (1)
1-24:⚠️ Potential issue | 🟠 MajorDrop the
keytermsexception in anonymized mode.The allow-list in Lines 37-38 is both undefined and unsafe: this template never exposes a
keytermssection, so the model cannot apply the rule deterministically, and if that input is wired in later it becomes a straight bypass for anonymization for any name/email/phone placed there. Remove the exception or constrain it to a separate non-PII allow-list, then mirror that fix across the copied blocks in the localized reply and artifact templates.🔒 Suggested change
- - Allow-list: if a detected PII token or exact multi-token phrase appears in keyterms, KEEP it as spoken (no redaction). + - Do not whitelist detected PII from project metadata or key terms. If a token matches a PII class, redact it.Also applies to: 37-38
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@echo/server/prompt_templates/get_reply_system.en.jinja` around lines 1 - 24, Remove the unsafe `keyterms` exception from the anonymization allow-list in the get_reply_system template: eliminate any branch or rule that treats a `keyterms` section as a bypass for PII removal, or alternatively replace it with a clearly-named non-PII allow-list (e.g., `non_pii_allowlist`) and enforce that only non-PII tokens are permitted; update the same change in the localized reply and artifact templates (the copied blocks that reference `keyterms`) so no template exposes an undefined or unrestricted `keyterms` allow-list that could bypass anonymization.echo/server/dembrane/reply_utils.py (2)
176-187:⚠️ Potential issue | 🟠 MajorPrompt-level redaction still leaks participant names.
When
is_anonymizedis true, we only add apii_redactioninstruction, but the current conversation and every adjacent conversation still serializeparticipant_nameverbatim. The model still sees real names and can echo them back, so anonymized replies stay best-effort. Replace names with<redacted_name>or drop the field beforeformat_conversation().Also applies to: 280-285, 299-307
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@echo/server/dembrane/reply_utils.py` around lines 176 - 187, The code builds Conversation objects with participant_name even when is_anonymized is true, so redact or remove participant_name before creating or serializing Conversations passed to format_conversation(); specifically, when is_anonymized is true set Conversation.name to "<redacted_name>" (or delete the field) for current_conversation and any adjacent conversations built the same way, and ensure build_conversation_transcript/format_conversation are not re-introducing real names. Update the Conversation constructors and the other similar blocks that construct adjacent conversations (the other Conversation-creation sites) to apply the same redaction.
124-164:⚠️ Potential issue | 🟠 MajorDon't block the event loop with Directus and token counting here.
generate_reply_for_conversation()is async, but thesedirectus.get_items/directus.create_itemcalls and repeatedtoken_counter(...)invocations still run synchronously on the request path. On large projects, this will pin the worker during prompt assembly and reply persistence. Push them behindawait run_in_thread_pool(...).As per coding guidelines "Always wrap blocking I/O calls using
run_in_thread_poolfromdembrane.async_helpersin backend code. Wrap calls todirectus.*... and CPU-heavy utilities like token counting or summary generation if they are sync."Also applies to: 237-255, 287-327, 470-477
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@echo/server/dembrane/reply_utils.py` around lines 124 - 164, generate_reply_for_conversation currently calls blocking functions (directus.get_items, directus.create_item and synchronous token_counter calls) directly on the async path; wrap those calls using await run_in_thread_pool(...) from dembrane.async_helpers so they execute off the event loop. Specifically, replace directus.get_items and any directus.create_item invocations inside generate_reply_for_conversation with calls executed via run_in_thread_pool, and wrap synchronous token_counter and other CPU-bound utilities used during prompt assembly and reply persistence similarly. Also apply the same wrapping to the other occurrences in this module where directus.* or token_counter are called (e.g., the blocks that assemble prompts and persist replies), ensuring you import run_in_thread_pool and keep call signatures consistent when moving the sync call into the thread pool.echo/server/dembrane/api/verify.py (1)
744-775:⚠️ Potential issue | 🟠 MajorDon't pass the participant name into anonymized artifact prompts.
is_anonymizedonly redacts email here. We still injectparticipant_nameunchanged, which lets the model reuse a real name despite the newpii_redactionguard. Redact or omit it alongside email.🐛 Minimal fix
participant_name = conversation.get("participant_name") if participant_name: - parts.append(f"Participant name: {participant_name}") + parts.append( + f"Participant name: {'<redacted_name>' if is_anonymized else participant_name}" + )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@echo/server/dembrane/api/verify.py` around lines 744 - 775, The function _build_user_message_content currently only redacts participant_email but still injects participant_name; when is_anonymized is True you must redact or omit participant_name as well (e.g., use '<redacted_name>' or skip adding the "Participant name" line). Update the logic around participant_name in _build_user_message_content to respect the is_anonymized flag and also ensure any previous-artifact rendering via _format_previous_artifacts does not reintroduce the real name (pass a redaction flag into _format_previous_artifacts or ensure it performs its own redaction).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@echo/frontend/src/components/common/RedactedText.tsx`:
- Around line 5-33: The current escape/unescape uses the public inline-code form
"redacted:" and causes collisions; change to a collision-safe internal sentinel
and update the related regexes and functions: replace REDACTED_CODE_PREFIX with
a non-colliding sentinel string (e.g. include unlikely characters or a GUID-like
token) and update SAFE_REDACTED_PATTERN to match that sentinel, then update
escapeRedactedTokens to emit `\`<sentinel><type>\`` and unescapeRedactedTokens
to reverse only that sentinel pattern; ensure REDACTED_PATTERN (the original
incoming token regex) and the replacement logic in escapeRedactedTokens and
unescapeRedactedTokens are consistent so normal user-authored code like
`redacted:name` is no longer transformed.
In `@echo/server/prompt_templates/generate_artifact.en.jinja`:
- Around line 20-25: The PII redaction block references an allow-list via
`keyterms` but the template never receives `keyterms`, making the rule
non-deterministic; either remove the allow-list clause or accept and use a
`keyterms` context variable in this template so the rule can be applied
deterministically. Update the `PII Redaction (MANDATORY)` block: if you choose
to keep the allow-list, add `keyterms` to the template context and reference it
wherever the allow-list is checked; otherwise delete the sentence that preserves
detected PII when in `keyterms` so all detected PII is consistently redacted.
Ensure references in the template to `keyterms` (the allow-list) match the exact
variable name you add to callers so behavior is deterministic.
In `@echo/server/prompt_templates/get_reply_system.en.jinja`:
- Around line 31-50: The PII redaction rules are duplicated across multiple
Jinja templates (get_reply_system.*.jinja and revise_artifact.en.jinja); extract
the common placeholder map and cleanup rules into a single shared Jinja macro or
data file (e.g., redaction_macros.jinja providing a macro or variables) and
replace the duplicated blocks in each template with an import/use of that macro
(use {% import %} or {% include %} and call the macro to render the rules),
leaving only the locale-specific prose inside each get_reply_system.*.jinja and
revise_artifact.en.jinja file; update references to the placeholders so
templates preserve existing <redacted_*> tokens exactly and keep the
allow-list/cleanup behavior inside the shared macro.
---
Outside diff comments:
In `@echo/server/dembrane/api/verify.py`:
- Around line 744-775: The function _build_user_message_content currently only
redacts participant_email but still injects participant_name; when is_anonymized
is True you must redact or omit participant_name as well (e.g., use
'<redacted_name>' or skip adding the "Participant name" line). Update the logic
around participant_name in _build_user_message_content to respect the
is_anonymized flag and also ensure any previous-artifact rendering via
_format_previous_artifacts does not reintroduce the real name (pass a redaction
flag into _format_previous_artifacts or ensure it performs its own redaction).
In `@echo/server/dembrane/reply_utils.py`:
- Around line 176-187: The code builds Conversation objects with
participant_name even when is_anonymized is true, so redact or remove
participant_name before creating or serializing Conversations passed to
format_conversation(); specifically, when is_anonymized is true set
Conversation.name to "<redacted_name>" (or delete the field) for
current_conversation and any adjacent conversations built the same way, and
ensure build_conversation_transcript/format_conversation are not re-introducing
real names. Update the Conversation constructors and the other similar blocks
that construct adjacent conversations (the other Conversation-creation sites) to
apply the same redaction.
- Around line 124-164: generate_reply_for_conversation currently calls blocking
functions (directus.get_items, directus.create_item and synchronous
token_counter calls) directly on the async path; wrap those calls using await
run_in_thread_pool(...) from dembrane.async_helpers so they execute off the
event loop. Specifically, replace directus.get_items and any
directus.create_item invocations inside generate_reply_for_conversation with
calls executed via run_in_thread_pool, and wrap synchronous token_counter and
other CPU-bound utilities used during prompt assembly and reply persistence
similarly. Also apply the same wrapping to the other occurrences in this module
where directus.* or token_counter are called (e.g., the blocks that assemble
prompts and persist replies), ensuring you import run_in_thread_pool and keep
call signatures consistent when moving the sync call into the thread pool.
In `@echo/server/prompt_templates/get_reply_system.en.jinja`:
- Around line 1-24: Remove the unsafe `keyterms` exception from the
anonymization allow-list in the get_reply_system template: eliminate any branch
or rule that treats a `keyterms` section as a bypass for PII removal, or
alternatively replace it with a clearly-named non-PII allow-list (e.g.,
`non_pii_allowlist`) and enforce that only non-PII tokens are permitted; update
the same change in the localized reply and artifact templates (the copied blocks
that reference `keyterms`) so no template exposes an undefined or unrestricted
`keyterms` allow-list that could bypass anonymization.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: b7d65812-545b-41ab-8b3f-7940d5778f41
📒 Files selected for processing (20)
echo/frontend/src/components/common/Markdown.tsxecho/frontend/src/components/common/RedactedText.tsxecho/frontend/src/components/form/MarkdownWYSIWYG/MarkdownWYSIWYG.tsxecho/frontend/src/components/participant/UserChunkMessage.tsxecho/frontend/src/locales/de-DE.poecho/frontend/src/locales/en-US.poecho/frontend/src/locales/es-ES.poecho/frontend/src/locales/fr-FR.poecho/frontend/src/locales/it-IT.poecho/frontend/src/locales/nl-NL.poecho/server/dembrane/api/verify.pyecho/server/dembrane/reply_utils.pyecho/server/prompt_templates/generate_artifact.en.jinjaecho/server/prompt_templates/get_reply_system.de.jinjaecho/server/prompt_templates/get_reply_system.en.jinjaecho/server/prompt_templates/get_reply_system.es.jinjaecho/server/prompt_templates/get_reply_system.fr.jinjaecho/server/prompt_templates/get_reply_system.it.jinjaecho/server/prompt_templates/get_reply_system.nl.jinjaecho/server/prompt_templates/revise_artifact.en.jinja
💤 Files with no reviewable changes (1)
- echo/frontend/src/components/participant/UserChunkMessage.tsx
Summary by CodeRabbit
Release Notes
New Features
Improvements