feat(wpp-retry): PR-1 — idempotency_key migration + silent dedup#78
Open
mt-alarcon wants to merge 2 commits into
Open
feat(wpp-retry): PR-1 — idempotency_key migration + silent dedup#78mt-alarcon wants to merge 2 commits into
mt-alarcon wants to merge 2 commits into
Conversation
…ps 1+2) Step 1 — Migration (models.py + app.py): - TriggerExecution ganha 3 colunas nullable: idempotency_key, error_category, last_replay_at - to_dict() exposto com os 3 campos novos - Auto-migrate idempotente no startup: ALTER TABLE + IF NOT EXISTS em cada bloco - Partial unique index uq_trigger_idem (trigger_id, idempotency_key) WHERE NOT NULL - Basic index ix_trigger_executions_idem_key para lookups por key - SQLite 3.51 confirmado — partial index nativo; EXPLAIN QUERY PLAN confirma uso do índice Step 2 — Silent dedup (triggers.py): - webhook_receiver extrai idem_key de idempotency_key / messageId / data.messageId - Se key já existe: log idempotent_replay + 200 OK silencioso (pattern F6) - Race condition: IntegrityError no db.commit() → rollback + 200 OK silencioso - Legado (GitHub, Stripe, Linear): sem key → idem_key=None → fluxo normal inalterado - Limpeza: current_app movido para import no topo; imports inline removidos Testes passados: migration up/down idempotente, partial index unicidade, NULLs livres, extração de key (6 casos), race condition via IntegrityError, EXPLAIN QUERY PLAN. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Reviewer's GuideImplements the first step of the WhatsApp retry pattern by adding idempotency-related columns and indexes to TriggerExecution, running an auto-migration at app startup, and wiring webhook handling to extract an idempotency key and silently deduplicate duplicate executions using both application logic and a DB-level unique constraint. Sequence diagram for webhook_receiver idempotent deduplicationsequenceDiagram
actor Source
participant Flask as webhook_receiver
participant TE as TriggerExecution
participant DB as db.session
Source->>Flask: POST /triggers/<id>/webhook_receiver
Flask->>Flask: extract idem_key from event_data
alt [idem_key present]
Flask->>TE: TriggerExecution.query.filter_by(trigger_id, idempotency_key)
TE-->>Flask: existing or None
alt [existing found]
Flask->>Flask: current_app.logger.info(evt=idempotent_replay)
Flask-->>Source: 200 OK
else [no existing]
Flask->>DB: add(TriggerExecution(..., idempotency_key))
Flask->>DB: commit()
alt [commit ok]
DB-->>Flask: success
Flask-->>Source: 200 OK (execution created)
else IntegrityError
DB-->>Flask: IntegrityError
Flask->>DB: rollback()
Flask->>Flask: current_app.logger.info(evt=idempotent_replay_race)
Flask-->>Source: 200 OK
end
end
else [no idem_key]
Flask->>DB: add(TriggerExecution(..., idempotency_key=None))
Flask->>DB: commit()
DB-->>Flask: success
Flask-->>Source: 200 OK (execution created)
end
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- The
except IntegrityErrorinwebhook_receiverwill treat any integrity violation as an idempotent replay; consider constraining this to the idempotency_key unique index (e.g., by checkingidem_keyis set and/or inspecting the error message) to avoid masking other DB issues. - You’re creating an explicit
ix_trigger_executions_idem_keyindex in the startup migration and also declareidempotency_keywithindex=Trueon the model; this will lead to two separate indexes—consider relying on just the partial unique index plus either the model-level index or the raw SQL one to avoid redundant indexes.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The `except IntegrityError` in `webhook_receiver` will treat any integrity violation as an idempotent replay; consider constraining this to the idempotency_key unique index (e.g., by checking `idem_key` is set and/or inspecting the error message) to avoid masking other DB issues.
- You’re creating an explicit `ix_trigger_executions_idem_key` index in the startup migration and also declare `idempotency_key` with `index=True` on the model; this will lead to two separate indexes—consider relying on just the partial unique index plus either the model-level index or the raw SQL one to avoid redundant indexes.
## Individual Comments
### Comment 1
<location path="dashboard/backend/routes/triggers.py" line_range="353-355" />
<code_context>
+ _cur.execute("ALTER TABLE trigger_executions ADD COLUMN last_replay_at TIMESTAMP")
+ _conn.commit()
+ # Basic index for idempotency lookups by key alone
+ try:
+ _cur.execute(
+ "CREATE INDEX IF NOT EXISTS ix_trigger_executions_idem_key "
</code_context>
<issue_to_address>
**issue (bug_risk):** Differentiate idempotency unique constraint violations from other IntegrityError cases
Catching all IntegrityError instances and treating them as idempotent replays can hide unrelated DB issues. Please either inspect the underlying exception/constraint to confirm it’s specifically the uq_trigger_idem violation, or at least log the full exception and re-raise for non-idempotency cases so real data problems aren’t returned as 200 OK.
</issue_to_address>
### Comment 2
<location path="dashboard/backend/app.py" line_range="630-638" />
<code_context>
+ _cur.execute("ALTER TABLE trigger_executions ADD COLUMN last_replay_at TIMESTAMP")
+ _conn.commit()
+ # Basic index for idempotency lookups by key alone
+ try:
+ _cur.execute(
+ "CREATE INDEX IF NOT EXISTS ix_trigger_executions_idem_key "
+ "ON trigger_executions (idempotency_key)"
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Avoid silently ignoring failures to create the partial unique index for idempotency
The except block currently swallows index-creation failures (e.g., unsupported SQLite version), leaving only best-effort app-level dedup and a race window where duplicates can slip through unnoticed. At minimum, log the failure (including SQLite version and error message) so it’s visible when the DB-level safeguard isn’t active, or consider failing the migration outright if partial indexes aren’t supported on the target runtime.
```suggestion
# Basic index for idempotency lookups by key alone
try:
_cur.execute(
"CREATE INDEX IF NOT EXISTS ix_trigger_executions_idem_key "
"ON trigger_executions (idempotency_key)"
)
_conn.commit()
except Exception as exc:
# Do not silently ignore failures to create this index; log enough context
# (including SQLite versions) so it's visible when the DB-level safeguard
# is not active and we are relying solely on app-level deduplication.
import logging
import sqlite3
library_version = getattr(sqlite3, "sqlite_version", None)
try:
runtime_version = _cur.execute("select sqlite_version()").fetchone()[0]
except Exception:
runtime_version = None
logger = logging.getLogger(__name__)
logger.warning(
"Failed to create ix_trigger_executions_idem_key index; "
"falling back to best-effort app-level idempotency only. "
"sqlite_library_version=%r sqlite_runtime_version=%r error=%r",
library_version,
runtime_version,
exc,
)
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
This was referenced May 13, 2026
mt-alarcon
pushed a commit
to mt-alarcon/evo-nexus
that referenced
this pull request
May 13, 2026
…ência Endereça review do Sourcery em PR evolution-foundation#78: Antes, `except IntegrityError` no `webhook_receiver` capturava qualquer violação de integridade e respondia 200 OK silencioso — tratando como "replay idempotente". Qualquer outro problema (NOT NULL, FK quebrada, novo unique constraint adicionado depois) seria mascarado como sucesso e o cliente nunca saberia que o evento não foi processado. Agora o catch só absorve o erro se: 1. `idem_key` está definido (sem chave de idempotência não há como ser violação de uq_trigger_idem) 2. A mensagem do erro do driver menciona `uq_trigger_idem` ou `idempotency_key` (constraint específica) Para os demais IntegrityError, loga com contexto completo (trigger_id, key, mensagem original) e re-raise. O webhook retornará 500 e o erro fica visível em logs em vez de virar 200 silencioso. Não há teste dedicado pro path nessa branch (testes WPP só entram no PR-2/PR-3); sintaxe validada com py_compile.
mt-alarcon
pushed a commit
to mt-alarcon/evo-nexus
that referenced
this pull request
May 13, 2026
…iggers Endereça review do Sourcery em PR evolution-foundation#80 (Comment 1) e estende a mesma proteção para o endpoint de replay (não citado pelo Sourcery, mas contém o mesmo bug latente — `except IntegrityError` genérico). Dois pontos corrigidos em dashboard/backend/routes/triggers.py: 1. `webhook_receiver` (linha ~389): mesmo bug do PR evolution-foundation#78. Capturava qualquer IntegrityError e respondia 200 OK silencioso. Agora só absorve se a mensagem do driver menciona `uq_trigger_idem` ou `idempotency_key` e `idem_key` está definido. Caso contrário, loga erro completo e re-raise. 2. Endpoint `/triggers/executions/<id>/replay` (linha ~476): cria nova execution reusando `ex.idempotency_key`. Tinha o mesmo catch genérico, mascarando outros IntegrityError como "idempotent_skip". Aplicada a mesma proteção, usando `ex.idempotency_key` no check. Sem teste dedicado pra esse path nessa branch — sintaxe validada com py_compile.
mt-alarcon
pushed a commit
to mt-alarcon/evo-nexus
that referenced
this pull request
May 13, 2026
Polish do review do Sourcery em PR evolution-foundation#78 (sugestões code-quality): 1. Remove `index=True` de `TriggerExecution.idempotency_key` em models.py. Estávamos criando 3 índices para a mesma coluna: - `ix_trigger_executions_idempotency_key` (auto-gerado por SQLAlchemy via index=True) - `ix_trigger_executions_idem_key` (raw SQL no startup) - `uq_trigger_idem` (partial unique, raw SQL) Mantemos só os dois últimos — explicitamente criados pela migration de startup, com nomes versionados no rollback plan. 2. Substitui `except Exception: pass` por log estruturado nos dois `CREATE INDEX` (basic e partial unique). Antes, qualquer falha (ex.: versão de SQLite sem suporte a partial index) era engolida silenciosamente — operador só descobriria no primeiro race. Agora loga `sqlite_lib`, `sqlite_runtime` e a exceção original. Sem mudança de comportamento em runtime saudável; só melhora observabilidade quando algo falha.
… índice + logs Consolidado dos fixes pedidos pelo Sourcery no review do PR evolution-foundation#78: 1. `except IntegrityError` restrito à violação de idempotência (dashboard/backend/routes/triggers.py) Antes, qualquer IntegrityError no `webhook_receiver` virava 200 OK silencioso — tratado como "replay idempotente". Outros problemas (NOT NULL, FK quebrada, novo unique constraint adicionado depois) ficariam mascarados como sucesso e o cliente nunca saberia que o evento não foi processado. Agora o catch só absorve o erro se: a) `idem_key` está definido (sem chave de idempotência não há como ser violação de uq_trigger_idem) b) A mensagem do erro do driver menciona `uq_trigger_idem` ou `idempotency_key` (constraint específica) Para os demais IntegrityError, loga com contexto completo (trigger_id, key, mensagem original) e re-raise — webhook retorna 500 e o erro fica visível em logs em vez de virar 200 silencioso. 2. Remove índice redundante (models.py) `TriggerExecution.idempotency_key` estava criando 3 índices pra mesma coluna: - `ix_trigger_executions_idempotency_key` (auto-gerado via index=True no model) - `ix_trigger_executions_idem_key` (raw SQL no startup) - `uq_trigger_idem` (partial unique, raw SQL) Mantemos só os dois últimos — explicitamente criados pela migration de startup, com nomes versionados no rollback plan. `index=True` removido da definição do model. 3. Loga falha de CREATE INDEX em vez de silenciar (app.py) Substitui `except Exception: pass` por log estruturado nos dois `CREATE INDEX` (basic e partial unique). Antes, qualquer falha (ex.: versão de SQLite sem suporte a partial index) era engolida silenciosamente — operador só descobriria no primeiro race. Agora loga `sqlite_lib`, `sqlite_runtime` e a exceção original. Sintaxe validada com `python3 -m py_compile`. Sem teste dedicado pro path nessa branch (testes WPP só entram nos PRs evolution-foundation#79/evolution-foundation#80).
db9e954 to
9c7cd5e
Compare
mt-alarcon
pushed a commit
to mt-alarcon/evo-nexus
that referenced
this pull request
May 13, 2026
… SQL, ApiError Consolidado dos fixes pedidos pelo Sourcery no review do PR evolution-foundation#80: 1. `except IntegrityError` restrito à violação de idempotência (dashboard/backend/routes/triggers.py — 2 pontos) Mesmo bug do PR evolution-foundation#78 estendido a este PR. Dois pontos no triggers.py: a) `webhook_receiver` (~linha 389): mesma correção do PR evolution-foundation#78 — só absorve o erro se a mensagem do driver menciona `uq_trigger_idem` ou `idempotency_key` E `idem_key` está set; caso contrário loga com contexto completo e re-raise. b) Endpoint `/triggers/executions/<id>/replay` (~linha 476): cria nova execution reusando `ex.idempotency_key`. Tinha o mesmo catch genérico, mascarando outros IntegrityError como "idempotent_skip". Aplicada a mesma proteção usando `ex.idempotency_key` no check. (Sourcery não citou este segundo ponto, mas é o mesmo bug — corrigido por simetria.) 2. `_classify_error` — tighten typing - Define alias `ErrorCategory = Literal["transient", "permanent"]`. - Atualiza return type pra refletir o que a função realmente retorna. - Atualiza comentário em `TriggerExecution.error_category` (models.py) que listava 4 categorias (validation/unknown) que nunca eram emitidas pelo classificador — consumidores não vão mais depender de valores inexistentes. 3. `trigger_stats` — parametriza SQL - `datetime('now', '-{days} days')` interpolado vira `:cutoff` bindado (cutoff calculado em Python com `timedelta`). - `WHERE trigger_id IN ({id_list})` vira `IN :wpp_ids` com `bindparam("wpp_ids", expanding=True)` — SQLAlchemy expande pra `(:id_1, :id_2, ...)` e binda cada valor. - Inputs hoje são "seguros" (ints + days constrained), mas parametrizar é mais robusto contra regressões e mais legível. 4. `confirmReplay` (Triggers.tsx) — error handling estruturado - Cria `ApiError` em `lib/api.ts` carregando `status` (número HTTP) e `code` (string do JSON do backend, ex.: `rate_limited`). - `buildError` agora retorna `ApiError` em vez de `Error` simples. - `Triggers.tsx::confirmReplay` checa `e instanceof ApiError && (e.status === 429 || e.code === 'rate_limited')` em vez de `msg.includes('429')`. Mais resiliente a mudanças no formato da mensagem (ex.: tradução, prefixo). - `.message` preservado no mesmo formato (`"<status> <text>: <detail>"`) pra não quebrar os 3 outros call sites que usam `msg.includes(...)` (Backups.tsx, StepBrainConnect.tsx, RestoreSelectRepo.tsx) — migração desses fica pra um PR dedicado. Sintaxe Python validada com `python3 -m py_compile`. Typecheck do frontend deixado pro CI do PR (deps não instaladas localmente).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Context
First of a 3-PR series implementing the WhatsApp retry pattern. Tracks an in-flight feature that adds idempotency to WhatsApp message delivery via the Evolution Go integration.
This PR establishes the foundation: schema migration for
idempotency_keyand silent dedup on the trigger handler. PR-2 (backoff + jitter) and PR-3 (DLQ + UI Replay + instrumentation) follow.Changes
dashboard/backend/models.py:idempotency_keycolumn totriggerstable (nullable, unique index)dashboard/backend/app.py:dashboard/backend/routes/triggers.py:Verification
Series
Summary by Sourcery
Add idempotency support and foundational retry metadata for trigger executions, enabling silent deduplication of duplicate WhatsApp webhook events.
New Features:
Enhancements: