📝 Walkthrough

This PR introduces a complete archive workflow feature enabling users to create, review, schedule, and execute data archival operations. It spans frontend UI components for archive management, backend REST API endpoints, database schema extensions for scheduling and execution modes, archive scheduling and execution logic, and comprehensive E2E and unit tests.
Sequence Diagrams

```mermaid
sequenceDiagram
actor User
participant Frontend as Frontend<br/>(Browser)
participant API as REST API<br/>(Backend)
participant DB as Database
participant Queue as Task Queue<br/>(Scheduler)
User->>Frontend: Create new archive
Frontend->>API: GET /metadata
API->>DB: Fetch instances, groups
DB-->>API: Metadata
API-->>Frontend: Groups, instances, methods
Frontend->>API: GET /approval-preview?groupId=X
API->>DB: Fetch approval chain
DB-->>API: Reviewers, audit settings
API-->>Frontend: Preview data
User->>Frontend: Submit archive form
Frontend->>API: POST /archive (one-time or scheduled)
API->>DB: Create ArchiveConfig
API->>DB: Create WorkflowAudit
DB-->>API: Success
API->>Queue: Schedule if scheduled mode
Queue-->>API: Schedule armed
API-->>Frontend: Archive ID, redirect
```

```mermaid
sequenceDiagram
actor DBA
participant Frontend as Frontend<br/>(Browser)
participant API as REST API<br/>(Backend)
participant DB as Database
participant Queue as Task Queue<br/>(Scheduler)
participant Archive as Archive<br/>Execution
DBA->>Frontend: Open archive detail
Frontend->>API: GET /archive/:id
API->>DB: Fetch ArchiveConfig + logs
DB-->>API: Detail + execution logs
API-->>Frontend: Display workflow state
DBA->>Frontend: Submit approval decision
Frontend->>API: POST /archive/:id/reviews
API->>DB: Update workflow status
alt If Approved & Scheduled
API->>DB: Calculate next_run_at
API->>Queue: Arm schedule
Queue-->>API: Schedule active
end
DB-->>API: Updated
API-->>Frontend: Success
alt If manual Run Now
DBA->>Frontend: Click run-now
Frontend->>API: POST /archive/:id/run
API->>Queue: Queue execution
Queue->>Archive: Execute archive task
Archive->>DB: Record execution log
DB-->>Archive: Log entry created
Archive-->>Queue: Done
Queue-->>API: Queued
API-->>Frontend: Success message
end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~55 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (warning)
```python
            action, request.user, data["audit_remark"]
        )
    except AuditException as exc:
        raise serializers.ValidationError({"errors": str(exc)})
```
Actionable comments posted: 6
🧹 Nitpick comments (6)
sql/models.py (1)
1021-1058: Consider model-level validation for `execution_mode='scheduled'` invariants.

When `execution_mode='scheduled'`, valid configurations require `schedule_time` (and `schedule_weekdays` when `schedule_frequency='weekly'`). Currently this is only enforced at runtime in `sql/archiver.py::calculate_next_archive_run`/`schedule_archive`, so invalid rows can be persisted and only fail later when the scheduler tries to arm them. Consider adding a `clean()` method (and/or `CheckConstraint`s) so that bad combinations are rejected at save time rather than surfacing as runtime exceptions.

Minor consistency nit: `schedule_weekdays` uses `default=""` while the other optional schedule fields use `null=True, default=None`. Aligning these would simplify "is unset?" checks.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@sql/models.py` around lines 1021 - 1058, add model-level validation to prevent invalid schedule combinations being saved: implement a clean() method on the model that checks when execution_mode == "scheduled" that schedule_time is set and, if schedule_frequency == "weekly", that schedule_weekdays is set (raise ValidationError on violations); optionally add database-level CheckConstraint(s) mirroring these invariants to guard at the DB level. Update save() to call full_clean() or rely on Django admin/forms to invoke clean(), and adjust the field definition of schedule_weekdays to use null=True, default=None (matching schedule_time/schedule_frequency) so "unset" checks are consistent. Reference symbols: execution_mode, schedule_time, schedule_weekdays, schedule_frequency, clean(), save(); note calculate_next_archive_run/schedule_archive currently enforce these at runtime and should rely on the model validation instead.
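The suggested `clean()` guard can be sketched without Django; `ValueError` stands in for `django.core.exceptions.ValidationError`, and the function name and constants below are illustrative, not the project's actual symbols:

```python
# Framework-free sketch of the clean() invariant the comment proposes.
# In the real model this logic would live in ArchiveConfig.clean() and
# raise django.core.exceptions.ValidationError instead of ValueError.

EXECUTION_SCHEDULED = "scheduled"
FREQ_WEEKLY = "weekly"

def validate_schedule_config(execution_mode, schedule_time=None,
                             schedule_frequency=None, schedule_weekdays=None):
    """Reject schedule combinations that would otherwise only fail later,
    when the scheduler tries to arm them."""
    if execution_mode != EXECUTION_SCHEDULED:
        return  # one-time archives carry no schedule invariants
    if not schedule_time:
        raise ValueError("scheduled archives require schedule_time")
    if schedule_frequency == FREQ_WEEKLY and not schedule_weekdays:
        raise ValueError("weekly schedules require schedule_weekdays")
```

A matching pair of `CheckConstraint`s in `Meta.constraints` would enforce the same invariants at the database level.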
sql_api/api_archive.py (2)

659-697: `get_queryset`/`get` do not paginate `None` safely and bypass `ListAPIView`'s standard flow.

Overriding `get(self, request)` on a `ListAPIView` and calling `self.paginate_queryset(...)` directly works, but skips the normal `list()` machinery (e.g., filter backends wired through `filter_queryset` continue to work because you call it explicitly, but any future addition of `filter_backends` or mixin behavior will silently be bypassed). Also, `paginate_queryset` returns `None` when pagination is disabled (e.g., client sends no page params and `CustomizedPagination` returns None). In that case `serializer = self.get_serializer(None, many=True)` will error and `self.get_paginated_response(...)` requires a paginator-paged result.

Two tidier options:

- Remove the custom `get` override entirely and rely on `ListAPIView.list` (just keep `get_queryset`; `_require_archive_module_access` can move to `initial`/`check_permissions` via a permission class).
- If you need the module-access gate, call `_require_archive_module_access(request.user)` at the start and then `return super().get(request, *args, **kwargs)`.

The same shape applies to `ArchiveLogList.get` at lines 1005–1009.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@sql_api/api_archive.py` around lines 659 - 697, The custom get override bypasses ListAPIView.list and can call paginate_queryset which may return None; remove the get method and let ListAPIView handle listing by keeping get_queryset as-is, or alternatively call _require_archive_module_access(request.user) then return super().get(request, *args, **kwargs) to preserve DRF's list flow; if you need the access gate earlier, move the check into initial()/check_permissions or implement a Permission class used by this view (also apply the same change to ArchiveLogList.get).
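The second option can be sketched as follows; the `ListAPIView` stub below is a minimal stand-in for `rest_framework.generics.ListAPIView` so the delegation pattern is runnable in isolation (the real base class runs the full `list()` flow, including the `paginate_queryset`-returns-`None` branch):

```python
# Sketch of "gate first, then defer to DRF's standard list flow".
# ListAPIView here is a stand-in; in the project it would be
# rest_framework.generics.ListAPIView, and the gate would raise
# a DRF PermissionDenied rather than PermissionError.

class ListAPIView:
    def get(self, request, *args, **kwargs):
        # Real DRF runs list(): filter_queryset -> paginate_queryset
        # (handling the None case) -> serialize -> respond.
        return {"status": 200, "via": "ListAPIView.list"}

def _require_archive_module_access(user):
    if not getattr(user, "can_use_archive", False):
        raise PermissionError("archive module access required")

class ArchiveList(ListAPIView):
    def get(self, request, *args, **kwargs):
        _require_archive_module_access(request.user)
        # Defer to the base class so pagination edge cases stay handled.
        return super().get(request, *args, **kwargs)
```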
783-795: Duplicated approve→arm-schedule logic between create and review paths (and the legacy Django view).

Lines 783–795 in `ArchiveListCreate.post` and lines 865–882 in `ArchiveReviewCreate.post` implement nearly the same sequence (mirror `workflow.status` from audit, flip `state` on PASSED, compute `next_run_at` for scheduled, save, then `schedule_archive` or `cancel_archive_schedule`). The legacy Django view `archive_audit` in `sql/archiver.py` (lines 766–781) replicates the same logic a third time. Any divergence (e.g., adding a `last_approved_by` field, adjusting cancel semantics) will require three coordinated edits and is an easy source of drift.

Consider extracting a helper (e.g. `apply_archive_audit_outcome(workflow, current_status)` in `sql/archiver.py`) that encapsulates:

- Setting `state` and `next_run_at` based on `current_status`.
- Saving with `update_fields`.
- Calling `schedule_archive`/`cancel_archive_schedule`.

Then call it from both DRF views and the legacy template view.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@sql_api/api_archive.py` around lines 783 - 795, Duplicate approve→schedule/cancel logic is repeated in ArchiveListCreate.post, ArchiveReviewCreate.post and archive_audit; extract a single helper (e.g. apply_archive_audit_outcome(workflow, current_status)) in sql/archiver.py that: sets workflow.status/state based on current_status, computes and assigns next_run_at when execution_mode == ARCHIVE_EXECUTION_SCHEDULED, saves the workflow using update_fields, and calls schedule_archive or cancel_archive_schedule as appropriate; then replace the duplicated blocks in ArchiveListCreate.post and ArchiveReviewCreate.post and the legacy archive_audit view to call this new helper with the workflow and audit.current_status.
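A framework-free sketch of the proposed helper; the `Workflow` dataclass and the injected `compute_next_run`/`arm`/`cancel` callables are stand-ins for `ArchiveConfig`, `calculate_next_archive_run`, `schedule_archive` and `cancel_archive_schedule`, so the shared sequence can be shown in one place:

```python
# Sketch of apply_archive_audit_outcome(); names and the injected
# callables are illustrative stand-ins for the project's real symbols.
from dataclasses import dataclass
from typing import Optional

PASSED = "passed"
SCHEDULED = "scheduled"

@dataclass
class Workflow:
    status: str = ""
    state: bool = False
    execution_mode: str = "once"
    next_run_at: Optional[str] = None

    def save(self, update_fields):
        pass  # stands in for Django's Model.save(update_fields=...)

def apply_archive_audit_outcome(workflow, current_status,
                                compute_next_run, arm, cancel):
    """Single place for the approve -> arm-schedule sequence that the
    two DRF views and the legacy template view currently duplicate."""
    workflow.status = current_status
    workflow.state = current_status == PASSED
    if workflow.state and workflow.execution_mode == SCHEDULED:
        workflow.next_run_at = compute_next_run(workflow)
        workflow.save(update_fields=["status", "state", "next_run_at"])
        arm(workflow)
    else:
        workflow.next_run_at = None
        workflow.save(update_fields=["status", "state", "next_run_at"])
        cancel(workflow)
```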
sql/archiver.py (2)

184-215: `schedule_archive` unconditionally flips `state=True`, and `cancel_archive_schedule` does not reset `state`.

`schedule_archive` writes `state=True` via `.update()` on lines 204–209 regardless of the caller's intent, and mutates `archive_info.state` in place. Today all call sites already intend `state=True`, but it couples "arm the schedule" with "enable the workflow" and bypasses model `save()` signals/validators. Conversely, `cancel_archive_schedule` (lines 213–215) only clears `next_run_at` and does not touch `state`, so a caller wishing to disable must remember to update state separately (as `archive_switch` and `archive_audit` currently do).

Consider making `schedule_archive` only set `next_run_at` (leaving `state` to the caller) and keeping `cancel_archive_schedule` symmetric; this would also remove the duplicate state writes in callers like `archive_switch` (lines 892–907) and `archive_audit` (lines 766–781).

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@sql/archiver.py` around lines 184 - 215, currently schedule_archive forces ArchiveConfig.state=True and mutates archive_info.state in-place while cancel_archive_schedule only clears next_run_at; make these symmetric by having schedule_archive only set next_run_at (remove the state=True in the ArchiveConfig.objects.filter(...).update(...) call and remove the archive_info.state assignment) so state is left to the caller, and keep cancel_archive_schedule as-is (only clearing next_run_at) so callers control enabling/disabling via their own state writes; reference functions: schedule_archive, cancel_archive_schedule, ArchiveConfig, and archive_info.
129-142: DELETE SQL built via string interpolation with a user-authored `WHERE` fragment.

Ruff S608 fires here, and while the trust model is explicit (requester drafts the condition, DBA approves), a few hardening notes are worth considering before merge:

- `sqlparse.split()` is not a security boundary. It is best-effort and has historically split some statements differently across versions; a crafted condition containing unusual comment/terminator sequences may not always produce exactly one statement. The follow-up regex check only asserts the statement starts with `DELETE FROM`, so an attacker-controlled condition that sqlparse still returns as a single statement (e.g., `1=1 UNION SELECT ... --`-style tricks, or nested comments) is not detected here — you are relying entirely on DBA review.
- `ARCHIVE_SAFE_IDENTIFIER_PATTERN` permits only `[A-Za-z0-9_$]`, which rejects legitimate quoted identifiers (e.g. MySQL backtick-escaped names, PG schema-qualified `schema.table`). Scheduled/DML archives against tables needing schema qualification or mixed-case identifiers will be blocked. Since `_execute_dml_archive` targets arbitrary source DBs via `query_engine.execute`, consider quoting via the engine's identifier quoting helper rather than a regex whitelist.
- `archive_info.src_db_name` is not validated/quoted here, but `query_engine.execute(db_name=...)` is called with it as a connection-level database selector; worth confirming the engine layer treats it safely and not via string interpolation.

Not blocking given the approval gate, but worth documenting the trust assumptions in the function docstring.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@sql/archiver.py` around lines 129 - 142, The build_archive_delete_sql function currently composes a DELETE via string interpolation (table name checked by ARCHIVE_SAFE_IDENTIFIER_PATTERN, condition from render_archive_condition) and then relies on sqlparse.split + a regex to validate it; instead, either use the query engine's identifier-quoting helper to quote/escape archive_info.src_table_name (replace the manual ARCHIVE_SAFE_IDENTIFIER_PATTERN check and interpolate the quoted identifier), ensure archive_info.src_db_name is validated/quoted or passed safely through the engine API (not string-interpolated), and harden the condition validation (reject any semicolons, comment terminators, and dangerous keywords like UNION/SELECT/INSERT/UPDATE/CREATE or otherwise parse an AST) rather than only using sqlparse.split; if you intentionally accept the current trust model, add a clear docstring on build_archive_delete_sql documenting the DBA-review trust assumptions and the exact guarantees required from render_archive_condition and src_db_name.
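A hardened sketch of the builder along the lines suggested above; the token blacklist and backtick quoting are illustrative choices (not the project's actual `ARCHIVE_SAFE_IDENTIFIER_PATTERN` or engine quoting helper), and the DBA approval gate is still assumed as the final backstop:

```python
# Sketch only: reject obviously dangerous fragments before composing
# the DELETE. Real code should prefer the engine's identifier-quoting
# helper over a regex whitelist.
import re

SAFE_IDENTIFIER = re.compile(r"^[A-Za-z0-9_$]+$")
FORBIDDEN = re.compile(
    r";|--|/\*|\b(UNION|SELECT|INSERT|UPDATE|CREATE|DROP)\b",
    re.IGNORECASE,
)

def build_archive_delete_sql(table, condition):
    """Compose `DELETE FROM <table> WHERE <condition>` after rejecting
    semicolons, comment markers and statement-changing keywords."""
    if not SAFE_IDENTIFIER.match(table):
        raise ValueError(f"unsafe table identifier: {table!r}")
    if FORBIDDEN.search(condition):
        raise ValueError("condition contains a forbidden token")
    return f"DELETE FROM `{table}` WHERE {condition}"
```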
frontend/src/views/ArchiveCreateView.vue (1)

273-290: Silent groupId reset on instance change can confuse the user.

When the user has already selected a resource group and then picks an instance that isn't associated with that group, `form.groupId` is wiped on line 277 without any feedback. The downstream effects are correct (approval preview is cleared by the `groupId` watcher), but from the user's perspective their group selection silently disappears. Consider either:

- Showing a small inline notice ("Resource group reset because it is not associated with the selected instance."), or
- Filtering the instance dropdown to only show instances compatible with the already-selected group (symmetric to how `eligibleGroups` narrows groups by instance).

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@frontend/src/views/ArchiveCreateView.vue` around lines 273 - 290, The code silently clears form.groupId when the newly selected instance isn't in eligibleInstances (see selectedGroupId, eligibleInstances, and form.groupId) which confuses users; either (A) surface an inline notice by setting a reactive flag (e.g., groupResetNotice) before clearing form.groupId and display that message in the template so users see "Resource group reset because it is not associated with the selected instance.", or (B) prevent the situation by filtering the instance selection to only show instances compatible with the currently selected group (reuse eligibleGroups/eligibleInstances logic to constrain the instance dropdown) so the user cannot pick an incompatible instance in the first place; update the groupId watcher/instance change handling (where form.groupId is set to '') to implement one of these approaches and clear the notice after user action.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@frontend/tests/e2e/support/workflow-helpers.ts`:
- Around line 170-183: The test currently polls the whole document body in
waitForArchiveExecutionState; narrow the selector to the execution-state badge
by reading a stable test id (e.g., the same pattern as "archive-detail-status"
or add a new data-testid like "archive-execution-state") instead of body. Update
waitForArchiveExecutionState to call pollArchiveDetail with a locator that
targets page.locator('[data-testid="archive-execution-state"]') (or the exact
test id you add) and check its textContent for the expected state; ensure
ArchiveDetailView renders that data-testid on the state badge so the test reads
that element only.
In `@sql_api/api_archive.py`:
- Around line 776-781: The code currently echoes raw AuditException text into
API responses in the audit creation paths (call to
audit_handler.create_audit()), which can leak internals; change both exception
handlers that catch AuditException to log the full exception server‑side (use
logger.exception(...)) and raise a serializers.ValidationError with a generic
client message like "Failed to create approval flow. Contact admin." and
re-raise using "from exc" to preserve the cause chain (e.g. raise
serializers.ValidationError({"errors": "Failed to create approval flow. Contact
admin."}) from exc). Ensure this pattern replaces the current uses of str(exc)
in the handlers around audit_handler.create_audit().
- Around line 902-929: The POST handler performs non-atomic checks and writes an
"Archive Queued" audit then enqueues work, creating a TOCTOU and
duplicate-enqueue race; fix by wrapping the permission/approval/state check plus
audit creation and enqueue in a DB transaction that locks the ArchiveConfig row
(use select_for_update on ArchiveConfig inside a transaction in post),
re-checking archive_config.status == WorkflowStatus.PASSED and
archive_config.state before calling Audit.add_log and queue_archive_execution;
additionally, ensure sql/archiver.archive() enforces a concurrency gate (e.g.,
check/set a pending/locked flag under select_for_update or return early if
already queued) so two quick "Run Now" requests cannot both proceed—refer to the
post method, queue_archive_execution, ArchiveConfig, Audit.add_log and the
archive() function in sql/archiver.py when making these changes.
In `@sql/archiver.py`:
- Around line 460-494: The callback currently always reschedules a scheduled
archive even when task.success is False and also doesn't handle
ArchiveConfig.DoesNotExist from ArchiveConfig.objects.get(id=archive_id); change
archive_task_callback so that after fetching archive_info (wrap
ArchiveConfig.objects.get in try/except DoesNotExist and log/return) you
short-circuit rescheduling when the run was scheduled and failed (i.e., if
trigger == "scheduled" and not task.success, do not call
calculate_next_archive_run/schedule_archive), and additionally add one of the
suggested failure-handling actions: increment a failure counter on ArchiveConfig
(e.g., consecutive_failures) and disable scheduling after a threshold or call
notify_for_audit/another notifier when task.success is False to surface
operator-visible alerts; update references in ArchiveConfig
(state/status/next_run_at) and Audit.add_log usage accordingly.
- Around line 449-457: queue_archive_execution can enqueue duplicate archive
jobs because archive() releases its row lock before performing deletes; fix by
ensuring a strong idempotency gate is established inside archive() while still
holding the transaction used for select_for_update: for example, within the same
transaction that runs the select_for_update in archive() create or update a
short-lived ArchiveRun row (unique per archive_id) or set a dedicated running
flag/state on the Archive row before committing so a concurrent task will see
non-runnable state and abort; alternatively implement an advisory lock held for
the lifetime of archive() execution or add enqueue-time deduplication in
queue_archive_execution by checking for an existing in-progress
ArchiveRun—update the functions archive(), queue_archive_execution and any
Archive model logic accordingly so the guard is created atomically with the
state check.
In `@sql/models.py`:
- Around line 1086-1091: Change the model default for archive_method or add a
data migration so historical ArchiveLog rows reflect the API behavior: either
set archive_method = models.CharField(..., default="dml") in the model (replace
the current default "pt_archiver") or create a RunPython data migration in
migration 0011 that updates ArchiveLog.archive_method values to
ArchiveConfig.ARCHIVE_METHOD_DML (or other legacy-detection logic) instead of
blanket "pt_archiver"; ensure ArchiveConfig.ARCHIVE_METHOD_CHOICES and any code
referencing ARCHIVE_METHOD_DML remain consistent with the chosen approach.
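The check-and-set gate asked for in the run-now item above can be sketched framework-free; a `threading.Lock` plus a plain dict stand in for the database transaction and `select_for_update()` on the `ArchiveConfig` row, and the field names are illustrative:

```python
# Sketch of an atomic "check then enqueue" gate. In the real view this
# would be transaction.atomic() + ArchiveConfig.objects.select_for_update();
# the lock and dict here just make the pattern runnable in isolation.
import threading

_row_lock = threading.Lock()

def try_queue_run(row, enqueue):
    """Queue a run only if the archive is approved, enabled and not
    already pending; the whole check-and-set happens under the lock,
    closing the TOCTOU window between check and enqueue."""
    with _row_lock:  # stands in for select_for_update() in a transaction
        if row["status"] != "passed" or not row["state"] or row["pending"]:
            return False
        row["pending"] = True  # idempotency gate, cleared by the worker
    enqueue(row["id"])
    return True
```

Two quick "Run Now" clicks then race on the lock: the first marks the row pending and enqueues; the second sees `pending` and returns without enqueueing.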
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 77b0d411-8a21-4de2-b16d-a64ede76ff46
📒 Files selected for processing (16)
- frontend/src/App.vue
- frontend/src/lib/api.ts
- frontend/src/router/index.ts
- frontend/src/views/ArchiveCreateView.vue
- frontend/src/views/ArchiveDetailView.vue
- frontend/src/views/ArchivesView.vue
- frontend/tests/e2e/archive-workflow.spec.ts
- frontend/tests/e2e/support/workflow-helpers.ts
- sql/archiver.py
- sql/local_demo.py
- sql/migrations/0011_archiveconfig_archive_method_and_more.py
- sql/models.py
- sql/test_archiver.py
- sql_api/api_archive.py
- sql_api/tests.py
- sql_api/urls.py
```python
archive_method = models.CharField(
    "Archive Method",
    max_length=20,
    choices=ArchiveConfig.ARCHIVE_METHOD_CHOICES,
    default="pt_archiver",
)
```
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
fd -t f '0011_archiveconfig_archive_method_and_more.py' | xargs cat
```

Repository: jruszo/Datamingle
Length of output: 2680
🏁 Script executed:
```bash
# Search for ArchiveLog model definition to understand the context
rg "class ArchiveLog" -A 20
```

Repository: jruszo/Datamingle
Length of output: 3410
🏁 Script executed:
```bash
# Search for references to archive_method and archive execution patterns
rg "archive_method|pt_archiver|dml.*archive" -i --type py -B 2 -A 2
```

Repository: jruszo/Datamingle
Length of output: 16159
🏁 Script executed:
```bash
# Look for any data migration or comments about legacy behavior
find . -type f -name "*.py" -path "*/migrations/*" | xargs grep -l "ArchiveLog\|archive" | head -5
```

Repository: jruszo/Datamingle
Length of output: 156
🏁 Script executed:
```bash
# Search for any comments or documentation about archive execution
rg "pt.archiver|rendered.*dml|archive.*method" -i --type py | head -20
```

Repository: jruszo/Datamingle
Length of output: 1504
🏁 Script executed:
```bash
# Check for any data migration steps in migration 0011
cat sql/migrations/0011_archiveconfig_archive_method_and_more.py | grep -i "runpython\|runsql"
```

Repository: jruszo/Datamingle
Length of output: 43
🏁 Script executed:
```bash
# Examine the API serializer defaults more closely
rg "ARCHIVE_METHOD_DML|default.*=.*ARCHIVE" sql_api/api_archive.py -B 2 -A 2
```

Repository: jruszo/Datamingle
Length of output: 971
🏁 Script executed:
```bash
# Check if there are any comments or documentation about legacy archive behavior
rg "legacy|historical|backward|compat" -i sql/ --type py | grep -i archive
```

Repository: jruszo/Datamingle
Length of output: 215
Update the model default for `archive_method` to `"dml"` to align with API behavior, or add a data migration to intelligently handle historical records.

The migration backfills all pre-existing `ArchiveLog` rows with `default="pt_archiver"`, but the API serializer defaults to `ARCHIVE_METHOD_DML` and all execution code explicitly sets the method. This mismatch suggests historical archives likely used DML by default, not pt-archiver. Either update the model default to `"dml"` to match the API, or add a data migration (`RunPython`) in 0011 that intelligently backfills historical records based on legacy behavior rather than a blanket default.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@sql/models.py` around lines 1086 - 1091, Change the model default for
archive_method or add a data migration so historical ArchiveLog rows reflect the
API behavior: either set archive_method = models.CharField(..., default="dml")
in the model (replace the current default "pt_archiver") or create a RunPython
data migration in migration 0011 that updates ArchiveLog.archive_method values
to ArchiveConfig.ARCHIVE_METHOD_DML (or other legacy-detection logic) instead of
blanket "pt_archiver"; ensure ArchiveConfig.ARCHIVE_METHOD_CHOICES and any code
referencing ARCHIVE_METHOD_DML remain consistent with the chosen approach.
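The `RunPython`-style backfill can be sketched as a plain function over row dicts; the assumption that blanket-defaulted rows were really DML archives comes from the review, and in a real migration `apps.get_model("sql", "ArchiveLog")` plus a queryset `.update()` would replace the loop:

```python
# Sketch of the data-migration backfill the review suggests for 0011.
# Rows are plain dicts so the mapping logic can be tested in isolation;
# a real RunPython step would use apps.get_model and queryset.update().

LEGACY_DEFAULT = "pt_archiver"   # value the schema migration blanket-applied
ASSUMED_LEGACY_METHOD = "dml"    # what historical rows most likely used

def backfill_archive_method(rows):
    """Rewrite rows still carrying the blanket default to the assumed
    legacy method; rows with an explicit method are left untouched."""
    changed = 0
    for row in rows:
        if row.get("archive_method") == LEGACY_DEFAULT:
            row["archive_method"] = ASSUMED_LEGACY_METHOD
            changed += 1
    return changed
```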
Summary
Testing
- `docker exec datamingle-app python manage.py test sql_api.tests.ArchiveApiTests sql.test_archiver.ArchiveExecutionHelpersTest`
- `E2E_START_FRONTEND=1 npm run e2e:test -- archive-workflow.spec.ts`
- `E2E_START_FRONTEND=1 npm run e2e:test -- workflow-smoke.spec.ts`

Summary by CodeRabbit
Release Notes