[phase-31 4/4] PostgreSQL metastore — migration + compaction columns#6245
quickwit/quickwit-metastore/migrations/postgresql/27_add-compaction-metadata.up.sql
quickwit/quickwit-metastore/src/metastore/file_backed/file_backed_index/mod.rs
quickwit/quickwit-metastore/src/metastore/file_backed/file_backed_index/mod.rs
quickwit/quickwit-metastore/src/metastore/postgres/metastore.rs
- Migration 27: add maturity_timestamp, delete_opstamp, node_id columns and publish_timestamp trigger to match the splits table (Paul's review)
- ListMetricsSplitsQuery: adopt FilterRange<i64> for time_range (matching log-side pattern), single time_range field for both read and compaction paths, add node_id/delete_opstamp/update_timestamp/create_timestamp/mature filters to close gaps with ListSplitsQuery
- Use SplitState enum instead of stringly-typed Vec<String> for split_states
- StoredMetricsSplit: add create_timestamp, node_id, delete_opstamp, maturity_timestamp so file-backed metastore can filter on them locally
- File-backed filter: use FilterRange::overlaps_with() for time range and window intersection, apply all new filters matching log-side predicate
- Postgres: intersection semantics for window queries, FilterRange-based SQL generation for all range filters
- Fix InsertableMetricsSplit.window_duration_secs from Option<i32> to i32
- Rename two-letter variables (ws, sf, dt) throughout

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
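The move from a stringly-typed `Vec<String>` to a `SplitState` enum can be illustrated with a minimal sketch. Everything below is a local stand-in written for illustration: the variant names, the `as_str` helper, and the `MetricsSplitsQuery` struct are assumptions, not the definitions used in quickwit-metastore.

```rust
use std::str::FromStr;

/// Local stand-in for a split-state enum (variant names are assumptions).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SplitState {
    Staged,
    Published,
    MarkedForDeletion,
}

impl SplitState {
    /// String form, e.g. for building a `split_state = '...'` SQL predicate.
    pub fn as_str(&self) -> &'static str {
        match self {
            SplitState::Staged => "Staged",
            SplitState::Published => "Published",
            SplitState::MarkedForDeletion => "MarkedForDeletion",
        }
    }
}

impl FromStr for SplitState {
    type Err = String;

    /// Parsing is the single place where an invalid state name can surface.
    fn from_str(input: &str) -> Result<Self, Self::Err> {
        match input {
            "Staged" => Ok(SplitState::Staged),
            "Published" => Ok(SplitState::Published),
            "MarkedForDeletion" => Ok(SplitState::MarkedForDeletion),
            other => Err(format!("unknown split state `{other}`")),
        }
    }
}

/// Hypothetical query type holding typed states instead of raw strings.
#[derive(Debug, Default)]
pub struct MetricsSplitsQuery {
    pub split_states: Vec<SplitState>,
}

impl MetricsSplitsQuery {
    pub fn with_split_state(mut self, state: SplitState) -> Self {
        self.split_states.push(state);
        self
    }

    /// An empty filter matches every state (an assumption of this sketch).
    pub fn matches(&self, state: SplitState) -> bool {
        self.split_states.is_empty() || self.split_states.contains(&state)
    }
}
```

With typed states, a misspelled state name fails at the parsing boundary (or at compile time when written as a variant) rather than silently matching no rows.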
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f7905198f7
/// Update timestamp (Unix epoch seconds).
pub update_timestamp: i64,
/// Create timestamp (Unix epoch seconds).
pub create_timestamp: i64,
Add default for new create_timestamp field
StoredMetricsSplit now requires create_timestamp during deserialization, but previously persisted file-backed index JSON does not contain this key. When a node upgrades and reloads an index with existing metrics_splits, serde will fail to decode those entries, preventing metastore state from loading. Marking this field with a serde default (like the other newly added fields) is needed for backward-compatible reads.
Co-authored-by: Matthew Kim <matthew.kim@datadoghq.com>
Resolve merge conflicts by taking main's versions of otel_metrics.rs and arrow_metrics.rs (the PR didn't modify these files — conflicts came from the base branch divergence). Kept PR's table_config module export in quickwit-parquet-engine/src/lib.rs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pre-existing splits were serialized before the parquet_file field was added, so their JSON doesn't contain it. Adding #[serde(default)] makes deserialization fall back to empty string for old splits. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When the commit timeout fires and the accumulator contains only zero-column batches, union_fields is empty and concat_batches fails with "must either specify a row count or at least one column". Now flush_internal treats empty union_fields the same as empty pending_batches — resets state and returns None. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
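The guard described in this commit can be sketched as follows. This is a simplified model using only the standard library: `Batch`, `Accumulator`, and `flush` are hypothetical stand-ins for the real Arrow record batches and `flush_internal`, and the "concatenation" here just sums row counts.

```rust
/// Hypothetical stand-in for a record batch: column names plus a row count.
#[derive(Debug, Clone, PartialEq)]
pub struct Batch {
    pub columns: Vec<String>,
    pub num_rows: usize,
}

/// Hypothetical accumulator tracking pending batches and the union of their columns.
#[derive(Debug, Default)]
pub struct Accumulator {
    pending_batches: Vec<Batch>,
    union_fields: Vec<String>,
}

impl Accumulator {
    /// Queues a batch and records any column not seen before.
    pub fn push(&mut self, batch: Batch) {
        for column in &batch.columns {
            if !self.union_fields.contains(column) {
                self.union_fields.push(column.clone());
            }
        }
        self.pending_batches.push(batch);
    }

    /// Resets the state unconditionally, then returns `None` when there is
    /// nothing to concatenate: either no batches, or only zero-column batches.
    pub fn flush(&mut self) -> Option<Batch> {
        let batches = std::mem::take(&mut self.pending_batches);
        let fields = std::mem::take(&mut self.union_fields);
        if batches.is_empty() || fields.is_empty() {
            return None;
        }
        let num_rows = batches.iter().map(|batch| batch.num_rows).sum();
        Some(Batch { columns: fields, num_rows })
    }

    pub fn is_empty(&self) -> bool {
        self.pending_batches.is_empty() && self.union_fields.is_empty()
    }
}
```

Taking both vectors before the check means the early return also clears any zero-column batches, so the next flush starts from a clean state.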
- Resolve Cargo.lock/Cargo.toml merge conflicts
- P1 (sort column lookup): Already addressed by sort fields tag_ prefix fix; sort field names now match Parquet column names
- P2 (window_start at epoch 0): Remove time_range.start_secs > 0 guard so window_start is computed for all batches when window_duration > 0

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
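The P2 fix can be sketched as a small function. The alignment rule used here (flooring the batch start to a multiple of the window duration) is an assumption made for illustration, and the function name and signature are hypothetical; the point is only that a start time of 0 is no longer special-cased.

```rust
/// Returns the start of the window containing `start_secs`, or `None` when
/// windowing is disabled (`window_duration_secs <= 0`).
///
/// Assumption of this sketch: windows are aligned to multiples of the duration.
/// There is deliberately no `start_secs > 0` guard, so epoch 0 yields `Some(0)`.
pub fn window_start(start_secs: i64, window_duration_secs: i64) -> Option<i64> {
    if window_duration_secs <= 0 {
        return None;
    }
    // `rem_euclid` keeps the remainder non-negative, so timestamps before the
    // epoch also floor toward the earlier window boundary.
    Some(start_secs - start_secs.rem_euclid(window_duration_secs))
}
```

For example, with a one-hour window, `window_start(0, 3600)` and `window_start(3599, 3600)` both fall in the window starting at 0, while `window_start(3600, 3600)` starts the next one.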
- Resolve writer.rs conflict: keep META-07 self-describing roundtrip test
- P1 (create_timestamp serde): Add #[serde(default)] to StoredMetricsSplit.create_timestamp for backward-compatible reads of pre-existing file-backed index JSON
- P1 (compaction window overlap): No change needed; Bound::Included vs Bound::Excluded already handles half-open interval semantics correctly, and the edge case (zero duration) is impossible
- fields.rs: No change; Matt noted it resolves with wide schema rebase

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
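The half-open interval reasoning in the compaction window item can be sketched with `std::ops::Bound`. `RangeFilter` and `overlaps_window` below are hypothetical names written for this sketch, not the actual `FilterRange` API; timestamps are treated as integer seconds and a window covers `[window_start, window_start + window_duration_secs)`.

```rust
use std::ops::Bound;

/// Hypothetical range filter over integer timestamps (seconds).
#[derive(Debug, Clone, PartialEq)]
pub struct RangeFilter {
    pub start: Bound<i64>,
    pub end: Bound<i64>,
}

impl RangeFilter {
    /// Returns true when the filter shares at least one second with the
    /// half-open window `[window_start, window_start + window_duration_secs)`.
    /// A zero or negative duration is treated as a caller bug.
    pub fn overlaps_window(&self, window_start: i64, window_duration_secs: i64) -> bool {
        debug_assert!(window_duration_secs > 0);
        // Exclusive upper edge of the window.
        let window_end = window_start.saturating_add(window_duration_secs);
        let starts_before_window_end = match self.start {
            Bound::Unbounded => true,
            Bound::Included(start) => start < window_end,
            // The first second covered by the filter is `start + 1`.
            Bound::Excluded(start) => start.saturating_add(1) < window_end,
        };
        let ends_after_window_start = match self.end {
            Bound::Unbounded => true,
            Bound::Included(end) => end >= window_start,
            Bound::Excluded(end) => end > window_start,
        };
        starts_before_window_end && ends_after_window_start
    }
}
```

Because the window end is exclusive, a filter starting exactly at `window_end` does not match, so adjacent windows are never both selected by a query that touches only their shared boundary.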
Summary
Add compaction metadata to the PostgreSQL metastore with migration 27 and updated SQL queries (Phase 31 Metadata Foundation, PR 4 of 4).
Stacks on gtt/phase-31-writer-wiring (PR #6244).
What's included
Migration 27 (27_add-compaction-metadata.{up,down}.sql):
- window_start, window_duration_secs, sort_fields, num_merge_ops, row_keys, zonemap_regexes
- idx_metrics_splits_compaction_scope on (index_uid, sort_fields, window_start) WHERE split_state = 'Published'
stage_metrics_splits:
list_metrics_splits:
Bug fixes (pre-existing on upstream-10b-parquet-actors):
- StageMetricsSplitsRequestExt import
- index_id vs index_uid type mismatches in publish, mark, and delete functions
- IndexUid binding to sqlx (use .to_string())
- ListMetricsSplitsResponseExt::try_from_splits trait disambiguation
Verification
cargo build -p quickwit-metastore --features quickwit-metastore/postgres ✅
Test plan
make test-metrics-e2e
🤖 Generated with Claude Code