feat(sql): Postgres-style EXPLAIN (...) option list#21768

Draft
adriangb wants to merge 3 commits into apache:main from pydantic:explain-postgres-option-list

Which issue does this PR close?

  • Closes #.

(Follow-up to #21160, which introduced per-category metric filtering via session config. This PR lets users reach those knobs inline from the EXPLAIN statement.)

Rationale for this change

#21160 added metric categories (Rows, Bytes, Timing, Uncategorized) and a verbosity level (Summary, Dev) to DataFusion's metrics, exposed today only via session config:

  • datafusion.explain.analyze_categories
  • datafusion.explain.analyze_level

Users have to SET these out-of-band before running EXPLAIN ANALYZE, which is awkward for ad-hoc debugging. Postgres solves this with its parenthesized option list:

EXPLAIN (ANALYZE, BUFFERS, VERBOSE, SETTINGS, WAL) SELECT ... ;

This PR adds the same ergonomics to DataFusion, mapping option names to DataFusion's existing semantics rather than Postgres's buffer/WAL model.

What changes are included in this PR?

Parser. On dialects whose supports_explain_with_utility_options() returns true (the default GenericDialect, PostgreSqlDialect, DuckDbDialect, etc.), DFParser::parse_explain delegates to sqlparser's public parse_utility_options() and passes the result to a new ExplainStatementOptions::from_utility_options. The legacy keyword form (EXPLAIN ANALYZE VERBOSE FORMAT tree ...) is unchanged.

Normalized option type. A new ExplainStatementOptions in datafusion-common captures the knobs parsed from either form. Argument parsing reuses existing ExplainFormat::from_str, ExplainAnalyzeCategories::from_str, and MetricType::from_str.
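As an illustration of the `FromStr`-style argument parsing mentioned above, the following is a minimal, self-contained sketch of how a METRICS string could be turned into a category filter. The `Category` and `CategoryFilter` types are hypothetical stand-ins written for this example; they are not the actual `ExplainAnalyzeCategories::from_str` implementation, whose handling of case and whitespace may differ.

```rust
use std::collections::BTreeSet;
use std::str::FromStr;

/// Hypothetical stand-in for the metric categories named in this PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Category {
    Rows,
    Bytes,
    Timing,
    Uncategorized,
}

/// Hypothetical filter: `All` keeps every metric, `Only` keeps the listed
/// categories (an empty set stands for 'none').
#[derive(Debug, Clone, PartialEq, Eq)]
enum CategoryFilter {
    All,
    Only(BTreeSet<Category>),
}

impl FromStr for CategoryFilter {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Assumption: matching is case-insensitive and ignores surrounding spaces.
        let s = s.trim().to_ascii_lowercase();
        match s.as_str() {
            "all" => return Ok(CategoryFilter::All),
            "none" => return Ok(CategoryFilter::Only(BTreeSet::new())),
            _ => {}
        }
        let mut set = BTreeSet::new();
        for part in s.split(',') {
            let cat = match part.trim() {
                "rows" => Category::Rows,
                "bytes" => Category::Bytes,
                "timing" => Category::Timing,
                "uncategorized" => Category::Uncategorized,
                other => return Err(format!("unknown metric category: '{other}'")),
            };
            set.insert(cat);
        }
        Ok(CategoryFilter::Only(set))
    }
}
```

With this sketch, `"rows, bytes".parse::<CategoryFilter>()` yields a filter containing only the two named categories, while an unrecognized name such as `buffers` is rejected with an error.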

Options accepted:

| Option | Argument | Effect |
| --- | --- | --- |
| ANALYZE | bool, default T | Same as keyword ANALYZE |
| VERBOSE | bool, default T | Same as keyword VERBOSE |
| FORMAT | ident/string | indent / tree / pgjson / graphviz |
| METRICS | string | 'all', 'none', or comma-separated rows,bytes,timing,uncategorized |
| LEVEL | ident/string | summary or dev |
| TIMING | bool | Sugar: toggles inclusion of the timing category |
| SUMMARY | bool | Sugar: TRUE → summary, FALSE → dev |
| COSTS | bool | Per-statement show_statistics override (not valid with ANALYZE) |

Postgres-only options (BUFFERS, WAL, SETTINGS, GENERIC_PLAN, MEMORY) return a helpful unsupported-option error.
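The option semantics above can be sketched as a small normalization function. This is a simplified model, not the PR's `ExplainStatementOptions::from_utility_options`: the `ExplainOpts` struct, the `normalize` function, the error wording, and the `(name, argument)` tuple input are all hypothetical, FORMAT and METRICS are omitted for brevity, and treating a bare boolean option as TRUE for every option is an assumption borrowed from Postgres.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Level {
    Summary,
    Dev,
}

/// Hypothetical, simplified model of the normalized options.
#[derive(Debug, Default, PartialEq)]
struct ExplainOpts {
    analyze: bool,
    verbose: bool,
    level: Option<Level>,
    timing: Option<bool>,
    show_statistics: Option<bool>,
}

/// Assumption: a bare option (no argument) means TRUE; ON/OFF are accepted.
fn parse_bool(name: &str, arg: Option<&str>) -> Result<bool, String> {
    match arg.map(|a| a.to_ascii_lowercase()).as_deref() {
        None | Some("true") | Some("on") => Ok(true),
        Some("false") | Some("off") => Ok(false),
        Some(other) => Err(format!("{name}: expected a boolean, got '{other}'")),
    }
}

/// Folds `(name, optional argument)` pairs into `ExplainOpts`.
fn normalize(options: &[(&str, Option<&str>)]) -> Result<ExplainOpts, String> {
    let mut out = ExplainOpts::default();
    for &(name, arg) in options {
        let upper = name.to_ascii_uppercase();
        match upper.as_str() {
            "ANALYZE" => out.analyze = parse_bool(&upper, arg)?,
            "VERBOSE" => out.verbose = parse_bool(&upper, arg)?,
            "LEVEL" => {
                out.level = Some(match arg.map(|a| a.to_ascii_lowercase()).as_deref() {
                    Some("summary") => Level::Summary,
                    Some("dev") => Level::Dev,
                    _ => return Err("LEVEL expects summary or dev".to_string()),
                })
            }
            "TIMING" => out.timing = Some(parse_bool(&upper, arg)?),
            // Sugar: SUMMARY TRUE selects summary, SUMMARY FALSE selects dev.
            "SUMMARY" => {
                let summary = parse_bool(&upper, arg)?;
                out.level = Some(if summary { Level::Summary } else { Level::Dev })
            }
            "COSTS" => out.show_statistics = Some(parse_bool(&upper, arg)?),
            // Postgres-only options are rejected rather than silently ignored.
            "BUFFERS" | "WAL" | "SETTINGS" | "GENERIC_PLAN" | "MEMORY" => {
                return Err(format!("EXPLAIN option {upper} is not supported"))
            }
            _ => return Err(format!("unknown EXPLAIN option {upper}")),
        }
    }
    // COSTS is documented as not valid together with ANALYZE.
    if out.analyze && out.show_statistics.is_some() {
        return Err("COSTS is not valid with ANALYZE".to_string());
    }
    Ok(out)
}
```

In this model, `(ANALYZE, SUMMARY OFF)` resolves to `analyze = true` with the dev level, and `(BUFFERS)` fails with an unsupported-option error.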

Logical plan. Analyze gains analyze_level: Option<MetricType> and analyze_categories: Option<ExplainAnalyzeCategories>. Explain gains show_statistics: Option<bool>. None means "fall back to session config" — existing callers are unchanged.

Physical planner. handle_analyze and handle_explain prefer statement-level overrides over session config before constructing AnalyzeExec / ExplainExec. AnalyzeExec itself needs no change — it already accepts the filters from #21160.
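The "statement-level override, else session config" rule described above reduces to an `Option` fallback. The sketch below uses hypothetical `SessionExplainConfig` and `StatementOverrides` types to show the idea; the real `handle_analyze` and `handle_explain` operate on DataFusion's own config and plan structs.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum MetricLevel {
    Summary,
    Dev,
}

/// Hypothetical stand-in for the session-level `datafusion.explain.*` settings.
struct SessionExplainConfig {
    analyze_level: MetricLevel,
    show_statistics: bool,
}

/// Hypothetical per-statement overrides; `None` means the statement did not
/// specify the option, so the session value applies.
#[derive(Default)]
struct StatementOverrides {
    analyze_level: Option<MetricLevel>,
    show_statistics: Option<bool>,
}

fn effective_level(stmt: &StatementOverrides, session: &SessionExplainConfig) -> MetricLevel {
    stmt.analyze_level.unwrap_or(session.analyze_level)
}

fn effective_show_statistics(stmt: &StatementOverrides, session: &SessionExplainConfig) -> bool {
    stmt.show_statistics.unwrap_or(session.show_statistics)
}
```

Because the default for every override is `None`, a caller that never sets them sees exactly the session-config behavior.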

Proto (follow-up, see TODOs in datafusion/proto/src/logical_plan/mod.rs): the new override fields are not yet serialized. They default to None on the remote side, matching pre-PR behavior; round-trip tests still pass.

Are these changes tested?

Yes:

  • Unit tests in datafusion/sql/src/parser.rs cover legacy keyword form on PostgreSQL dialect, each option form (bare, = val, ON/OFF, quoted), unknown-option errors, dialect gating (the parenthesized form is rejected under a dialect that doesn't enable it), and the error path for unsupported Postgres-only options.
  • Integration tests in datafusion/core/tests/sql/explain_analyze.rs: explain_analyze_paren_metrics_filtering, explain_analyze_paren_level_overrides_session_config, explain_analyze_paren_metrics_overrides_session_config, explain_paren_buffers_rejected.
  • sqllogictest fixtures in datafusion/sqllogictest/test_files/explain.slt covering the parenthesized form, round-trip with the legacy form, and each error path.

Ran cargo fmt --all and cargo clippy --all-targets --all-features -- -D warnings (clean). Two pre-existing test failures on main (test_display_pg_json snapshot and a pgjson SLT case at explain.slt:642) are unrelated to this change — verified by running them against a clean checkout of the same base commit.

Are there any user-facing changes?

Yes — new syntax. User-facing docs updated at docs/source/user-guide/explain-usage.md with a new section describing the option list and the dialect gate. No breaking changes: the legacy keyword form continues to work exactly as before.

🤖 Generated with Claude Code

@github-actions github-actions Bot added documentation Improvements or additions to documentation sql SQL Planner logical-expr Logical plan and expressions core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) common Related to common crate proto Related to proto crate labels Apr 21, 2026
adriangb and others added 3 commits April 21, 2026 16:01
Extends DataFusion's `EXPLAIN` to accept a Postgres-style parenthesized
option list alongside the existing keyword form, on dialects that
enable it (the default `GenericDialect`, `PostgreSqlDialect`,
`DuckDbDialect`, etc.).

This surfaces the metric-category and verbosity knobs introduced in
PR apache#21160 (currently only reachable via `SET`) directly in the
statement, matching Postgres's one-liner ergonomics:

    EXPLAIN (ANALYZE, VERBOSE, METRICS 'rows,bytes', LEVEL dev) SELECT ...

Options recognized: `ANALYZE`, `VERBOSE`, `FORMAT`, `METRICS`, `LEVEL`,
`TIMING`, `SUMMARY`, `COSTS`. Statement-level values override the
corresponding session config. Postgres-only options that DataFusion
does not model (`BUFFERS`, `WAL`, `SETTINGS`, `GENERIC_PLAN`, `MEMORY`)
return a clear unsupported-option error rather than silently accepting
them.

The legacy keyword form (`EXPLAIN ANALYZE VERBOSE FORMAT tree ...`) is
unchanged.

Parser delegates to sqlparser's `parse_utility_options()` under the
dialect gate; a new `ExplainStatementOptions` struct in
`datafusion-common` normalizes both forms into a single representation
that flows through `explain_to_plan` into the `Analyze` / `Explain`
logical plan nodes. `handle_analyze` / `handle_explain` in the
physical planner prefer statement-level overrides over session config
before constructing `AnalyzeExec` / `ExplainExec`.

Proto serialization of the new fields is left as a follow-up (TODO
comments in `datafusion/proto/src/logical_plan/mod.rs`); fields
default to `None` on the other side, matching prior behavior.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

DataFusion has long accepted parentheses wrapping the query after
`EXPLAIN` (e.g. `EXPLAIN (SELECT ...)` and
`EXPLAIN (q1 EXCEPT q2) UNION ALL (q3 EXCEPT q4)`). The initial cut of
the Postgres-style option-list parser treated every leading `(` as an
option list, breaking those cases.

Disambiguate by peeking one token past the `(`: if it starts a query
(`SELECT`, `WITH`, `VALUES`, `TABLE`, `INSERT`, `UPDATE`, `DELETE`,
`MERGE`, or another `(`), fall through to the legacy parser.
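
The one-token lookahead described above can be sketched as follows. The `Tok` enum and `looks_like_option_list` are hypothetical simplifications; the real parser works on sqlparser tokens, and only the keyword list is taken from this commit message.

```rust
/// Hypothetical, simplified token type for illustration.
#[derive(Debug, Clone, PartialEq)]
enum Tok {
    LParen,
    RParen,
    Word(String),
}

/// Returns true if `tok` can begin a query, per the keyword list above.
fn token_starts_query(tok: &Tok) -> bool {
    match tok {
        // A nested `(` is treated as the start of a parenthesized query.
        Tok::LParen => true,
        Tok::Word(w) => matches!(
            w.to_ascii_uppercase().as_str(),
            "SELECT" | "WITH" | "VALUES" | "TABLE" | "INSERT" | "UPDATE" | "DELETE" | "MERGE"
        ),
        Tok::RParen => false,
    }
}

/// `tokens` is everything after the EXPLAIN keyword. Only a leading `(`
/// followed by a non-query token is treated as an option list.
fn looks_like_option_list(tokens: &[Tok]) -> bool {
    match tokens {
        [Tok::LParen, next, ..] => !token_starts_query(next),
        _ => false,
    }
}
```

Under this sketch, `EXPLAIN (ANALYZE) SELECT ...` is routed to the option-list parser, while `EXPLAIN (SELECT ...)` and `EXPLAIN ((q1) ...)` fall through to the legacy path.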

Adds `explain_paren_grouping_query_is_not_mistaken_for_options` to
cover the regression set that CI surfaced (`references.slt`,
`union.slt`).

Also reformats `docs/source/user-guide/explain-usage.md` to satisfy
`prettier 2.7.1` (column-width alignment only).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

`parse_explain` is public and its doc linked to the private
`token_starts_query` helper, which makes rustdoc error out. Drop the
link; the prose already conveys the disambiguation logic.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@adriangb adriangb force-pushed the explain-postgres-option-list branch from 8303099 to e7db7bb Compare April 21, 2026 21:01