fix(docs): read capability flags from engine specs in database docs generator#39449
Conversation
✅ Deploy Preview for superset-docs-preview ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #39449 +/- ##
==========================================
- Coverage 64.48% 64.47% -0.01%
==========================================
Files 2566 2566
Lines 133926 133929 +3
Branches 31096 31097 +1
==========================================
Hits 86357 86357
- Misses 46074 46077 +3
Partials 1495 1495
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
The StarRocks catalog flag fix (supports_catalog, supports_dynamic_catalog) is now handled by the separate generator fix PR (#39449), which regenerates databases.json from Python engine spec class attributes. Removing the manual edit here avoids a merge conflict and keeps the source of truth in the generator. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ff8724e to
12d6b2f
Compare
There was a problem hiding this comment.
Pull request overview
Updates the database documentation generation pipeline so capability flags (catalog/dynamic catalog, dynamic schema, query cancellation, etc.) are derived from the Python engine spec implementations (including inheritance), reducing drift between docs and the engine spec source of truth.
Changes:
- Extend
generate_yaml_docs()to emitsupports_dynamic_catalogand avoid base-spec overwrite whenengine_namecollides. - Enhance the docs generator fallback AST extraction to resolve capability flags/method-based capabilities with inheritance, and stop preserving capability flags from the existing
databases.json. - Regenerate
docs/src/data/databases.jsonwith updated capability flags.
Reviewed changes
Copilot reviewed 2 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
superset/db_engine_specs/lib.py |
Prevents base specs from overwriting product spec docs and adds supports_dynamic_catalog to YAML docs output. |
docs/scripts/generate-database-docs.mjs |
Improves fallback extraction to compute capability flags from engine specs (attrs + method overrides) and limits merge-preservation to runtime-only diagnostics. |
docs/src/data/databases.json |
Regenerated output reflecting updated capabilities from engine specs. |
Code Review Agent Run #a31bc0Actionable Suggestions - 0Review Details
Bito Usage GuideCommands Type the following command in the pull request comment and save the comment.
Refer to the documentation for additional commands. Configuration This repository uses Documentation & Help |
Code Review Agent Run #3cbd95Actionable Suggestions - 0Review Details
Bito Usage GuideCommands Type the following command in the pull request comment and save the comment.
Refer to the documentation for additional commands. Configuration This repository uses Documentation & Help |
There was a problem hiding this comment.
Code Review Agent Run #bdcab5
Actionable Suggestions - 1
-
docs/scripts/generate-database-docs.mjs - 1
- Logic Bug in AST Parsing · Line 144-177
Review Details
-
Files reviewed - 1 · Commit Range:
1800a72..7cd1baf- docs/scripts/generate-database-docs.mjs
-
Files skipped - 0
-
Tools
- Whispers (Secret Scanner) - ✔︎ Successful
- Detect-secrets (Secret Scanner) - ✔︎ Successful
Bito Usage Guide
Commands
Type the following command in the pull request comment and save the comment.
-
/review- Manually triggers a full AI review. -
/pause- Pauses automatic reviews on this pull request. -
/resume- Resumes automatic reviews. -
/resolve- Marks all Bito-posted review comments as resolved. -
/abort- Cancels all in-progress reviews.
Refer to the documentation for additional commands.
Configuration
This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.
Documentation & Help
Code Review Agent Run #014beaActionable Suggestions - 0Review Details
Bito Usage GuideCommands Type the following command in the pull request comment and save the comment.
Refer to the documentation for additional commands. Configuration This repository uses Documentation & Help |
…enerator Update the database docs generator to extract capability flag values directly from Python engine spec class attributes (supports_catalog, supports_dynamic_schema, etc.) and method overrides (cancel_query, estimate_statement_cost, impersonate_user, has_implicit_cancel, etc.) using AST parsing in the fallback path. Previously the generator hardcoded False for all capability flags and relied on mergeWithExistingDiagnostics to preserve manual edits from the existing databases.json. This made it impossible to keep flags in sync with the Python source. Changes: - Fallback AST path now extracts cap flags via AST with proper inheritance resolution (mirrors lib.py has_custom_method logic) - mergeWithExistingDiagnostics now only preserves score/max_score/ time_grains (Flask-context-only fields), not capability flags - lib.py generate_yaml_docs now includes supports_dynamic_catalog in its output, and skips base specs that would overwrite a real product spec's flags for the same engine_name - Regenerated databases.json with corrected flag values Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Python attribute lookup picks the first base that defines the attr; the previous left-to-right update() order caused later bases to override earlier ones, contradicting MRO. Reversing the iteration makes the leftmost base the final writer, which matches Python semantics for the common single-chain hierarchies in db_engine_specs. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Cap attrs assigned via expressions (e.g. DruidEngineSpec.allows_joins =
is_feature_enabled("DRUID_JOINS")) can't be statically evaluated, so
they previously silently fell back to the BaseEngineSpec default —
causing generated docs to disagree with runtime behavior.
Now the Python extractor tracks unresolvable assignments per class,
propagates them through the MRO walk, and emits an
_unresolved_cap_fields marker on the database entry. The JS layer
uses that marker to prefer the value from the previously-generated
databases.json (which was produced with Flask context and reflects
real runtime values) and strips the marker before writing output.
Verified against druid.py: extraction correctly emits
"_unresolved_cap_fields": ["joins"], and the existing JSON's
"joins": false is preserved rather than overwritten with the base
default of true.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
diagnose() in lib.py only counts cancel_query being overridden OR has_implicit_cancel() returning True (equivalent, statically, to overriding has_implicit_cancel since the base returns False). The extractor additionally counted get_cancel_query_id, which could mark cancellation as supported even when cancel_query wasn't overridden. Removed get_cancel_query_id from CAP_METHODS and the query_cancelation check so the generated docs match diagnose(). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
diagnose() in lib.py calls spec.has_implicit_cancel() and uses the return value; the extractor was treating any override as cancel support, regardless of what it returns. An override that explicitly returns False (e.g. ImpalaEngineSpec) should NOT enable query_cancelation — only a cancel_query override should, per has_custom_method(spec, "cancel_query") or spec.has_implicit_cancel(). Added static_return_bool() to statically resolve simple "return <bool>" methods. has_implicit_cancel overrides that explicitly return False are now excluded from direct_methods; True / unresolvable / complex bodies still count (conservative). For current specs the outcome is unchanged (Impala/Hive have cancel_query overrides; Presto/Hive return True), but the heuristic now matches diagnose() exactly and won't misclassify future specs that override has_implicit_cancel to False without a cancel_query override. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
A later bare string literal in a function body is not a docstring and should count as non-trivial logic. Track whether the docstring has been skipped so subsequent string-constant expression statements fall through to the conservative 'other_logic' branch.
693bfaf to
b8123f8
Compare
Code Review Agent Run #95cb7aActionable Suggestions - 0Filtered by Review RulesBito filtered these suggestions based on rules created automatically for your feedback. Manage rules.
Review Details
Bito Usage GuideCommands Type the following command in the pull request comment and save the comment.
Refer to the documentation for additional commands. Configuration This repository uses Documentation & Help |
…enerator (apache#39449) Co-authored-by: Superset Dev <dev@superset.apache.org> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
SUMMARY
The database docs generator (
docs/scripts/generate-database-docs.mjs) previously hardcoded all capability flags toFalsedefaults and relied onmergeWithExistingDiagnosticsto preserve manually-edited values from the committeddatabases.json. This made it impossible to keep capability flags in sync with the Python engine spec source of truth, and caused flags likesupports_catalog,supports_dynamic_schema,query_cancelation, etc. to drift silently.Changes:
docs/scripts/generate-database-docs.mjssupports_dynamic_schema,supports_catalog,supports_dynamic_catalog,disable_ssh_tunneling,supports_file_upload,allows_joins,allows_subqueries) directly from each engine spec class, with proper inheritance resolutioncancel_query/get_cancel_query_id/has_implicit_cancel→query_cancelation;estimate_statement_cost/estimate_query_cost→query_cost_estimation;impersonate_user/update_impersonation_config→user_impersonationmergeWithExistingDiagnosticsnow only preservesscore,max_score, andtime_grains(fields that require Flask context to generate), not capability flags — so regeneration always reflects the current Python sourcesuperset/db_engine_specs/lib.pygenerate_yaml_docs()now includessupports_dynamic_catalogin its output (was previously missing)enginevalue) when they would overwrite a real product spec's data for the sameengine_name(fixesPostgresBaseEngineSpecoverwritingPostgresEngineSpecflags)docs/src/data/databases.jsonsupports_catalog/supports_dynamic_catalognow correct for PostgreSQL, StarRocks, Presto, Trino, Hive, Spark SQL, Snowflake, CockroachDB, and others;query_cancelationnow reflects actual method overridesBEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
N/A — changes affect database docs capability flag display, not visual UI.
TESTING INSTRUCTIONS
npm run gen-db-docsfrom thedocs/directorydocs/src/data/databases.jsonmatches the checked-in file (flags should be stable across runs)catalog: true,dynamic_catalog: truecatalog: false,dynamic_catalog: falsecatalog: true,dynamic_catalog: trueADDITIONAL INFORMATION