Skip to content

[GLUTEN-10613][VL] Add experimental method to ConfigBuilder and document experimental config#10659

Merged
philo-he merged 2 commits intoapache:mainfrom
zjuwangg:m_ImproveConfig
Sep 15, 2025
Merged

[GLUTEN-10613][VL] Add experimental method to ConfigBuilder and document experimental config#10659
philo-he merged 2 commits intoapache:mainfrom
zjuwangg:m_ImproveConfig

Conversation

@zjuwangg
Copy link
Copy Markdown
Contributor

@zjuwangg zjuwangg commented Sep 9, 2025

What changes are proposed in this pull request?

Add experimental method to ConfigBuilder and add experimental config to doc as in #10613

How was this patch tested?

NA

@github-actions github-actions bot added CORE works for Gluten Core VELOX DOCS labels Sep 9, 2025
@github-actions
Copy link
Copy Markdown

github-actions bot commented Sep 9, 2025

#10613

@github-actions
Copy link
Copy Markdown

github-actions bot commented Sep 9, 2025

Run Gluten Clickhouse CI on x86

@zjuwangg
Copy link
Copy Markdown
Contributor Author

zjuwangg commented Sep 9, 2025

@zhouyuan @philo-he please help review this PR of improving config doc.

Copy link
Copy Markdown
Member

@zhztheplayer zhztheplayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the API added, we may be able to implement a warning for the case that experimental configurations are set?

@zjuwangg
Copy link
Copy Markdown
Contributor Author

With the API added, we may be able to implement a warning for the case that experimental configurations are set?

yeah, I think so.

Copy link
Copy Markdown
Member

@philo-he philo-he left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small suggestion. Thanks.

val VELOX_BROADCAST_BUILD_RELATION_USE_OFFHEAP =
buildConf("spark.gluten.velox.offHeapBroadcastBuildRelation.enabled")
.internal()
.experimental()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we always assume experimental configs fall under the internal category, I think it would be better to add an assertion check in the experimental API to ensure they have been marked as internal. Otherwise, I'm concerned that developers might forget to mark experimental configs as internal, potentially making experimental configs public unintentionally.

Copy link
Copy Markdown
Contributor Author

@zjuwangg zjuwangg Sep 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@philo-he I have also thought add check here during development. But experimental and internal behavior should be orthogonality. And if we add check, we must first call internal and then call experimental which also is confusing.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your reply. My point is we should establish a clear convention to developers and users. It may sound good to make internal and experimental independent.

We can view internal configs as those intended for developers or advanced users only. They can also be experimental—for example, some unstable tracing configs for debugging can be experimental. On the other hand, not all experimental configs are internal. If they can be exposed to end users for early evaluation even though still under experimental, they shouldn't be marked as internal.

Possible rule

  1. Public/internal: Who should see/use it
  2. Experimental: How reliable it is (can be marked as either public or internal based on the first item)

Based on this rule, many experimental configs can be marked as public if they can be exposed to end users.

If this makes sense, we need to update the comments for the internal API, where experimental configs are previously viewed as internal always. And we also need to note that some experimental configs may fall under public category and some under internal category. Then, I feel it may be not good to have a dedicated section to document experimental configs, considering possible duplication. Instead, it may be better to just keep two sections in document: public and internal.

Actually, I am not quite sure which convention is clearer to users and developers. It seems always putting experimental in internal category is clearer? Discussion is welcome. Thanks.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@philo-he @zjuwangg could we move this to a discussion topic for more attention in the community? this is a important update on adding new features, bigger than the purpose of update the documents in this patch.
To me the experimental section would be better to be public otherwise it maybe difficult for people to test it

Copy link
Copy Markdown
Contributor Author

@zjuwangg zjuwangg Sep 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possible rule

Public/internal: Who should see/use it
Experimental: How reliable it is (can be marked as either public or internal based on the first item)

I second this idea.

Then, I feel it may be not good to have a dedicated section to document experimental configs, considering possible duplication

Maybe we also can only list public(but not experimental) config and experimental section.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just update internal() comments and follow following rule to mark config and gen doc.

Possible rule

Public/internal: Who should see/use it
Experimental: How reliable it is (can be marked as either public or internal based on the first item)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zhouyuan, just created a discussion to bring this to the community's attention: #10686

@github-actions
Copy link
Copy Markdown

Run Gluten Clickhouse CI on x86

@philo-he philo-he changed the title [GLUTEN-10613][VL] Add experimental method to ConfigBuilder and add experimental config section to doc [GLUTEN-10613][VL] Add experimental method to ConfigBuilder and document experimental config Sep 15, 2025
Copy link
Copy Markdown
Member

@philo-he philo-he left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Thanks.

@philo-he philo-he merged commit e646fe7 into apache:main Sep 15, 2025
57 checks passed
philo-he pushed a commit to philo-he/gluten that referenced this pull request Sep 16, 2025
zhouyuan pushed a commit that referenced this pull request Sep 16, 2025
* [GLUTEN-10450][VL] Reclassify internal/public configs and remove internal configs from doc (#10603)

Remove internal configurations from gluten doc as described (#10450)

* [GLUTEN-10613][VL] Add `experimental` method to ConfigBuilder and document experimental configs (#10659)

* Update config doc

* [VL] Fix Arrow URL typo (#10641)

* Fix enhanced failure

---------

Co-authored-by: Terry Wang <zjuwangg@foxmail.com>
Co-authored-by: Joey <joey.ljy@alibaba-inc.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CORE works for Gluten Core DOCS VELOX

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants