Exposition of KMeans param object for PQ #2005

Open
lowener wants to merge 6 commits into rapidsai:main from lowener:26.06-pq-kmeans

@lowener lowener commented Apr 9, 2026

Closes #1999

Signed-off-by: Mickael Ide <mide@nvidia.com>

copy-pr-bot bot commented Apr 9, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@lowener lowener added the breaking, feature request, and C++ labels Apr 9, 2026
lowener added 2 commits April 9, 2026 22:08
Signed-off-by: Mickael Ide <mide@nvidia.com>
@cjnolet cjnolet moved this to In Progress in Unstructured Data Processing Apr 10, 2026
@lowener lowener marked this pull request as ready for review April 15, 2026 16:41
@lowener lowener requested review from a team as code owners April 15, 2026 16:41

@tarang-jain tarang-jain left a comment

LGTM. Thanks for this! I just added a comment about one small thing. Apart from that, I have no issues with this PR.

std::min<uint32_t>(in_params.max_train_points_per_pq_code,
                   n_rows * in_params.pq_kmeans_trainset_fraction / pq_n_centers),
std::min<uint32_t>(in_params.max_train_points_per_vq_cluster,
                   n_rows * in_params.vq_kmeans_trainset_fraction / in_params.vq_n_centers)};

Could `vq_n_centers` be set to zero? If so, perhaps we can just add a guard to avoid a division by zero.
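
For illustration, such a guard might look something like the sketch below. It is not the code in this PR: `capped_points_per_cluster` is a hypothetical helper, and the parameter types (a `uint32_t` cap and center count, a `double` trainset fraction) are assumptions made for the example. With zero centers it falls back to the configured cap, and it clamps in floating point before converting so that a very large quotient is not cast to `uint32_t` out of range.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper (not part of the PR): number of training points per
// cluster, capped at max_points_per_cluster and safe for n_centers == 0.
// Parameter types are assumptions made for this sketch.
inline uint32_t capped_points_per_cluster(uint32_t max_points_per_cluster,
                                          uint64_t n_rows,
                                          double trainset_fraction,
                                          uint32_t n_centers)
{
  // Guard: with zero centers the quotient is meaningless, so use the cap.
  if (n_centers == 0) { return max_points_per_cluster; }

  // Compute the quotient in double precision.
  double per_cluster = static_cast<double>(n_rows) * trainset_fraction / n_centers;

  // Clamp to [0, cap] before converting so the cast stays in range
  // for finite inputs.
  per_cluster = std::clamp(per_cluster, 0.0, static_cast<double>(max_points_per_cluster));
  return static_cast<uint32_t>(per_cluster);
}
```

Falling back to the cap is only one possible policy. Rejecting `vq_n_centers == 0` up front during parameter validation would be an equally reasonable choice, depending on whether zero is meant to be a valid setting.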

Development

Successfully merging this pull request may close these issues.

[FEA] Expose kmeans param object for PQ preprocessing