docs: expand the FTS index doc explaining the training process and multiple partitions #5988

Merged
westonpace merged 2 commits into lance-format:main from westonpace:doc/expand-on-fts-index-spec on Feb 24, 2026

Conversation

@westonpace
Member

No description provided.

@github-actions
Contributor

ACTION NEEDED
Lance follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

For details on the error please inspect the "PR Title Check" action.

@github-actions
Contributor

Code Review

This is a documentation-only PR adding explanations of the FTS index training process, configuration, and distributed training. I verified the documented values against the implementation.

Technical accuracy check:

  • ✅ Environment variable defaults match code: LANCE_FTS_NUM_SHARDS (num compute CPUs), LANCE_FTS_PARTITION_SIZE (256 MiB), LANCE_FTS_TARGET_SIZE (4096 MiB)
  • ✅ Fragment mask formula (fragment_id as u64) << 32 matches code comment at builder.rs:106
  • ✅ Per-partition metadata filename pattern part_<id>_metadata.lance matches code at builder.rs:1102
  • ✅ skip_merge behavior is accurately described
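
To illustrate the fragment mask formula checked above, here is a minimal Rust sketch. It assumes the mask occupies the upper 32 bits of a 64-bit row address and that the lower 32 bits hold a row offset within the fragment. The names `fragment_mask`, `row_address`, and `fragment_of` are hypothetical and chosen for illustration; they are not taken from builder.rs.

```rust
/// Hypothetical helper: the mask for a fragment, using the formula
/// quoted in the review, `(fragment_id as u64) << 32`.
fn fragment_mask(fragment_id: u32) -> u64 {
    (fragment_id as u64) << 32
}

/// Hypothetical helper: combine the mask with a 32-bit row offset.
/// Assumption: the low 32 bits of the address carry the offset.
fn row_address(fragment_id: u32, row_offset: u32) -> u64 {
    fragment_mask(fragment_id) | (row_offset as u64)
}

/// Hypothetical helper: recover the fragment id from an address
/// by shifting the upper 32 bits back down.
fn fragment_of(address: u64) -> u32 {
    (address >> 32) as u32
}
```

Under these assumptions, every address built with `row_address(3, _)` shares the same upper 32 bits, so `fragment_of` maps it back to fragment 3 regardless of the offset.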

No P0/P1 issues found. The documentation is accurate, well-organized, and provides valuable operational guidance for users configuring FTS indexes at scale.

LGTM 👍

Contributor

@prrao87 left a comment

LGTM!

Could you also please add a note in the quickstart page on FTS pointing to this page? This page is buried under the scalar index page and might be hard to find unless a user specifically searches for FTS (in which case they might stumble across the quickstart page, so it's best to lead the reader here for advanced usage).

At the end of quickstart/full-text-search/, we could add an H2 header with the following text:

## Advanced usage

For advanced usage instructions with different tokenizers and more technical details on the index training process, visit the [full-text index](../format/table/index/scalar/fts/) section.

@westonpace
Member Author

> Could you also please add a note in the quickstart page on FTS pointing to this page?

@prrao87 done. I also fixed a number of broken links.

@westonpace changed the title from "doc: expand the FTS index doc explaining the training process and multiple partitions" to "docs: expand the FTS index doc explaining the training process and multiple partitions" on Feb 24, 2026
@github-actions (bot) added the documentation label on Feb 24, 2026
@westonpace merged commit 0e53078 into lance-format:main on Feb 24, 2026
5 of 6 checks passed
wjones127 pushed a commit to wjones127/lance that referenced this pull request Feb 25, 2026
wjones127 pushed a commit to wjones127/lance that referenced this pull request Feb 25, 2026
wjones127 pushed a commit that referenced this pull request Feb 26, 2026
Labels

documentation Improvements or additions to documentation

3 participants