feat: support sql api for dataset#4086

Merged
yanghua merged 14 commits into lance-format:main from yanghua:feature-sql
Jul 16, 2025

Conversation

Collaborator

@yanghua yanghua commented Jun 25, 2025

Closes #3992

@github-actions github-actions Bot added the enhancement New feature or request label Jun 25, 2025

codecov-commenter commented Jun 25, 2025

Codecov Report

Attention: Patch coverage is 89.43662% with 15 lines in your changes missing coverage. Please review.

Project coverage is 80.25%. Comparing base (eb5aafe) to head (f940194).
Report is 23 commits behind head on main.

Files with missing lines         Patch %   Lines
rust/lance/src/dataset/sql.rs    91.24%    7 Missing and 5 partials ⚠️
rust/lance/src/dataset.rs        40.00%    0 Missing and 3 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4086      +/-   ##
==========================================
+ Coverage   80.12%   80.25%   +0.12%     
==========================================
  Files         294      297       +3     
  Lines      103595   105211    +1616     
  Branches   103595   105211    +1616     
==========================================
+ Hits        83008    84432    +1424     
- Misses      17546    17706     +160     
- Partials     3041     3073      +32     
Flag Coverage Δ
unittests 80.25% <89.43%> (+0.12%) ⬆️

@yanghua yanghua marked this pull request as ready for review June 25, 2025 13:55
@yanghua yanghua requested review from jackye1995, westonpace and wjones127 and removed request for westonpace June 25, 2025 13:55
Comment thread rust/lance/src/dataset.rs Outdated
Comment thread rust/lance/src/dataset.rs
Comment thread rust/lance/src/dataset.rs Outdated
Member

@westonpace westonpace left a comment

I agree with @jackye1995 that SQL tends to be more of a database/catalog-level operation and not a table operation.

However, as a convenience (to enable things like sorting and aggregation) it's probably ok as a sort of utility API.

Comment thread rust/lance/src/dataset.rs Outdated
Comment thread rust/lance/src/dataset.rs Outdated
Comment thread rust/lance/src/dataset.rs Outdated
Comment thread rust/lance/src/dataset.rs Outdated
Collaborator Author

yanghua commented Jun 26, 2025

The background for this requirement is that we need to let users run SQL queries on lightweight sample Lance datasets.

The current approach lacks flexibility: we use the Lance scanner to filter an Arrow dataset, then forward it to DuckDB to run a SQL query on it.

@yanghua yanghua marked this pull request as draft June 26, 2025 13:31
Contributor

Jay-ju commented Jul 2, 2025

@yanghua I have a question. Can the SQL mode here support fts and fast_search?

Contributor

Jay-ju commented Jul 9, 2025

@yanghua @jackye1995 Can this be updated? I want to do something based on this PR.

Collaborator Author

yanghua commented Jul 9, 2025

I will refactor the PR. I'm a little busy at the moment, sorry.

@jackye1995
Contributor

@Jay-ju can't you just use the DataFusion integration directly? https://lancedb.github.io/lance/integrations/datafusion/

Comment thread rust/lance-core/src/error.rs Outdated
Comment thread rust/lance/src/dataset.rs Outdated
Comment thread rust/lance/src/dataset/sql.rs
Comment thread rust/lance/src/dataset/sql.rs
Comment thread rust/lance/src/dataset/sql.rs
@yanghua yanghua marked this pull request as ready for review July 11, 2025 10:53
Collaborator Author

yanghua commented Jul 11, 2025

@jackye1995 please take a look when you have time.

Comment thread rust/lance/src/dataset/sql.rs Outdated
Comment thread rust/lance/src/dataset/sql.rs Outdated
Comment thread rust/lance/src/dataset/sql.rs Outdated
Comment thread rust/lance/src/dataset/sql.rs Outdated
Comment thread rust/lance/src/dataset/sql.rs Outdated
Comment thread rust/lance/src/dataset/sql.rs Outdated
Comment thread rust/lance/src/dataset/sql.rs Outdated
Comment thread rust/lance/src/dataset/sql.rs Outdated
Collaborator Author

yanghua commented Jul 15, 2025

@jackye1995 Addressed your comments, please take a look.

Comment thread rust/lance/src/dataset/sql.rs Outdated
Comment thread rust/lance/src/dataset/sql.rs Outdated
Comment thread rust/lance/src/dataset/sql.rs Outdated
Contributor

@jackye1995 jackye1995 left a comment

Looks good to me! Just one minor comment on the mod.

Comment thread rust/lance/src/dataset.rs Outdated
@yanghua yanghua merged commit 146176e into lance-format:main Jul 16, 2025
27 checks passed
Labels

enhancement New feature or request

Development

Successfully merging this pull request may close these issues.

Support SQL api for dataset

6 participants