feat: support sql api for dataset #4086
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #4086 +/- ##
==========================================
+ Coverage 80.12% 80.25% +0.12%
==========================================
Files 294 297 +3
Lines 103595 105211 +1616
Branches 103595 105211 +1616
==========================================
+ Hits 83008 84432 +1424
- Misses 17546 17706 +160
- Partials 3041 3073 +32
|
westonpace
left a comment
I agree with @jackye1995 that SQL tends to be more of a database/catalog-level operation and not a table operation.
However, as a convenience (to enable things like sorting and aggregation) it's probably ok as a sort of utility API.
|
The background for this requirement is that we need to let users run SQL queries on lightweight sample Lance datasets. The current approach lacks flexibility: we use the Lance scanner to filter the dataset into an Arrow table, then hand that table to DuckDB to run SQL on it. |
|
@yanghua I have a question. Can the SQL mode here support fts and fast_search? |
|
@yanghua @jackye1995 Can this be updated? I want to do something based on this PR. |
|
Will refactor the PR, a little busy, sorry. |
|
@Jay-ju can't you just use the DataFusion integration directly? https://lancedb.github.io/lance/integrations/datafusion/ |
|
@jackye1995 please take a look when you have time. |
|
@jackye1995 Addressed your comments, please take a look. |
jackye1995
left a comment
Looks good to me! Just one minor comment on the mod.
Closes #3992