Expose ExecutionContext.register_csv to the python bindings #524
python/src/context.rs
Outdated
    name: &str,
    path: &str,
    has_header: bool,
    delimiter: &[u8],
Here we should have a Schema argument exposed as well, but I noticed that FFI hasn't been implemented for Schema and DataType objects in arrow-rs. We should probably expose all of the ArrowSchema-based structs there first, then convert pyarrow objects using the C interface rather than calling out to Python functions (which is how the DataType Python bindings are currently implemented).
I agree. I only recently learned that Schema and DataType have a C data interface. This likely requires some refactoring in arrow-rs, as it assumes that metadata does not require a specific in-memory alignment, whereas the C data interface imposes such a requirement.
We should be able to accept pyarrow.Schema objects once apache/arrow-rs#439 gets merged.
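The hunk at the top of this thread passes `delimiter` as `&[u8]`, while the arrow-rs CSV reader takes a single `u8` delimiter, so the binding presumably has to reject anything that is not exactly one byte. Below is a minimal Python sketch of that check, assuming this is the intended behavior; `encode_delimiter` is a hypothetical helper for illustration and is not part of the PR.

```python
def encode_delimiter(delimiter: str) -> int:
    """Return the single byte value of a CSV delimiter.

    Hypothetical helper (not from the PR): it mirrors the idea that a
    delimiter arriving as a byte slice must contain exactly one byte.
    """
    encoded = delimiter.encode("utf-8")
    if len(encoded) != 1:
        raise ValueError(
            f"delimiter must be exactly one byte, got {len(encoded)}: {delimiter!r}"
        )
    return encoded[0]
```

Multi-character strings and non-ASCII characters that encode to several UTF-8 bytes are both rejected, which is probably the least surprising behavior for callers.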
Codecov Report
@@            Coverage Diff             @@
##           master     #524      +/-   ##
==========================================
- Coverage   76.03%   76.00%   -0.04%
==========================================
  Files         157      157
  Lines       26990    27001      +11
==========================================
  Hits        20521    20521
- Misses       6469     6480      +11
alamb
left a comment
Is this one ready to go, @jorgecarleitao?
ping @jorgecarleitao
@jorgecarleitao / @kszucs -- what is the plan for this PR?
Since apache/arrow-rs#439 has been merged I can expose the
Cool -- I am just trying to shepherd PRs that look like they got stale
The tests appear to be failing due to #818
    })
}

pub fn to_rust_schema(ob: &PyAny) -> PyResult<Schema> {
Copied from https://github.com/apache/arrow-rs/blob/master/arrow-pyarrow-integration-testing/src/lib.rs#L136
Eventually we could add an optional module to arrow-rs where we implement the PyO3 conversion traits for arrow-rs <-> pyarrow interoperability for easier downstream integration.
@alamb this should be good to go, though we should revisit the FFI bindings in arrow-rs and a potential
alamb
left a comment
Thanks @kszucs - I reviewed the test carefully and skimmed the code. Looks great to me. Thank you so much
    for table in ["csv", "csv1", "csv2"]:
        result = ctx.sql(f"SELECT COUNT(int) FROM {table}").collect()
        result = pa.Table.from_batches(result)
        assert result.to_pydict() == {"COUNT(int)": [4]}
👍
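The assertion in the test hunk above expects every registered table to report four non-null `int` values. As a rough, library-free sketch of what that check amounts to, the snippet below writes a small CSV and counts the non-empty values in one column. It does not use DataFusion or pyarrow; `write_sample`, `count_non_null`, `demo`, and the sample rows are all made up for illustration and may not match the PR's actual fixture.

```python
import csv
import tempfile
from pathlib import Path


def write_sample(path: Path) -> None:
    # Hypothetical fixture: a header row plus four rows with a non-null `int` column.
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["int", "str", "float"])
        writer.writerows([[1, "a", 1.1], [2, "b", 2.2], [3, "c", 3.3], [4, "d", 4.4]])


def count_non_null(path: Path, column: str, delimiter: str = ",") -> int:
    # Similar in spirit to SQL COUNT(column): count rows whose value is present.
    with open(path, newline="") as f:
        reader = csv.DictReader(f, delimiter=delimiter)
        return sum(1 for row in reader if row[column] not in (None, ""))


def demo() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "test.csv"
        write_sample(path)
        # Same shape as the dict the test compares against.
        return {"COUNT(int)": [count_non_null(path, "int")]}
```

Calling `demo()` builds the file in a temporary directory, so nothing is left behind after the check.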
Which issue does this PR close?
Depends on #493
Rationale for this change
What changes are included in this PR?
Are there any user-facing changes?