-
Notifications
You must be signed in to change notification settings - Fork 4k
GH-34577 [Python] Expose eol and null_string csv WriteOptions #46976
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
GH-34577 [Python] Expose eol and null_string csv WriteOptions #46976
Conversation
|
|
|
Thank you for submitting a PR!
|
34366df to
dd3ae8f
Compare
58c1bd0 to
6907545
Compare
|
The failures are connected, see: ______________________________ test_write_options ______________________________
def test_write_options():
cls = WriteOptions
opts = cls()
> check_options_class(
cls, include_header=[True, False], delimiter=[',', '\t', '|'],
eol=['\n', '\r\n'], null_string=['', 'NA'],
quoting_style=['needed', 'none', 'all_valid'])
opt/conda/envs/arrow/lib/python3.9/site-packages/pyarrow/tests/test_csv.py:362:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
opt/conda/envs/arrow/lib/python3.9/site-packages/pyarrow/tests/test_csv.py:111: in check_options_class
opts = cls(**non_defaults)
pyarrow/_csv.pyx:1388: in pyarrow._csv.WriteOptions.__init__
???
pyarrow/_csv.pyx:1446: in pyarrow._csv.WriteOptions.null_string.__set__
???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> ???
E TypeError: expected bytes, NoneType found |
6907545 to
f95d305
Compare
MartinNowak
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was on vacation and it took a bit longer to get back to this. Hope it runs through now 🤞 @AlenkaF.
python/pyarrow/_csv.pyx
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
???
E TypeError: expected bytes, NoneType found
It found this glorious typo 🤦. Only have semi-working syntax highlighting and didn't have enough time to figure out how to run the tests locally :/.
|
The tests are passing 👍 pyarrow._csv.WriteOptions
-> pyarrow._csv.WriteOptions(include_header=None, *, batch_size=None, delimiter=None, eol=None, null_string=None, quoting_style=None)
PR01: Parameters {'delimiter', 'null_string', 'eol', 'batch_size', 'quoting_style'} not documented
PR04: Parameter "")" has no type |
f95d305 to
08fa441
Compare
python/pyarrow/_csv.pyx
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests are passing 👍
But the docstrings need an update:
Needed to escape the backslash to render in final doc rather than to break the doc comment.
|
The macOS failures seem to be apache/orc#2357 and were not present previously f95d305 @AlenkaF. |
raulcd
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The macOS failures are fixed on main, can you rebase so they are fixed on CI, please?
|
@github-actions crossbow submit -g python |
|
Revision: 08fa4412afc3bbb2e1599a4ad9557fe8a0f3a075 Submitted crossbow builds: ursacomputing/crossbow @ actions-c05e9be928 |
08fa441 to
220c3fc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR, sorry I took a little to review.
This seems to only be testing that the options class can be generated but is not indeed testing the options, on test_csv.py we have other tests that indeed test the options like:
arrow/python/pyarrow/tests/test_csv.py
Line 1929 in 697f501
| def test_write_read_round_trip(): |
or
arrow/python/pyarrow/tests/test_csv.py
Line 1973 in 697f501
| def test_write_quoting_style(): |
Can we test the options are indeed working as expected when writing the CSV?
Rationale for this change
Expose remaining csv write options to pyarrow.
What changes are included in this PR?
Adding
eolandnull_stringto csv.WriteOptions.Are these changes tested?
Yes, testing of setters included.
Are there any user-facing changes?