CLN: memory-mapping code #44766
    df = tm.makeDataFrame()
    df.to_csv(path, mode="w+b")
    tm.assert_frame_equal(df, pd.read_csv(path, index_col=0))
    def test_binary_mode():
This test and `test_warning_missing_utf_bom` had nothing to do with mmap but were in its test class.
    # add one entry with a special character
    encoding_ = encoding or "utf-8"
    leonardo = "Léonardo".encode(encoding_, errors="ignore").decode(encoding_)
This stricter version of the test would have failed on master with the python engine.
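As a small illustration of the round trip used in the snippet above: encoding with `errors="ignore"` and decoding again keeps only the characters the target encoding can represent. The helper name `roundtrip` and the encodings chosen here are illustrative assumptions, not necessarily what the test parametrizes over.

```python
def roundtrip(text, encoding):
    """Encode with errors="ignore", then decode with the same encoding.

    Characters the encoding cannot represent are dropped silently.
    (Hypothetical helper, named here only for illustration.)
    """
    return text.encode(encoding, errors="ignore").decode(encoding)


# utf-8 and iso8859_1 can both represent "é", so the string survives intact.
print(roundtrip("Léonardo", "utf-8"))
print(roundtrip("Léonardo", "iso8859_1"))

# ascii cannot represent "é", so that character is dropped.
print(roundtrip("Léonardo", "ascii"))
```

With a single-byte or ASCII-only encoding the expected value written to the file can therefore differ from the literal in the source, which is why the test computes it per encoding.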
    GH 23254.
    """
    encoding = "iso8859_1"
    data = BytesIO(" 1 A Ä 2\n".encode(encoding))
This test was not actually exercising `memory_map`, because the memory mapping failed silently.
Hello @twoertwein! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2021-12-05 23:25:04 UTC

Using TextIOWrapper seems to be slower than the previous solution :( Will revert most of the changes in this PR.
Rebased on #44761
No need for `codecs.getincrementaldecoder`, as `io.TextIOWrapper` will do that (and we can use `io.TextIOWrapper` because mmap is wrapped inside `_IOWrapper`). `io.TextIOWrapper` also provides `__next__` for us :)

Probably will need some benchmarking with utf-8/non-utf8 files.
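A minimal sketch of the idea described above, under stated assumptions: a memory-mapped file is exposed through a small file-like adapter so that `io.TextIOWrapper` can handle the incremental decoding and line iteration. `MMapRawReader` and `read_lines` are hypothetical names used only for illustration; the adapter stands in for the role of pandas' internal `_IOWrapper` and is not its implementation.

```python
import io
import mmap


class MMapRawReader(io.RawIOBase):
    """Hypothetical adapter giving an mmap the raw-stream interface
    (readable/readinto/close) that the io buffering layers expect."""

    def __init__(self, mm):
        super().__init__()
        self._mm = mm

    def readable(self):
        return True

    def readinto(self, buffer):
        # Copy the next chunk of the mapping into the caller's buffer.
        chunk = self._mm.read(len(buffer))
        buffer[: len(chunk)] = chunk
        return len(chunk)

    def close(self):
        # Release the mapping together with the stream.
        if not self.closed:
            self._mm.close()
        super().close()


def read_lines(path, encoding="utf-8"):
    """Return the decoded lines of a non-empty file read via mmap."""
    with open(path, "rb") as handle:
        mm = mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ)
        buffered = io.BufferedReader(MMapRawReader(mm))
        # TextIOWrapper decodes incrementally and supplies __next__,
        # so no explicit codecs.getincrementaldecoder call is needed.
        with io.TextIOWrapper(buffered, encoding=encoding) as text:
            return list(text)
```

Whether this layering is faster or slower than manual incremental decoding depends on the file and encoding, so it would need benchmarking as noted above.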