Skip to content

Conversation

@jakirkham
Copy link
Member

Fixes #669

LMDBStore had its own way of handling text before ensure_text was added to Numcodecs. With that utility function now available, we can simplify this a bit and avoid some subtleties of the LMDB implementation.

Similarly DBMStore had some of its own text handling issues that can be solved the same way.

Finally inline all str.encode calls to make it clearer what is happening in both LMDBStore and DBMStore.

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/tutorial.rst
  • Changes documented in docs/release.rst
  • AppVeyor and Travis CI passes
  • Test coverage is 100% (Coveralls passes)

@jakirkham jakirkham mentioned this pull request Dec 3, 2020
@jakirkham jakirkham changed the title Simp text handling in DB Stores Simplify text handling in DB Stores Dec 3, 2020
@Carreau
Copy link
Contributor

Carreau commented Dec 3, 2020

Would that bump the minimum version of lmdb ?

@codecov
Copy link

codecov bot commented Dec 3, 2020

Codecov Report

Merging #670 (ed1ffca) into master (a5dfc3b) will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##            master      #670   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           27        27           
  Lines        10094     10087    -7     
=========================================
- Hits         10094     10087    -7     
Impacted Files Coverage Δ
zarr/storage.py 100.00% <100.00%> (ø)

@jakirkham
Copy link
Member Author

Would that bump the minimum version of lmdb ?

Shouldn't affect it.

Basically ensure_text leverages the Python Buffer Protocol to coerce the data (bytes or other object) to a NumPy array without copying. Effectively this normalizes the data we are working with so we don't need to know about LMDB Buffer objects or whatever other object might come up ( MongoDB had a similar issue #401 ). Then we perform the decode step on the NumPy array, which we already know to work.

@Carreau Carreau added this to the v2.7 milestone Dec 3, 2020
@Carreau
Copy link
Contributor

Carreau commented Dec 3, 2020

Thanks for the explanation; let's get that in.

@Carreau Carreau merged commit 1464d90 into zarr-developers:master Dec 3, 2020
@jakirkham jakirkham deleted the simp_txt_coding_db_stores branch December 3, 2020 23:23
@jakirkham
Copy link
Member Author

Thanks Matthias! 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Failures on Python 3.10

2 participants