Allow np.object dtypes into virtualfile_from_vectors #684
Merged
seisman (Member) approved these changes on Nov 7, 2020 and left a comment:
Great! I believe it solves a long-term headache when I use pandas with PyGMT.
Description of proposed changes
Loosen the check in `virtualfile_from_vectors` to allow for any string-like dtype (`np.str`, `np.object`) by performing the check using `pd.api.types.is_string_dtype()`. The array is then converted (if needed) to a proper `np.str` dtype before giving it to `put_strings`.
Why is this needed?
This is one step in enabling text input into modules like:
meca, see Wrap meca #516 (comment), ff164e6
velo, see Wrap velo #525 (comment)
Those modules rely on `pandas.DataFrame` inputs, but a 'str' column in pandas is typically stored as an 'object' dtype (see https://stackoverflow.com/questions/21018654/strings-in-a-dataframe-but-dtype-is-object), unless users take due care to store them in the new `pandas.StringDtype`. Either way, when we convert these `pandas.Series` objects to a numpy array, their dtype becomes `np.object` rather than `np.str` (hence why our code needs to handle `np.object` too).
After this PR is merged, we can do something like:
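A minimal sketch of the check-and-convert step described above, not the PR's actual implementation. The helper name `to_str_array` and the sample column are hypothetical, and the builtin `str` stands in for the `np.str` alias, which newer NumPy releases have removed.

```python
import numpy as np
import pandas as pd


def to_str_array(vector):
    """Hypothetical helper: return `vector` as a NumPy unicode array.

    Accepts any string-like input (a NumPy string array, or an object
    array/Series holding strings) and raises TypeError otherwise.
    """
    array = np.asarray(vector)
    # Loosened check: accept object-dtype arrays of strings as well as
    # proper NumPy string dtypes.
    if not pd.api.types.is_string_dtype(array):
        raise TypeError(f"Expected a string-like dtype, got {array.dtype}")
    # Convert (if needed) to a fixed-width unicode dtype such as '<U5'.
    return array.astype(str)


# Illustrative data only: a text column taken from a DataFrame.
df = pd.DataFrame({"label": ["event1", "event2"], "depth": [10.0, 25.5]})
labels = to_str_array(df["label"])
print(labels.dtype.kind)
```

The check is applied to the array itself rather than to its dtype alone, so it behaves the same whether the strings arrive as an object array or as a native NumPy string array.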
Fixes #
Reminders
Run `make format` and `make check` to make sure the code follows the style guide.
`doc/api/index.rst`.
Notes
Write `/format` in the first line of a comment to lint the code automatically.