Fix handling of numpy dtype in base.assert_xy() #232

@marty-larocque

What is Happening:

Currently, base.assert_xy() uses numpy.testing.assert_array_max_ulp() to compare xy data. assert_array_max_ulp() requires 'array-like' parameters, meaning values that numpy can convert to an array with a numeric dtype.

Generally, we would expect any parameter we pass to be a numpy.ndarray with dtype=numpy.float64.

However, we occasionally see cases where the parameter we pass is a numpy.ndarray with dtype=object whose elements are of type numpy.float64.

This is a subtle but important difference; in the second case, assert_array_max_ulp() fails and throws a ValueError. As far as numpy is concerned, the array is not numeric in the second case.
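To make the difference concrete, here is a minimal sketch with made-up values (the names `numeric` and `boxed` are only for illustration). The exact exception type may depend on the numpy version, so the sketch catches both TypeError and ValueError rather than assuming one:

```python
import numpy as np

# Ordinary numeric array: numpy infers dtype=float64.
numeric = np.array([1.0, 2.0, 3.0])

# Same values, but stored as generic objects; each element is a numpy.float64.
boxed = np.array(
    [np.float64(1.0), np.float64(2.0), np.float64(3.0)], dtype=object
)

print(numeric.dtype, boxed.dtype)
print(type(boxed[0]))

# The comparison works when both arrays are numeric.
np.testing.assert_array_max_ulp(numeric, numeric.copy())

# With dtype=object the same call is rejected, even though the values match.
try:
    np.testing.assert_array_max_ulp(boxed, boxed.copy())
    failed = False
except (TypeError, ValueError) as err:
    failed = True
    print(type(err).__name__, err)
```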

Why this is happening:

matplotlib stores xy data in dataframes. That's also how we store the data in base.assert_xy(). But because of the confusing nuances of how matplotlib decides to store this data, it does not always get converted to numpy arrays the same way. There are multiple ways to get the values of a DataFrame as a numpy array:

  • Just passing a DataFrame to the function and letting numpy convert it for us
  • np.array(df)
  • df.values
  • df.to_numpy()

All of these methods are susceptible to returning a numpy.ndarray with dtype=object.
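A small sketch of how each of these conversions can end up with dtype=object. The DataFrame here is hypothetical: its columns are forced to object dtype with astype(object) to mimic the problem case, and np.asarray() stands in for the implicit conversion numpy performs when handed a DataFrame:

```python
import numpy as np
import pandas as pd

# Hypothetical xy data whose columns hold floats but have dtype=object.
df = pd.DataFrame({"x": [0.0, 1.0, 2.0], "y": [1.5, 2.5, 3.5]}).astype(object)

conversions = {
    "np.asarray(df)": np.asarray(df),  # roughly what implicit conversion does
    "np.array(df)": np.array(df),
    "df.values": df.values,
    "df.to_numpy()": df.to_numpy(),
}

# Collect the resulting dtype for each conversion.
dtypes = {name: arr.dtype for name, arr in conversions.items()}
for name, dtype in dtypes.items():
    print(f"{name}: {dtype}")
```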

How to fix this:

The only reliable way I've found to get a numeric numpy array is to use the following method:

  • df.to_numpy(dtype=np.float64)

This conversion should be done before passing the data to assert_array_max_ulp().
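A rough sketch of what that could look like. The helper name `to_float_array` and the sample data are hypothetical, and this is not the actual body of base.assert_xy(); it only shows the conversion happening before the comparison:

```python
import numpy as np
import pandas as pd


def to_float_array(data):
    """Hypothetical helper: coerce xy data to a float64 ndarray."""
    return pd.DataFrame(data).to_numpy(dtype=np.float64)


# Hypothetical expected data with ordinary float64 columns.
expected = pd.DataFrame({"x": [0.0, 1.0], "y": [2.0, 3.0]})

# Same values, but with object-dtype columns (the problem case).
actual = expected.astype(object)

converted = to_float_array(actual)
print(converted.dtype)

# Both sides are now numeric, so the comparison runs without error.
np.testing.assert_array_max_ulp(converted, to_float_array(expected))
```

Note that to_numpy(dtype=np.float64) raises if a value cannot be converted to a float, so genuinely non-numeric data would still fail, just earlier and at the conversion step.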
