test: update doctest to new pandas behavior #5788

Merged
westonpace merged 2 commits into lance-format:main from westonpace:test/pandas-3-compat-2
Jan 22, 2026

@westonpace
Member

No description provided.

@github-actions
Contributor

Review: LGTM ✓

This is a minimal doctest fix to accommodate pandas 3 behavior changes:

  • None → NaN display format
  • Column spacing adjustments

No issues identified.
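As a rough illustration of why both changes matter to an exact-match doctest, here is a minimal sketch using the standard library's `doctest.OutputChecker`. The `want`/`got` strings are made up for illustration and are not the doctest touched by this PR. Under those assumptions, whitespace normalization absorbs a column-spacing difference but not a `None` versus `NaN` difference, so the expected output still has to be updated.

```python
import doctest

checker = doctest.OutputChecker()

# Illustrative strings only (assumed, not taken from the PR's diff).
# Same values, different column widths.
spacing_want = "   x\n0  x\n1  y\n"
spacing_got = "     x\n0    x\n1    y\n"

# Same layout, but the missing value is rendered differently.
value_want = "0     x\n1  None\n"
value_got = "0    x\n1   NaN\n"

# Exact comparison fails on spacing alone.
exact_spacing = checker.check_output(spacing_want, spacing_got, 0)

# NORMALIZE_WHITESPACE treats any run of whitespace as equivalent.
relaxed_spacing = checker.check_output(
    spacing_want, spacing_got, doctest.NORMALIZE_WHITESPACE
)

# It cannot reconcile different tokens such as None and NaN.
relaxed_value = checker.check_output(
    value_want, value_got, doctest.NORMALIZE_WHITESPACE
)

print(exact_spacing, relaxed_spacing, relaxed_value)
```

In a real doctest the same flag can be enabled per example with `# doctest: +NORMALIZE_WHITESPACE`; whether that is preferable to updating the expected output is a project-level choice.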

0 1 a x
1 2 x y
2 3 y z
3 4 z NaN
Contributor

Does pandas use NaN to represent None now?
That's a little bit weird... especially since column c has type string

Contributor

> That's a little bit weird... especially since column c has type string

+10000. It almost feels like a bug to me; I'm not sure whether there are any related discussions in pandas about it.

Member Author

I'll investigate

Member Author

>>> import pandas as pd
>>> pd.DataFrame({"x": ["x", "y", "z"]})
   x
0  x
1  y
2  z
>>> pd.DataFrame({"x": ["x", "y", None]})
     x
0    x
1    y
2  NaN

Maybe a pandas bug, but 🤷. I'll see if there are any tickets on the pandas repo.

Member Author

It is intentional:

The main characteristic of the new string data type:

  • Inferred by default for string data (instead of object dtype)
  • The str dtype can only hold strings (or missing values), in contrast to object dtype. (setitem with non string fails)
  • The missing value sentinel is always NaN (np.nan) and follows the same missing value semantics as the other default dtypes.
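A small sketch of what this means for test code, assuming only that pandas is installed. Rather than relying on how the missing value is printed, a check can ask pandas directly which entries are missing. The series below is illustrative and mirrors the example earlier in the thread; the dtype is printed but not asserted, since it depends on the installed pandas version.

```python
import pandas as pd

# Illustrative data matching the example earlier in the thread.
s = pd.Series(["x", "y", None], name="x")

# The dtype name depends on the installed pandas version, so it is only printed.
print(pd.__version__, s.dtype)

# isna() reports the missing entry regardless of which sentinel is stored
# or how it is rendered in the repr.
missing_mask = s.isna().tolist()
print(missing_mask)
```

A check written this way does not depend on the repr, though it also no longer documents what users will see printed, which is usually the point of a doctest.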

@Xuanwo
Collaborator

Xuanwo commented Jan 22, 2026

Oh, please feel free to cherry pick this PR: #5789

@westonpace westonpace merged commit 60a0e20 into lance-format:main Jan 22, 2026
11 checks passed
jackye1995 pushed a commit to jackye1995/lance that referenced this pull request Jan 23, 2026
majin1102 pushed a commit to majin1102/lance that referenced this pull request Jan 23, 2026
jackye1995 pushed a commit that referenced this pull request Jan 23, 2026
Co-authored-by: Xuanwo <github@xuanwo.io>
vivek-bharathan pushed a commit to vivek-bharathan/lance that referenced this pull request Feb 2, 2026

5 participants