Skip to content

Conversation

@arw2019
Copy link
Contributor

@arw2019 arw2019 commented Oct 17, 2020

  • closes #xxxx
  • tests added / passed
  • passes black pandas
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

The goal here is to make progress on some of the more uncontroversial/easier content of #28283. I'll split off into another PR if this gets too big

@arw2019 arw2019 force-pushed the GH28283-duplicated branch from 0cc664f to b1747d2 Compare October 17, 2020 03:42
@arw2019 arw2019 marked this pull request as draft October 17, 2020 04:00
@jreback jreback added the metadata _metadata, .attrs label Oct 17, 2020
@jreback
Copy link
Contributor

jreback commented Oct 17, 2020

cc @TomAugspurger

lgtm.

@arw2019 arw2019 marked this pull request as ready for review October 17, 2020 16:32
@arw2019 arw2019 changed the title WIP: call __finalize__ in more methods call __finalize__ in more methods Oct 17, 2020
return stack(self, level, dropna=dropna)
result = stack(self, level, dropna=dropna)

return result.__finalize__(self, method="stack")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this falls into the name issue I spotted in #37037. Essentially, if df has a column named name we try to set result.name = self.name, i.e. the series name.

In [10]: df = pd.DataFrame({"A": [1, 2], "name": [3, 4]})

In [11]: df.stack().__finalize__(df)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-11-899f15844169> in <module>
----> 1 df.stack().__finalize__(df)

~/Envs/dask-dev/lib/python3.8/site-packages/pandas/core/generic.py in __finalize__(self, other, method, **kwargs)
   5157         if isinstance(other, NDFrame):
   5158             for name in self._metadata:
-> 5159                 object.__setattr__(self, name, getattr(other, name, None))
   5160         return self
   5161

~/Envs/dask-dev/lib/python3.8/site-packages/pandas/core/series.py in name(self, value)
    464     def name(self, value):
    465         if value is not None and not is_hashable(value):
--> 466             raise TypeError("Series.name must be a hashable type")
    467         object.__setattr__(self, "_name", value)
    468

TypeError: Series.name must be a hashable type

I'll think the solution is to only iterate over the intersection of self._metadata and other._metadata. I'll put up a PR.

@TomAugspurger
Copy link
Contributor

#37205 handles the dataframe.name issue.

I don't know if we want to add a test case for all of those, but one or two might not hurt?

@arw2019
Copy link
Contributor Author

arw2019 commented Oct 17, 2020

#37205 handles the dataframe.name issue.

I don't know if we want to add a test case for all of those, but one or two might not hurt?

Great!

I can add the tests here and then probably we would want to merge #37205 before this goes in

@jreback jreback added this to the 1.2 milestone Oct 20, 2020
@jreback
Copy link
Contributor

jreback commented Oct 20, 2020

can you merge master and ping on green as merged @TomAugspurger patch

@arw2019 arw2019 force-pushed the GH28283-duplicated branch from b1747d2 to 8475562 Compare October 20, 2020 06:06
@arw2019
Copy link
Contributor Author

arw2019 commented Oct 20, 2020

Green

@jreback jreback merged commit 2f55283 into pandas-dev:master Oct 20, 2020
@jreback
Copy link
Contributor

jreback commented Oct 20, 2020

thanks @arw2019

JulianWgs pushed a commit to JulianWgs/pandas that referenced this pull request Oct 26, 2020
kesmit13 pushed a commit to kesmit13/pandas that referenced this pull request Nov 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

metadata _metadata, .attrs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants