@bashtage
Contributor

@bashtage bashtage changed the title Iterator cat BUG/ENH: Improve categorical construction when using the iterator in StataReadet May 12, 2020
@bashtage bashtage changed the title BUG/ENH: Improve categorical construction when using the iterator in StataReadet BUG/ENH: Improve categorical construction when using the iterator in StataReader May 12, 2020
@bashtage bashtage force-pushed the iterator-cat branch 4 times, most recently from 44b73a4 to 51dcc83 Compare May 12, 2020 16:27
@jreback jreback added Bug Categorical Categorical Data Type IO Stata read_stata, to_stata labels May 12, 2020
@bashtage bashtage force-pushed the iterator-cat branch 2 times, most recently from 383c122 to 4472717 Compare May 12, 2020 21:53
@jreback jreback added this to the 1.1 milestone May 13, 2020
Kevin Sheppard and others added 4 commits June 2, 2020 15:25
Return categoricals with the same categories if possible when reading
data through an iterator.
Warn if not possible.

closes pandas-dev#31544
Restrict iterator to StataReaders constructed with a positive chunksize
Check the label ordering does not cause any issues
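
The idea in the commit messages above can be sketched without pandas. This is a minimal illustration, not the code merged in this PR: `chunk_categories` and the `value_labels` mapping are hypothetical names, and the fallback for unlabelled codes is an assumption. When every code in a chunk has a value label, the categories are taken from the full label set in code order, so each chunk reports the same categories; otherwise a warning is issued and the categories depend on the chunk.

```python
import warnings


def chunk_categories(chunk_codes, value_labels):
    """Return the categories for one chunk of integer codes.

    Hypothetical helper: value_labels maps each code to its label.
    """
    if all(code in value_labels for code in chunk_codes):
        # Every code is labelled: use the full label set, ordered by code,
        # so that all chunks agree on the categories and their order.
        return [value_labels[code] for code in sorted(value_labels)]
    # Some codes are unlabelled: categories can only be built from what
    # this chunk contains, so they may differ from chunk to chunk.
    warnings.warn(
        "Some codes have no value label; categories may differ between chunks.",
        UserWarning,
    )
    return [value_labels.get(code, code) for code in sorted(set(chunk_codes))]


value_labels = {1: "low", 2: "medium", 3: "high"}
first = chunk_categories([1, 1, 2], value_labels)
second = chunk_categories([3, 3], value_labels)
```

Here `first` and `second` hold the same categories even though the two chunks contain different codes, which is the behaviour the PR aims for when reading through an iterator.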
@jreback jreback merged commit 035e1fe into pandas-dev:master Jun 4, 2020
@jreback
Contributor

jreback commented Jun 4, 2020

thanks @bashtage

if self._chunksize is None:
    raise ValueError(
        "chunksize must be set to a positive integer to use as an iterator."
    )

This breaks when read_stata() is passed iterator=True but no chunksize; see #37280.
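
The failure mode described in this comment can be reproduced with a toy reader. The sketch below is not pandas' StataReader: `ToyReader` and `toy_read` are hypothetical stand-ins, and placing the check inside `__next__` is an assumption made for illustration. It shows how a reader created with `iterator=True` and no `chunksize` raises as soon as it is iterated, while explicit `get_chunk` calls still work.

```python
class ToyReader:
    """Hypothetical chunked reader; a stand-in, not pandas' StataReader."""

    def __init__(self, rows, chunksize=None):
        self._rows = list(rows)
        self._chunksize = chunksize
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        # Same check as quoted above; its location here is an assumption.
        if self._chunksize is None:
            raise ValueError(
                "chunksize must be set to a positive integer to use as an iterator."
            )
        if self._pos >= len(self._rows):
            raise StopIteration
        return self.get_chunk(self._chunksize)

    def get_chunk(self, size):
        # Return up to `size` rows and advance the read position.
        chunk = self._rows[self._pos : self._pos + size]
        self._pos += len(chunk)
        return chunk


def toy_read(rows, iterator=False, chunksize=None):
    # Return a reader when either option is given, else all rows at once.
    if iterator or chunksize is not None:
        return ToyReader(rows, chunksize)
    return list(rows)


reader = toy_read(range(5), iterator=True)
try:
    next(reader)
    raised = False
except ValueError:
    raised = True
```

In this sketch, passing `chunksize` explicitly, or calling `get_chunk` with a size, avoids the error; how pandas itself resolved the report is tracked in #37280.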

Development

Successfully merging this pull request may close these issues.

Reading with read_stata in chunks messes up categories
