@GYHHAHA
Contributor

@GYHHAHA GYHHAHA commented Dec 9, 2021

import pandas as pd

n = int(1e6)
df = pd.DataFrame({"A": [0.0] * n})
arr = df.to_records(index=False)

In [1]: %timeit pd.DataFrame(arr)
3.64 s ± 77.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) <- master
2.23 ms ± 421 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) <- PR
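
A speed-up like this only helps if the constructed frame is unchanged, so a quick round-trip check may be useful alongside the timings. The following is a minimal sketch, not part of the PR itself; the smaller `n` and the variable name `rebuilt` are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

n = 1_000  # assumed small size so the check runs quickly on any build
df = pd.DataFrame({"A": [0.0] * n})
arr = df.to_records(index=False)

# to_records returns a NumPy record array with one field per column
assert isinstance(arr, np.recarray)
assert arr.dtype.names == ("A",)

# Rebuild the frame from the record array and compare with the original
rebuilt = pd.DataFrame(arr)
assert list(rebuilt.columns) == ["A"]
assert rebuilt["A"].equals(df["A"])
```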

@GYHHAHA
Contributor Author

GYHHAHA commented Dec 9, 2021

How do I add a test case for a performance PR? I have no experience with this.

@jbrockmendel
Member

How do I add a test case for a performance PR? I have no experience with this.

For this, you can just show some %timeit results comparing the PR to the status quo.
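
Outside IPython, a comparable measurement can be taken with the standard-library `timeit` module. The sketch below is one possible way to do it, under the assumption that the same script is run once on each build (master and the PR branch) and the printed numbers are compared by hand. The helper name `time_recarray_construction` and its default sizes are hypothetical.

```python
import timeit

import pandas as pd


def time_recarray_construction(n=100_000, repeat=3, number=1):
    """Return the best seconds-per-call for pd.DataFrame(record_array)."""
    df = pd.DataFrame({"A": [0.0] * n})
    arr = df.to_records(index=False)
    # timeit.repeat returns one total time per repetition
    timings = timeit.repeat(lambda: pd.DataFrame(arr), repeat=repeat, number=number)
    return min(timings) / number


if __name__ == "__main__":
    seconds = time_recarray_construction()
    print(f"pandas {pd.__version__}: {seconds:.6f} s per call")
```

Taking the minimum over repetitions is a common choice because it is the least affected by background noise; `%timeit` reports a mean and standard deviation instead, so the two are not directly interchangeable.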

@GYHHAHA
Contributor Author

GYHHAHA commented Dec 9, 2021

added @jbrockmendel

Member

@jbrockmendel jbrockmendel left a comment

LGTM

@jreback jreback added this to the 1.4 milestone Dec 9, 2021
@jreback jreback added the Performance (memory or execution speed performance) and Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) labels Dec 9, 2021
@jreback jreback merged commit a2316f3 into pandas-dev:master Dec 9, 2021
@jreback
Contributor

jreback commented Dec 9, 2021

thanks @GYHHAHA

@GYHHAHA GYHHAHA deleted the patch-1 branch December 10, 2021 03:51
Development

Successfully merging this pull request may close these issues.

PERF: dataframe construction from recarray is slow