Skip to content

Conversation

@phofl
Copy link
Collaborator

@phofl phofl commented Feb 2, 2024

cc @rjzamora

this cuts runtime in half on my machine, not super fancy but just avoids as much work as possible without changing the logic

Copy link
Member

@rjzamora rjzamora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @phofl ! Seems good to me, just some minor questions.

continue
seen.add(p._name)

column_union.extend(p._projection_columns)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worthwhile to have this in a if len(p.columns) > 0: block like before?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a very very rare case so I thought removing it is more efficient than keeping it, it’s only relevant for index projections, so the new logic should be more efficient in almost all cases

Comment on lines -3204 to +3205
column_union = sorted(
flattened_columns,
key=lambda x: x[0] if isinstance(x, tuple) else x or MinType(),
)
column_union = sorted(flattened_columns)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did something change here that I'm not seeing, or was the key= never necessary?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic changed a little bit, tuples now end up in the except branch. Since tuples are rare, it’s better to slow them down than to slow the default path down with the check

@phofl phofl merged commit 3c1992f into dask:main Feb 6, 2024
@phofl phofl deleted the assign_perf branch February 6, 2024 10:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants