Skip to content

Conversation

@jorisvandenbossche
Copy link
Member

No description provided.

@github-actions
Copy link

github-actions bot commented Jul 1, 2020

Copy link
Member Author

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The approach here is to also determine field_names_ in HivePartitioningFactory after inspecting (for DirectoryPartitioningFactory, those field names are passed to the constructor). So that we can then trim the schema and have the dictionaries match the order of the schema.

However, thinking of it now: there might still be a problem if the user specified the full dataset schema so no inspection happens .. So we might need to think of a better solution.

(I should also add some C++ tests)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should probably guard here against the case that field_names_ was not yet updated (if Finish is called without Inspect being called), with empty vector?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely, the first line of this method should just call

auto field_names = FieldNames();

and replace occurrences of the private member.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no FieldNames() method on the PartitioningFactory (only the impl has one, but that is not accessible here; that's the reason I added the field_names_ private member to store those)

jorisvandenbossche and others added 3 commits July 8, 2020 22:05
Co-authored-by: Benjamin Kietzman <bengilgit@gmail.com>
Co-authored-by: Benjamin Kietzman <bengilgit@gmail.com>
Copy link
Member

@wesm wesm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants