Contributor

@kharoc kharoc commented Aug 2, 2021

Add a from_pydict method to the RecordBatch class.
Add a unit test for from_pydict.

github-actions bot commented Aug 2, 2021

    v = mapping[field.name]
except KeyError:
    try:
        v = mapping[tobytes(field.name)]
Member

I'm not sure allowing bytes keys is useful.

Member

Apparently we do support that in Table.from_pydict (but I agree this doesn't seem needed).

Contributor Author

The tobytes call was removed.

Member

@jorisvandenbossche jorisvandenbossche left a comment

@kharoc thanks for the PR!

Since the implementation is almost identical between RecordBatch and Table, I think we should try to share the code. The two classes don't share a base class at the moment, so a helper function is probably the easiest option (the helper can return the arguments to be passed to RecordBatch.from_arrays or Table.from_arrays).
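To illustrate the suggestion, here is a minimal sketch of what such a shared helper could look like. The name from_pydict_args and the to_array parameter are hypothetical and not taken from this PR; the conversion step is injected so the sketch runs without pyarrow, and the real implementation may differ.

```python
from typing import Any, Callable, List, Mapping, Optional, Sequence, Tuple


def from_pydict_args(
    mapping: Mapping[str, Sequence[Any]],
    names: Optional[Sequence[str]] = None,
    to_array: Callable[[Sequence[Any]], Any] = list,
) -> Tuple[List[Any], List[str]]:
    """Hypothetical shared helper: build (arrays, names) for from_arrays."""
    # Without an explicit list of names, keep the dict's insertion order.
    if names is None:
        names = list(mapping.keys())
    arrays = []
    for name in names:
        # Fail early with a clear message if a requested column is absent.
        if name not in mapping:
            raise KeyError(f"mapping has no column named {name!r}")
        arrays.append(to_array(mapping[name]))
    return arrays, list(names)


# Both classes could then delegate to the helper, for example (assumption):
#     arrays, names = from_pydict_args(mapping, to_array=pa.array)
#     return cls.from_arrays(arrays, names=names)
arrays, names = from_pydict_args({"a": [1, 2], "b": ["x", "y"]})
```

With a helper along these lines, each from_pydict classmethod reduces to one call to the helper plus one call to its own from_arrays, so the column lookup and error handling live in a single place.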

    assert result.equals(expected)


def test_recordbatch_from_pydict():
Member

Can you move this next to the existing test for Table, to keep similar tests close to each other? With some parametrization, we might also be able to reduce the duplication (the actual test doesn't seem to rely on anything specific to RecordBatch or Table).

Contributor Author

Done. Parametrization was added.

@jorisvandenbossche
Member

@kharoc can you also take a look at my non-inline comment (#10854 (review)) about avoiding duplication between the Table and RecordBatch implementations?

@jorisvandenbossche changed the title from "ARROW-13089:[Python]Allow creating RecordBatch from Python dict" to "ARROW-13089: [Python] Allow creating RecordBatch from Python dict" on Aug 4, 2021
Member

@jorisvandenbossche jorisvandenbossche left a comment

Thanks for the update, looks good!
