I looked through the open issues and in our API, but didn't directly find something about selecting a subset of columns of a table.
Assume you have a table like:
table = pa.table({'a': [1, 2], 'b': [.1, .2], 'c': ['a', 'b']})
You can select a single column with table.column('a') or table['a'] to get a chunked array. You can add, append, remove and replace columns (with add_column, append_column, remove_column, set_column).
But an easy way to get a subset of the columns (without manually removing the ones you don't want one by one) doesn't seem possible.
I would propose something like:
Reporter: Joris Van den Bossche / @jorisvandenbossche
Assignee: Joris Van den Bossche / @jorisvandenbossche
PRs and other links:
Note: This issue was originally created as ARROW-8314. Please see the migration documentation for further details.