Skip to content

Conversation

@fjetter
Copy link
Member

@fjetter fjetter commented Jul 12, 2023

I don't have any proper measurements but this is an attempt of fixing #7990 by not converting anything to bytes prematurely and passing pyarrow tables around as far as possible. Haven't tested this with string data or anything like this so I don't know if the event loop survives this.

The most troubling aspect of this is that pa.Table.nbytes actually takes a surprising amount of time and this is currently where I loose performance (also note: the daks.sizeof implementation for pa.Table is misleading for this use case. The sizeof basically returns pa.Table.get_total_buffer_size which is vastly overestimating the size of our slices since the slices are just views of the larger tables. This seems to be the bottleneck of this implementation atm

cc @hendrikmakait

@github-actions
Copy link
Contributor

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

  20 files  ±         0  20 suites  ±0   6m 45s ⏱️ - 11h 23m 35s
  12 tests  -   3 704    0 ✔️  -   3 604    11 💤  -      95  0  - 6    1 🔥 +  1 
180 runs   - 35 762    0 ✔️  - 34 187  164 💤  - 1 584  0  - 7  16 🔥 +16 

For more details on these errors, see this check.

Results for commit 7fc6900. ± Comparison against base commit 6c37907.

This pull request removes 3704 tests.
distributed.cli.tests.test_dask_scheduler
distributed.cli.tests.test_dask_scheduler ‑ test_dashboard
distributed.cli.tests.test_dask_scheduler ‑ test_dashboard_allowlist
distributed.cli.tests.test_dask_scheduler ‑ test_dashboard_non_standard_ports
distributed.cli.tests.test_dask_scheduler ‑ test_dashboard_port_zero
distributed.cli.tests.test_dask_scheduler ‑ test_defaults
distributed.cli.tests.test_dask_scheduler ‑ test_hostport
distributed.cli.tests.test_dask_scheduler ‑ test_idle_timeout
distributed.cli.tests.test_dask_scheduler ‑ test_interface
distributed.cli.tests.test_dask_scheduler ‑ test_multiple_protocols
…

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant