make it possible to manually set chunks when loading dask arrays #1477

Merged: ax3l merged 3 commits into openPMD:dev from pordyna:topic-daskChunks on Aug 17, 2023

@pordyna (Contributor) commented Jul 14, 2023

Make it possible to override the default chunking behavior when loading to dask arrays. For PIConGPU, the default creates one chunk per GPU, covering that GPU's local domain. When using static load balancing with gridDist in PIConGPU, this results in very uneven chunks. For many operations it seems to make sense to chunk along the outermost direction only, since reading contiguous domains is faster. With this PR, this can be achieved with `.to_dask_array(chunks={0: 'auto', 1: -1, 2: -1})`.
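
To illustrate what "chunking along the outermost direction only" amounts to, here is a minimal plain-Python sketch. It builds a chunk layout in the tuple-of-tuples form that dask uses, splitting axis 0 into slabs and keeping every other axis as one full-extent chunk (the `-1` entries in the dict above). The helper name `outermost_axis_chunks` and its `rows_per_chunk` parameter are hypothetical and not part of openPMD-api or dask. With `'auto'`, dask chooses the slab size itself from its configured target chunk size, which is not modeled here.

```python
def outermost_axis_chunks(shape, rows_per_chunk):
    """Return a dask-style chunks tuple that splits only axis 0.

    Hypothetical helper for illustration only; it is not part of
    openPMD-api or dask.
    """
    if not shape:
        raise ValueError("shape must have at least one dimension")
    if rows_per_chunk < 1:
        raise ValueError("rows_per_chunk must be positive")

    # Split the outermost axis into equal slabs plus an optional remainder.
    full, rest = divmod(shape[0], rows_per_chunk)
    axis0 = (rows_per_chunk,) * full + ((rest,) if rest else ())

    # Every other axis is one chunk spanning its full extent (the "-1" case),
    # so each slab is contiguous in a row-major dataset.
    return (axis0,) + tuple((n,) for n in shape[1:])


chunks = outermost_axis_chunks((10, 4, 6), rows_per_chunk=4)
print(chunks)  # ((4, 4, 2), (4,), (6,))
```

Each resulting chunk covers a contiguous range of the outermost index, independent of how the simulation distributed the domain across GPUs.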

@pordyna force-pushed the topic-daskChunks branch from fb5da8b to 1fe460d on July 14, 2023 13:06
@pordyna force-pushed the topic-daskChunks branch from 0c924d9 to b6f0bcf on July 14, 2023 13:08
@franzpoeschel requested a review from ax3l on July 14, 2023 14:21
@ax3l self-assigned this on Aug 9, 2023
@ax3l (Member) left a comment

Cool, great idea!!

I added a few suggestions to finalize the PR :)

Copyright and typos in setting chunks for dask arrays.

Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja>

@pordyna (Contributor, Author) commented Aug 14, 2023

Thanks @ax3l, I applied your suggestions.

@ax3l (Member) left a comment

Thank you! :)

@ax3l added this to the 0.16.0 milestone on Aug 17, 2023
@ax3l enabled auto-merge (squash) on August 17, 2023 03:31
@ax3l merged commit 99daca7 into openPMD:dev on Aug 17, 2023
eschnett added a commit to eschnett/openPMD-api that referenced this pull request Sep 5, 2023
* dev:
  Fix CMake: HDF5 Libs are PUBLIC (openPMD#1520)
  Fix `chmod` in `download_samples.sh` (openPMD#1518)
  CI: Old CTest (openPMD#1519)
  Python: Fix ODR Violation (openPMD#1521)
  replace extent in weighting and displacement (openPMD#1510)
  CMake: Warn and Continue on Empty HDF5_VERSION (openPMD#1512)
  Replace openPMD_Datatypes global with function (openPMD#1509)
  Streaming examples: Set WAN as default transport (openPMD#1511)
  TOML Backend (openPMD#1436)
  make it possible to manually set chunks when loading dask arrays (openPMD#1477)
  [pre-commit.ci] pre-commit autoupdate (openPMD#1504)
  Optional debugging output for AbstractIOHandlerImpl::flush() (openPMD#1495)
  Python: 3.8+ (openPMD#1502)

# Conflicts:
#	.github/workflows/linux.yml
#	src/binding/python/Series.cpp

Labels

api: new additions to the API, frontend: Python3

3 participants