Memory saving output #96

We realized that most `eko` operations do not depend on multiple `Q2` values at the same time.

The full `OperatorGrid` is a rank-5 tensor. By using only one `Q2` at a time, we reduce the problem to a rank-4 tensor, with much lower memory consumption.
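
To give a feeling for the saving, here is a back-of-the-envelope estimate. The grid sizes and the axis layout `(Q2, pid, x, pid, x)` are assumptions chosen for illustration, not values taken from `eko`.

```python
# Assumed, purely illustrative grid sizes (not taken from eko).
N_Q2 = 100   # number of Q2 points
N_PID = 14   # number of flavour entries
N_X = 50     # number of x-grid points
BYTES_PER_FLOAT64 = 8


def tensor_bytes(shape):
    """Return the memory needed by a dense float64 array of the given shape."""
    n = 1
    for dim in shape:
        n *= dim
    return n * BYTES_PER_FLOAT64


# One Q2 slice is rank 4; the full grid adds a leading Q2 axis (rank 5).
rank4_shape = (N_PID, N_X, N_PID, N_X)
rank5_shape = (N_Q2,) + rank4_shape

full_mb = tensor_bytes(rank5_shape) / 1e6
single_mb = tensor_bytes(rank4_shape) / 1e6
print(f"full grid: {full_mb:.1f} MB, single Q2: {single_mb:.2f} MB")
# prints "full grid: 392.0 MB, single Q2: 3.92 MB"
```

With these assumed sizes, the peak memory scales with the number of `Q2` points held at once, so holding a single one reduces it by a factor of `N_Q2`.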

To be usable and flexible, the new structure needs a few capabilities:
- [x] separate `Q2` storage: each `Q2` is stored in its own file, `{q2}.npy`
- [x] lazy loading: load one `Q2` (or a subset) at a time, and drop it once consumed
  - loading a subset can be useful when the consumer only accepts a rank-5 tensor
- [ ] merge separately computed (but input compatible) outputs
- [ ] split a single output into multiple ones
  - *not strictly needed, but it's dual to the former one, and so nice to have*
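
The first two items could look roughly like the sketch below. `LazyOutput` and its methods are hypothetical names used only for illustration, not the actual `Output` API; the only thing taken from the list above is the `{q2}.npy` naming.

```python
import tempfile
from pathlib import Path

import numpy as np


class LazyOutput:
    """Hypothetical sketch: one `{q2}.npy` file per Q2 value in a directory."""

    def __init__(self, path):
        self.path = Path(path)

    def _file(self, q2):
        return self.path / f"{q2}.npy"

    def dump(self, q2, operator):
        """Write a single rank-4 operator to its own file."""
        np.save(self._file(q2), operator)

    def q2_values(self):
        """List the Q2 values available on disk, in increasing order."""
        return sorted(float(f.stem) for f in self.path.glob("*.npy"))

    def __iter__(self):
        """Yield (q2, operator) pairs, loading each operator only when reached."""
        for q2 in self.q2_values():
            yield q2, np.load(self._file(q2))

    def load_subset(self, q2s):
        """Stack a subset into a rank-5 array, for consumers that require one."""
        return np.stack([np.load(self._file(q2)) for q2 in q2s])


with tempfile.TemporaryDirectory() as tmp:
    out = LazyOutput(tmp)
    for q2 in (10.0, 100.0, 1000.0):
        out.dump(q2, np.full((2, 3, 2, 3), q2))

    # Each operator can be garbage collected once the loop moves on.
    totals = {q2: float(op.sum()) for q2, op in out}

    # A rank-5 view of only two Q2 values.
    subset = out.load_subset([10.0, 100.0])
```

Nothing is cached in this sketch, so memory use is bounded by whatever the consumer keeps alive.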

This new structure will be the upgraded version of the current `Output` object.

In addition, a few related features might be implemented, and later used to support the separate computation of `Q2` elements:
- [ ] replace/upgrade `OperatorGrid`
  - we need a manager for the computation, but it does not have to hold the data, which has to be dumped as soon as possible
  - the easiest option is to have the current `OperatorGrid` hold a reference to a new `Output` object, and store everything in there
  - "everything" includes threshold operators and partial `Q2` results (those obtained before combining with threshold operators), together with the full results
- [ ] support threshold operators and partial `Q2` results separately in `Output`
  - they are also dumped to disk, with their own names: `thresholds.npz` (containing all the threshold elements) and `{q2}.part.npy`
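
A minimal sketch of the resulting on-disk layout follows. Only the file names `thresholds.npz`, `{q2}.part.npy` and `{q2}.npy` come from the list above; the helper functions and the threshold keys (`"mc"`, `"mb"`) are hypothetical.

```python
import tempfile
from pathlib import Path

import numpy as np


def dump_thresholds(path, thresholds):
    """Store all threshold operators together in a single archive."""
    np.savez(Path(path) / "thresholds.npz", **thresholds)


def dump_partial(path, q2, operator):
    """Store a partial result (not yet combined with the thresholds)."""
    np.save(Path(path) / f"{q2}.part.npy", operator)


def list_contents(path):
    """Classify stored files into full results, partial results and thresholds."""
    path = Path(path)
    partial = sorted(f.name for f in path.glob("*.part.npy"))
    full = sorted(f.name for f in path.glob("*.npy") if f.name not in partial)
    return {
        "full": full,
        "partial": partial,
        "thresholds": (path / "thresholds.npz").exists(),
    }


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical threshold names and toy 3x3 operators.
    dump_thresholds(tmp, {"mc": np.eye(3), "mb": 2 * np.eye(3)})
    dump_partial(tmp, 10.0, np.ones((3, 3)))
    np.save(Path(tmp) / "100.0.npy", np.ones((3, 3)))  # an already final result

    contents = list_contents(tmp)
    with np.load(Path(tmp) / "thresholds.npz") as npz:
        threshold_names = sorted(npz.files)
```

Keeping the `.part` marker in the file name lets a reader tell the state of each `Q2` from the directory listing alone, without opening any array.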

The idea is that an object supporting these features can be computed in separate, completely independent pieces (one process for the thresholds, plus one for each `Q2`, for example), which are then merged together.
To make it easier to merge the pieces and compute the final result:
- `thresholds.npz` is never removed from the saved output
  - if other `{q2}.part.npy` files come later, they can always be consumed
  - [ ] unless explicitly requested: provide an `optimize()` or `clean()` method to get rid of it
- everything can be merged together, with or without thresholds: merging simply checks the input compatibility and adds the arrays to the archive
- if the output contains both thresholds and partial objects, the final ones are computed
  - [ ] provide a `combine()` method
  - partial objects are removed after combination
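
The workflow above could be sketched as follows, using plain directories as archives. The function bodies are assumptions: the input compatibility check is omitted, and the combination rule (a matrix product with every stored threshold operator, on toy 2D matrices) is only a placeholder for the actual physics.

```python
import shutil
import tempfile
from pathlib import Path

import numpy as np


def merge(src, dst):
    """Add every array file from `src` to `dst`.

    A real implementation would first check that both outputs come from
    compatible inputs; that check is omitted in this sketch.
    """
    for f in Path(src).iterdir():
        if f.suffix in (".npy", ".npz"):
            shutil.copy(f, Path(dst) / f.name)


def combine(path):
    """Turn each `{q2}.part.npy` into `{q2}.npy`, then remove the partial."""
    path = Path(path)
    with np.load(path / "thresholds.npz") as npz:
        thresholds = [npz[name] for name in npz.files]
    for part in sorted(path.glob("*.part.npy")):
        op = np.load(part)
        for thr in thresholds:
            op = thr @ op  # placeholder combination rule (assumption)
        np.save(part.with_name(part.name.replace(".part.npy", ".npy")), op)
        part.unlink()


def clean(path):
    """Explicitly drop the thresholds once no more partials are expected."""
    (Path(path) / "thresholds.npz").unlink()


def names(path):
    return sorted(f.name for f in Path(path).iterdir())


with tempfile.TemporaryDirectory() as thr_dir, tempfile.TemporaryDirectory() as q2_dir:
    # Two independently produced outputs: thresholds only, and one partial Q2.
    np.savez(Path(thr_dir) / "thresholds.npz", mc=2 * np.eye(3))
    np.save(Path(q2_dir) / "10.0.part.npy", np.ones((3, 3)))

    merge(thr_dir, q2_dir)
    combine(q2_dir)
    files_after_combine = names(q2_dir)
    result = np.load(Path(q2_dir) / "10.0.npy")

    clean(q2_dir)
    files_after_clean = names(q2_dir)
```

Since `combine()` leaves `thresholds.npz` in place, partials merged later can still be consumed; only the explicit `clean()` call discards it.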
