Memory blowup from expression alignment and .sum() on masked variables #14

@FBumann

Two core memory issues in linopy's expression system

Profiling results and reproducers are on branch feature/memory-usage-issues (see dev-scripts/).


Issue 1: merge() creates dense Cartesian product for disjoint dimensions

What happens: Adding two expressions with different dimension names (e.g. (node, line, time) + (vehicle, route, time)) calls merge(), which in turn calls xr.concat(..., join="outer") and broadcasts both operands to the full Cartesian product of all dimensions.

Where in the code: expressions.py merge() function, line ~2138: kwargs.setdefault("join", "outer"), then line ~2141: xr.concat(...).

Impact: Two expressions with 150K + 80K elements produce a 600M-element dense array (520x blowup). Profiled at 18 GB peak from 13 MB input.
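To make the scale concrete, the arithmetic can be sketched with the dimension sizes from the reproducer. This counts positions in a single broadcast array only; the 600M figure above presumably adds up slots across terms and the coeffs/vars/const arrays, which is my reading rather than something stated here. The helper name merged_sizes is hypothetical and not part of linopy.

```python
from math import prod

# Dimension sizes taken from the reproducer in this issue.
x_sizes = {"node": 50, "line": 30, "time": 100}
y_sizes = {"vehicle": 40, "route": 20, "time": 100}


def merged_sizes(*operands):
    """Union of dimensions; assumes shared dims (here "time") have equal length."""
    merged = {}
    for sizes in operands:
        merged.update(sizes)  # a shared dim is counted once
    return merged


inputs = prod(x_sizes.values()) + prod(y_sizes.values())
dense = prod(merged_sizes(x_sizes, y_sizes).values())

print(inputs)                 # 230000
print(dense)                  # 120000000
print(round(dense / inputs))  # 522
```

The ratio of roughly 520x matches the blowup quoted above.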

Reproducer (dev-scripts/story2.py):

import linopy
m = linopy.Model()
x = m.add_variables(coords=[range(50), range(30), range(100)],
                    dims=["node", "line", "time"], name="x")
y = m.add_variables(coords=[range(40), range(20), range(100)],
                    dims=["vehicle", "route", "time"], name="y")
total = 2 * x + 3 * y   # → 18 GB dense array, 99.96% fill values

What needs to change: merge() (or __add__) should detect when expressions have disjoint coordinate dimensions and avoid the dense cross-product. The solver only needs sparse (var_id, coeff) pairs — the dense intermediate serves no purpose.
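As a rough illustration of the detection step, the sketch below lists, for each operand, the dimensions it lacks and would therefore be broadcast over. The function broadcast_dims is hypothetical and only works on tuples of dimension names; it is not linopy's API and says nothing about how merge() would act on the result.

```python
def broadcast_dims(operand_dims):
    """For each operand, list the dims it lacks and would be broadcast over."""
    all_dims = []
    for dims in operand_dims:
        for d in dims:
            if d not in all_dims:
                all_dims.append(d)  # keep first-seen order
    return [[d for d in all_dims if d not in dims] for dims in operand_dims]


missing = broadcast_dims([
    ("node", "line", "time"),      # dims of x
    ("vehicle", "route", "time"),  # dims of y
])
print(missing)  # [['vehicle', 'route'], ['node', 'line']]
```

When every operand has a non-empty list, each one is missing dimensions the other carries, which is the case where a dense merge multiplies sizes instead of aligning them.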

PRs #12 and #13 both address this with lazy/deferred expression containers. Key remaining gap: add_constraints() still forces materialization.


Issue 2: _sum() stacks dead terms from masked variables

What happens: A variable created with a sparse mask has labels = -1 for inactive positions. When .sum(dim) is called, it stacks the entire dimension into _term via .stack() — including all masked-out entries where vars == -1. These dead terms then propagate through every downstream operation.

Where in the code: expressions.py _sum(), lines ~1172-1176:

ds = (
    data[["coeffs", "vars"]]
    .reset_index(dim, drop=True)
    .rename({TERM_DIM: STACKED_TERM_DIM})
    .stack({TERM_DIM: [STACKED_TERM_DIM] + dim}, create_index=False)
)

This .stack() blindly includes all entries along dim, regardless of whether vars == -1.

Impact: With 300 contributors, 20 effects, 30 active per effect: _term becomes 300 instead of 30, with 90% dead terms. Every downstream op (-, merge, add_constraints) pays for the 90% waste. At real-world scale (277 contributors, 22 effects, 2190 timesteps) this contributed to 44+ GB OOM.

Reproducer (dev-scripts/story1.py):

import numpy as np, xarray as xr, linopy
m = linopy.Model()
mask = xr.DataArray(np.zeros((300, 20), dtype=bool), dims=["contributor", "effect"])
rng = np.random.default_rng(42)
for e in range(20):
    mask.values[rng.choice(300, 30, replace=False), e] = True
var = m.add_variables(coords=[range(300), range(20), range(2000)],
                      dims=["contributor", "effect", "time"],
                      name="share", mask=mask)
expr = var.sum("contributor")
print(expr.sizes["_term"])                   # 300 — should be ~30
print((expr.data.vars.values == -1).mean())  # 90% dead

What needs to change: _sum() should filter or compact dead terms (vars == -1) before or after the .stack(). The challenge is that the mask varies per slice along the remaining dimensions (each effect has a different set of active contributors), so a simple pre-filter isn't possible — but post-stack compaction (dropping vars == -1 terms) or a per-slice approach could work.
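A minimal sketch of the post-stack compaction idea, written against plain NumPy arrays rather than linopy's internal Dataset. It assumes the term axis is the last axis and that -1 marks a dead term; compact_terms is a hypothetical helper, not an existing linopy function.

```python
import numpy as np


def compact_terms(vars_, coeffs, fill=-1):
    """Move live terms to the front of the last axis and trim the dead tail."""
    dead = vars_ == fill
    # Stable argsort on the boolean mask: live entries (False) sort first
    # and keep their original relative order.
    order = np.argsort(dead, axis=-1, kind="stable")
    vars_sorted = np.take_along_axis(vars_, order, axis=-1)
    coeffs_sorted = np.take_along_axis(coeffs, order, axis=-1)
    # The slice with the most live terms decides how many slots must remain.
    n_keep = int((~dead).sum(axis=-1).max()) if dead.size else 0
    return vars_sorted[..., :n_keep], coeffs_sorted[..., :n_keep]


# Same shape as the reproducer after summing: 20 effects, 300 term slots,
# 30 live terms per effect at different positions.
rng = np.random.default_rng(42)
n_effect, n_contrib, n_active = 20, 300, 30
vars_ = np.full((n_effect, n_contrib), -1, dtype=np.int64)
coeffs = np.full((n_effect, n_contrib), np.nan)
for e in range(n_effect):
    live = rng.choice(n_contrib, n_active, replace=False)
    vars_[e, live] = e * n_contrib + live
    coeffs[e, live] = 1.0

compact_vars, compact_coeffs = compact_terms(vars_, coeffs)
print(compact_vars.shape)  # (20, 30)
```

Because the mask differs per effect, the sort runs independently along each slice. If slices had different live counts, the shorter ones would keep some -1 padding up to the widest slice.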

Neither PR #12 nor #13 addresses this issue.


Profiling details

Branch feature/memory-usage-issues contains:

  • dev-scripts/MEMORY_OPTIMIZATION_STORIES.md — use case descriptions
  • dev-scripts/story{1,2,3,4}.py — reproducers with tracemalloc
  • dev-scripts/story{1,2,3,4}_profile.md — scalene + tracemalloc findings
