Two core memory issues in linopy's expression system
Profiling and reproducers on branch feature/memory-usage-issues (see dev-scripts/).
Issue 1: merge() creates dense Cartesian product for disjoint dimensions
What happens: Adding two expressions with different dimension names (e.g. (node, line, time) + (vehicle, route, time)) calls merge() → xr.concat(..., join="outer"), which broadcasts to the full Cartesian product of all dimensions.
Where in the code: expressions.py merge() function, line ~2138: kwargs.setdefault("join", "outer"), then line ~2141: xr.concat(...).
Impact: Two expressions with 150K + 80K elements produce a 600M-element dense array (520x blowup). Profiled at 18 GB peak from 13 MB input.
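The broadcast can be reproduced with plain xarray, independent of linopy (a standalone sketch; the toy arrays and sizes here are illustrative, matching the reproducer below minus the shared time axis):

```python
import numpy as np
import xarray as xr

# Two arrays over disjoint dimensions, standing in for the two expressions.
a = xr.DataArray(np.ones((50, 30)), dims=["node", "line"])
b = xr.DataArray(np.ones((40, 20)), dims=["vehicle", "route"])

# Concatenating along a new term axis broadcasts both inputs to the
# union of all four dimensions: 2 * 50 * 30 * 40 * 20 = 2.4M dense
# elements from 1.5K + 0.8K inputs.
merged = xr.concat([a, b], dim="_term")
print(dict(merged.sizes))
```

The dense intermediate grows with the product of all dimension sizes, while the inputs only grow with their sum.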
Reproducer (dev-scripts/story2.py):
import linopy
m = linopy.Model()
x = m.add_variables(coords=[range(50), range(30), range(100)],
                    dims=["node", "line", "time"], name="x")
y = m.add_variables(coords=[range(40), range(20), range(100)],
                    dims=["vehicle", "route", "time"], name="y")
total = 2 * x + 3 * y # → 18 GB dense array, 99.96% fill values
What needs to change: merge() (or __add__) should detect when expressions have disjoint coordinate dimensions and avoid the dense cross-product. The solver only needs sparse (var_id, coeff) pairs — the dense intermediate serves no purpose.
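A hypothetical detection helper (not linopy API; names are illustrative) shows how cheap the check is: the blowup occurs exactly when each side carries dimensions the other lacks, and each side's densification factor is the product of the foreign dimension sizes:

```python
import math

def cross_product_blowup(left_sizes, right_sizes):
    """Per-side densification factor an outer-join merge would cause.

    left_sizes / right_sizes: dim-name -> length mappings (e.g. expr.sizes).
    Each side gets broadcast over the dims only the other side has; a
    factor of 1 on both sides means the merge adds no dense padding.
    """
    only_left = {d: n for d, n in left_sizes.items() if d not in right_sizes}
    only_right = {d: n for d, n in right_sizes.items() if d not in left_sizes}
    return math.prod(only_right.values()), math.prod(only_left.values())

# Sizes from the reproducer above: x is broadcast 800x, y 1500x.
print(cross_product_blowup(
    {"node": 50, "line": 30, "time": 100},
    {"vehicle": 40, "route": 20, "time": 100},
))
```

Such a check could gate a sparse fast path in `merge()` while leaving the existing dense path untouched for expressions that genuinely share all dimensions.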
PRs #12 and #13 both address this with lazy/deferred expression containers. Key remaining gap: add_constraints() still forces materialization.
Issue 2: _sum() stacks dead terms from masked variables
What happens: A variable created with a sparse mask has labels = -1 for inactive positions. When .sum(dim) is called, it stacks the entire dimension into _term via .stack() — including all masked-out entries where vars == -1. These dead terms then propagate through every downstream operation.
Where in the code: expressions.py _sum(), lines ~1172-1176:
ds = (
    data[["coeffs", "vars"]]
    .reset_index(dim, drop=True)
    .rename({TERM_DIM: STACKED_TERM_DIM})
    .stack({TERM_DIM: [STACKED_TERM_DIM] + dim}, create_index=False)
)
This .stack() blindly includes all entries along dim, regardless of whether vars == -1.
Impact: With 300 contributors, 20 effects, and 30 active contributors per effect, _term becomes 300 instead of 30, i.e. 90% dead terms. Every downstream op (-, merge, add_constraints) pays for that waste. At real-world scale (277 contributors, 22 effects, 2190 timesteps) this contributed to a 44+ GB out-of-memory failure.
Reproducer (dev-scripts/story1.py):
import numpy as np, xarray as xr, linopy
m = linopy.Model()
mask = xr.DataArray(np.zeros((300, 20), dtype=bool), dims=["contributor", "effect"])
rng = np.random.default_rng(42)
for e in range(20):
    mask.values[rng.choice(300, 30, replace=False), e] = True
var = m.add_variables(coords=[range(300), range(20), range(2000)],
                      dims=["contributor", "effect", "time"],
                      name="share", mask=mask)
expr = var.sum("contributor")
print(expr.sizes["_term"]) # 300 — should be ~30
print((expr.data.vars.values == -1).mean()) # 90% dead
What needs to change: _sum() should filter or compact dead terms (vars == -1) before or after the .stack(). The challenge is that the mask varies per slice along the remaining dimensions (each effect has a different set of active contributors), so a simple pre-filter isn't possible — but post-stack compaction (dropping vars == -1 terms) or a per-slice approach could work.
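A minimal sketch of the post-stack compaction idea, on raw numpy arrays (hypothetical helper, not linopy API): left-pack the live entries per slice, then truncate the term axis to the widest live count. This copes with the per-slice mask variation, because each slice keeps its own live terms; slices with fewer live terms retain -1 padding only up to the common width:

```python
import numpy as np

def compact_dead_terms(coeffs, vars_, fill=-1):
    """Left-pack live terms (vars_ != fill) to the front of the last
    (term) axis per slice, then truncate to the largest live count
    found in any slice."""
    dead = vars_ == fill
    # Stable sort on the dead mask: live entries (False) move to the
    # front of the term axis, preserving their relative order.
    order = np.argsort(dead, axis=-1, kind="stable")
    vars_packed = np.take_along_axis(vars_, order, axis=-1)
    coeffs_packed = np.take_along_axis(coeffs, order, axis=-1)
    keep = int((~dead).sum(axis=-1).max())  # widest live slice
    return coeffs_packed[..., :keep], vars_packed[..., :keep]

# Two slices with different live patterns along a 4-wide term axis.
v = np.array([[1, -1, 2, -1],
              [-1, 3, -1, -1]])
c = np.array([[0.5, 0.0, 1.5, 0.0],
              [0.0, 2.0, 0.0, 0.0]])
c2, v2 = compact_dead_terms(c, v)
print(v2)  # term axis shrinks from 4 to 2
```

The residual -1 entries in narrower slices are ordinary masked terms, which downstream operations already tolerate; the win is that the term axis shrinks from the full dimension length to the maximum live count.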
Neither PR #12 nor #13 addresses this issue.
Profiling details
Branch feature/memory-usage-issues contains:
dev-scripts/MEMORY_OPTIMIZATION_STORIES.md — use case descriptions
dev-scripts/story{1,2,3,4}.py — reproducers with tracemalloc
dev-scripts/story{1,2,3,4}_profile.md — scalene + tracemalloc findings