fix: improve weight offloading to handle plain tensor attrs and use to_empty() by quic-rishinr · Pull Request #952 · quic/efficient-transformers

quic-rishinr · 2026-04-28T15:58:25Z

fix: improve weight offloading to handle plain tensor attrs and use to_empty()

Replace manual storage resizing with to_empty(device="meta") for
parameters/buffers and explicitly handle plain tensor attributes (e.g.
stacked expert weights in MoE models) that are not registered as
parameters or buffers. This ensures all tensors are properly moved to
the meta device, reducing memory usage after ONNX export.

Add unit tests for plain tensor attribute clearing

Signed-off-by: Rishin Raj <rishinr@qti.qualcomm.com>
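
For readers unfamiliar with the approach described above, here is a minimal sketch of the idea, not the code from this PR. The names `ToyExperts`, `stacked_w`, and `offload_weights` are hypothetical and exist only for illustration. The sketch assumes that `to_empty(device="meta")` handles registered parameters and buffers, and that plain tensor attributes have to be found and replaced separately.

```python
import torch
from torch import nn


class ToyExperts(nn.Module):
    """Hypothetical MoE-style block with a parameter, a buffer,
    and a plain tensor attribute that is registered as neither."""

    def __init__(self, num_experts: int = 4, hidden: int = 8):
        super().__init__()
        self.gate = nn.Linear(hidden, num_experts)
        self.register_buffer("scale", torch.ones(hidden))
        # Plain attribute: not returned by parameters() or buffers().
        self.stacked_w = torch.randn(num_experts, hidden, hidden)


def offload_weights(model: nn.Module) -> nn.Module:
    """Illustrative helper (an assumption, not the PR's function)."""
    # Registered parameters and buffers are re-created on the meta
    # device without copying their data.
    model.to_empty(device="meta")

    # Plain tensor attributes live in each module's __dict__ and are
    # skipped by to_empty(), so replace them explicitly.
    for module in model.modules():
        for name, value in list(vars(module).items()):
            if isinstance(value, torch.Tensor) and not value.is_meta:
                setattr(module, name, torch.empty_like(value, device="meta"))
    return model


model = offload_weights(ToyExperts())
print(model.gate.weight.device, model.scale.device, model.stacked_w.device)
```

After the call, all three kinds of tensor keep their shape and dtype but no longer hold real storage, which is the memory saving the description refers to.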

quic-rishinr requested review from abhishek-singh591 and ochougul April 28, 2026 15:58

Commit 31ee8a3: fix: improve weight offloading to handle plain tensor attrs and use to_empty()

quic-rishinr force-pushed the mem_optim_v2 branch from 67e71d9 to 31ee8a3 Compare April 29, 2026 09:33

quic-rishinr wants to merge 1 commit into quic:main from quic-rishinr:mem_optim_v2
