
chore: add tests to 1941 #1959

Merged
akoumpa merged 5 commits into main from huiyingl/pr-1941-cover-extract-layers
Apr 21, 2026

@HuiyingLi
Contributor

What does this PR do?

Add a one line overview of what this PR aims to accomplish.

Changelog

  • Add specific line by line info of high level changes in this PR.

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?

If you haven't finished some of the above items, you can still open a "Draft" PR.

Additional Information

  • Related to # (issue)

…ividual layers

_reduce_attrs returns ModuleList objects as single items; extending layers
with them meant AC code never found self_attn/mlp on a ModuleList and
silently skipped all checkpointing. Flatten any ModuleList results so
layers contains individual decoder layers, matching the heuristic path.

Signed-off-by: khazic <khazzz1c@gmail.com>
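
The failure mode described in this commit message can be reproduced without the real model code. The sketch below is torch-free and illustrative only: `ModuleList`, `DecoderLayer`, `extend_layers`, and `checkpointable` are hypothetical stand-ins (the PR's actual helper is `_extend_layers`, whose body is not shown on this page), and the `hasattr` check merely approximates how the AC code looks for `self_attn`/`mlp`.

```python
class DecoderLayer:
    """Stand-in for a decoder layer exposing the attributes AC looks for."""

    def __init__(self):
        self.self_attn = object()
        self.mlp = object()


class ModuleList(list):
    """Stand-in for torch.nn.ModuleList: an iterable container of layers."""


def extend_layers(layers, found):
    """Hypothetical flattening helper: unpack containers, keep other items whole."""
    for item in found:
        if isinstance(item, ModuleList):
            layers.extend(item)   # add each contained layer individually
        else:
            layers.append(item)   # a non-ModuleList element stays a single entry


def checkpointable(layers):
    """Entries on which an AC-style pass would find both self_attn and mlp."""
    return [m for m in layers if hasattr(m, "self_attn") and hasattr(m, "mlp")]


# A lookup that returns the container as one item, as described for _reduce_attrs.
found = [ModuleList(DecoderLayer() for _ in range(3))]

before = []
before.extend(found)          # old behaviour: the container is the only entry

after = []
extend_layers(after, found)   # flattened: three individual layers

print(len(checkpointable(before)), len(checkpointable(after)))  # prints: 0 3
```

With the container stored as a single entry, nothing matches and checkpointing is skipped without any error, which is why the problem went unnoticed.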
@copy-pr-bot

copy-pr-bot Bot commented Apr 21, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@HuiyingLi HuiyingLi changed the title Huiyingl/pr 1941 cover extract layers chore: add tests to 1941 Apr 21, 2026
codecov/patch flagged #1941 at 14.28% (1/7 diff lines hit): every existing
test mocks _extract_model_layers, so the new _extend_layers helper and the
two modified call sites were unexecuted. Add six tests over the real
function covering: class-keyed single FQN (GPT2), string-keyed arm
(NemotronH name match), multi-FQN (Qwen2.5-VL), non-ModuleList element
kept as a single entry (ModuleDict post-PP-split shape), and both
ModuleList/ModuleDict fallback branches as regression guards.
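
Why mocking leaves the helper uncovered: once a function is patched, its body and everything it calls never run. The following is a minimal sketch using `unittest.mock`; `LayerTools`, `count_layers`, and the call counter are hypothetical stand-ins, not the repository's code.

```python
from unittest import mock


class LayerTools:
    """Hypothetical stand-in for the module under test."""

    helper_calls = 0

    @classmethod
    def _extend_layers(cls, layers, found):
        cls.helper_calls += 1          # counts real executions of the helper
        layers.extend(found)

    @classmethod
    def extract_model_layers(cls, model):
        layers = []
        cls._extend_layers(layers, model)
        return layers


def count_layers(model):
    """Caller of the kind that mocked tests exercise."""
    return len(LayerTools.extract_model_layers(model))


# Mocked test: the caller works, but the helper body never runs.
with mock.patch.object(LayerTools, "extract_model_layers", return_value=["a", "b"]):
    mocked_result = count_layers(["x"])
calls_when_mocked = LayerTools.helper_calls

# Unmocked test: the real function runs, so the helper is executed.
real_result = count_layers(["x", "y", "z"])
calls_when_real = LayerTools.helper_calls

print(mocked_result, calls_when_mocked, real_result, calls_when_real)  # prints: 2 0 3 1
```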

Uses Cls.__new__ + nn.Module.__init__ to produce instances whose
type(model) matches the exact class in MODEL_CLS_TO_LAYERS (identity
lookup — subclasses miss the dict) without HF's config-dependent
__init__.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: HuiyingLi <willwin.lee@gmail.com>
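
The instantiation trick in this commit message relies only on plain Python semantics, so it can be sketched without torch or HF. In the example below, `Module` stands in for `nn.Module`, `GPT2Like` for an HF model class with a config-dependent `__init__`, and the registry contents are illustrative values; all of these names are assumptions made for the sketch.

```python
class Module:
    """Stand-in for torch.nn.Module: a cheap __init__ with no arguments."""

    def __init__(self):
        self._modules = {}


class GPT2Like(Module):
    """Stand-in for a model class whose __init__ requires a config."""

    def __init__(self, config):
        super().__init__()
        self.hidden = config.hidden_size


class Subclass(GPT2Like):
    """A test-only subclass would NOT match an exact-class registry."""


# Hypothetical registry keyed by the exact class, mirroring the lookup described above.
MODEL_CLS_TO_LAYERS = {GPT2Like: ["transformer.h"]}


def bare_instance(cls):
    """Create an instance of cls while skipping cls.__init__."""
    obj = cls.__new__(cls)     # allocate without running the config-dependent __init__
    Module.__init__(obj)       # run only the base initialiser
    return obj


model = bare_instance(GPT2Like)
sub = bare_instance(Subclass)

print(MODEL_CLS_TO_LAYERS.get(type(model)))  # prints: ['transformer.h']
print(MODEL_CLS_TO_LAYERS.get(type(sub)))    # prints: None
print(hasattr(model, "hidden"))              # prints: False
```

Because the dictionary is keyed by the class object itself, only an instance whose `type()` is exactly the registered class hits the entry, which is why subclassing for test purposes would not exercise that branch.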
@HuiyingLi HuiyingLi force-pushed the huiyingl/pr-1941-cover-extract-layers branch from 5b2bafb to 390e3a8 Compare April 21, 2026 18:39
HuiyingLi and others added 2 commits April 21, 2026 11:40
Fix E741 and apply ruff format on the test file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: HuiyingLi <willwin.lee@gmail.com>
@HuiyingLi
Contributor Author

/claude review

claude[bot]
claude Bot previously approved these changes Apr 21, 2026
Contributor

@claude claude Bot left a comment


LGTM

@HuiyingLi
Contributor Author

/ok to test 523bd78

#1941 flattens each ModuleList returned from _reduce_attrs, so
_extract_model_layers now yields individual decoder modules instead
of the containing ModuleLists. Update the two fallback/None-safety
assertions added in #1859 to isinstance-check the inner nn.Linear.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: HuiyingLi <willwin.lee@gmail.com>
@HuiyingLi
Contributor Author

/ok to test ba3e9d7

@akoumpa akoumpa merged commit a966a5a into main Apr 21, 2026
57 checks passed
@akoumpa akoumpa deleted the huiyingl/pr-1941-cover-extract-layers branch April 21, 2026 22:41
linnanwang pushed a commit that referenced this pull request Apr 24, 2026
* fix: flatten ModuleList in _extract_model_layers so AC applies to individual layers

_reduce_attrs returns ModuleList objects as single items; extending layers
with them meant AC code never found self_attn/mlp on a ModuleList and
silently skipped all checkpointing. Flatten any ModuleList results so
layers contains individual decoder layers, matching the heuristic path.

Signed-off-by: khazic <khazzz1c@gmail.com>

* test: cover _extract_model_layers flatten branches

codecov/patch flagged #1941 at 14.28% (1/7 diff lines hit): every existing
test mocks _extract_model_layers, so the new _extend_layers helper and the
two modified call sites were unexecuted. Add six tests over the real
function covering: class-keyed single FQN (GPT2), string-keyed arm
(NemotronH name match), multi-FQN (Qwen2.5-VL), non-ModuleList element
kept as a single entry (ModuleDict post-PP-split shape), and both
ModuleList/ModuleDict fallback branches as regression guards.

Uses Cls.__new__ + nn.Module.__init__ to produce instances whose
type(model) matches the exact class in MODEL_CLS_TO_LAYERS (identity
lookup — subclasses miss the dict) without HF's config-dependent
__init__.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

* style: ruff format + rename ambiguous `l` loop var

Fix E741 and apply ruff format on the test file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

* test(qwen3_5): expect flattened per-layer modules after #1941

#1941 flattens each ModuleList returned from _reduce_attrs, so
_extract_model_layers now yields individual decoder modules instead
of the containing ModuleLists. Update the two fallback/None-safety
assertions added in #1859 to isinstance-check the inner nn.Linear.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

---------

Signed-off-by: khazic <khazzz1c@gmail.com>
Signed-off-by: HuiyingLi <willwin.lee@gmail.com>
Co-authored-by: khazic <khazzz1c@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
3 participants