[BugFix] fix dynamic c8 in v1 loader #5562
Merged
yuanlehome merged 1 commit into PaddlePaddle:develop on Dec 15, 2025
Pull request overview
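This PR fixes a bug in how the v1 loader handles dynamic c8 (cache quantization). Specifically, when the block_wise_fp8 quantization type was used, the process_weights_after_loading method incorrectly tried to access scale parameters that do not exist.
Key changes
- Add a conditional check in the process_weights_after_loading method so that the cache scale parameters are processed only for non-block_wise quantization types
- This fix makes the v1 loader behave consistently with the v0 loader (the process_loaded_weights method)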
Comment on lines +266 to +270
if "block_wise" not in layer.cache_quant_type_str:
    if layer.cache_k_scale._is_initialized():
        layer.cache_k_out_scale.set_value(1 / layer.cache_k_scale)
    if layer.cache_v_scale._is_initialized():
        layer.cache_v_out_scale.set_value(1 / layer.cache_v_scale)
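The new conditional branch for the block_wise_fp8 quantization type lacks test coverage. Consider adding test cases to tests/quantization/test_kv_cache.py that verify:
- When cache_quant_type_str contains "block_wise", process_weights_after_loading does not try to access or set cache_k_out_scale and cache_v_out_scale
- When cache_quant_type_str does not contain "block_wise", the existing logic runs as before
This will ensure the fix stays correct through future code changes.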
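A minimal sketch of what such tests could look like. It does not use the real FastDeploy layer or the actual contents of tests/quantization/test_kv_cache.py: `FakeParam`, the standalone `process_weights_after_loading` function, and the `"cache_int8"` type string are hypothetical stand-ins that only mimic the calls appearing in the diff.

```python
from types import SimpleNamespace


class FakeParam:
    """Hypothetical stand-in for a framework parameter; mimics only the calls used in the diff."""

    def __init__(self, value=None):
        self.value = value

    def _is_initialized(self):
        return self.value is not None

    def set_value(self, value):
        self.value = value

    def __rtruediv__(self, numerator):
        # Supports the `1 / scale` expression from the diff.
        return numerator / self.value


def process_weights_after_loading(layer):
    # Copy of the guarded branch from the diff, applied to a stand-in layer.
    if "block_wise" not in layer.cache_quant_type_str:
        if layer.cache_k_scale._is_initialized():
            layer.cache_k_out_scale.set_value(1 / layer.cache_k_scale)
        if layer.cache_v_scale._is_initialized():
            layer.cache_v_out_scale.set_value(1 / layer.cache_v_scale)


def test_block_wise_skips_scales():
    # The layer has no scale attributes at all, so touching any of them
    # would raise AttributeError; the guard must prevent that.
    layer = SimpleNamespace(cache_quant_type_str="block_wise_fp8")
    process_weights_after_loading(layer)


def test_non_block_wise_sets_out_scales():
    layer = SimpleNamespace(
        cache_quant_type_str="cache_int8",  # assumed example of a non-block_wise type
        cache_k_scale=FakeParam(4.0),
        cache_v_scale=FakeParam(),  # deliberately left uninitialized
        cache_k_out_scale=FakeParam(),
        cache_v_out_scale=FakeParam(),
    )
    process_weights_after_loading(layer)
    assert layer.cache_k_out_scale.value == 0.25  # reciprocal of 4.0
    assert layer.cache_v_out_scale.value is None  # skipped: scale not initialized
```

Leaving the scale attributes off the block_wise stand-in makes the first test fail loudly if the guard is ever removed. A real test would instead build the actual attention layer with the project's own fixtures.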
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## develop #5562 +/- ##
==========================================
Coverage ? 60.87%
==========================================
Files ? 329
Lines ? 41161
Branches ? 6275
==========================================
Hits ? 25055
Misses ? 14213
Partials ? 1893
yuanlehome added a commit that referenced this pull request on Dec 15, 2025
Motivation
fix dynamic c8 in v1 loader
Modifications
Usage or Command
Accuracy Tests
Checklist
[FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
pre-commit before commit.
release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.