[KVCache] Support kv cache scale load #4624

Sunny-bot1 · 2025-10-28T09:47:30Z

Motivation

Support loading static scales for the FP8 KV cache (Cfp8).

Modifications

Add load_cache_scale, used when kv_cache_quant_type == "float8_e4m3fn".

Usage or Command

Put kv_cache_scale.json in model_path.
Add a quantization_config with kv_cache_quant_type set to "float8_e4m3fn", for example:

    "quantization_config":{
        "dense_quant_type":"block_wise_fp8",
        "moe_quant_type":"block_wise_fp8",
        "kv_cache_quant_type":"float8_e4m3fn",
        "quantization":"mix_quant"
    }
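The two setup steps above can be checked with a small pre-flight script before starting the server. This is only a sketch: the function name `load_static_kv_cache_scales` is hypothetical and is not the `load_cache_scale` added by this PR, the PR does not describe the layout of kv_cache_scale.json so the file is simply parsed as generic JSON, and the quantization_config is assumed to be available as an already parsed dict.

```python
import json
from pathlib import Path


def load_static_kv_cache_scales(model_path, quantization_config):
    """Hypothetical helper: return the parsed kv_cache_scale.json, or None.

    Returns None when kv_cache_quant_type is not "float8_e4m3fn", since the
    PR only describes static scale loading for that setting.
    """
    if quantization_config.get("kv_cache_quant_type") != "float8_e4m3fn":
        return None

    # The PR states that the file lives directly in model_path.
    scale_file = Path(model_path) / "kv_cache_scale.json"
    if not scale_file.is_file():
        raise FileNotFoundError(f"expected static KV cache scales at {scale_file}")

    # The schema of the file is not documented in the PR, so no keys are
    # assumed here; the content is returned as parsed JSON.
    with scale_file.open(encoding="utf-8") as f:
        return json.load(f)
```

Calling the helper with the model directory and the quantization_config shown above either returns the parsed scales or fails early with a clear error if the file is missing.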

Accuracy Tests

Checklist

Add at least one tag to the PR title.
- Tag list: [[FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
- You can add new tags based on the PR content, but their semantics must be clear.
Format your code and run pre-commit before committing.
Add unit tests. If no unit tests are added, state the reason in this PR.
Provide accuracy results.
If the current PR targets the release branch, make sure it has first been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

paddle-bot · 2025-10-28T09:47:36Z

Thanks for your contribution!

cache scale load

8db7346

Merge branch 'develop' into kv_scale_dev

5555287

YuanRisheng added the skip-ci: coverage label Oct 31, 2025

zhoutianzi666 approved these changes Oct 31, 2025


zhoutianzi666 merged commit 9b18f0b into PaddlePaddle:develop Oct 31, 2025
36 of 38 checks passed

3 participants
