[Fearture] Support cache kv cache for output tokens by rainyfly · Pull Request #4535 · PaddlePaddle/FastDeploy

rainyfly · 2025-10-22T03:19:46Z

Motivation

When prefix caching is enabled, support caching the KV cache of output tokens as well.

Modifications

Enable caching of output tokens by default when prefix caching is enabled.

Usage or Command

How to enable:
--enable-output-caching

How to disable:
--no-enable-output-caching
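A paired `--flag` / `--no-flag` switch like the one above can be produced with `argparse.BooleanOptionalAction` (Python 3.9+). The following is only a minimal sketch of that pattern; the parser and help text are hypothetical, and FastDeploy's actual argument definition in `fastdeploy/engine/args_utils.py` may be written differently.

```python
import argparse

# Hypothetical parser for illustration; not FastDeploy's real argument setup.
parser = argparse.ArgumentParser()
parser.add_argument(
    "--enable-output-caching",
    action=argparse.BooleanOptionalAction,  # also registers --no-enable-output-caching
    default=True,  # matches the PR's stated default
    help="Cache KV blocks of generated tokens when prefix caching is on.",
)

default_args = parser.parse_args([])
enabled_args = parser.parse_args(["--enable-output-caching"])
disabled_args = parser.parse_args(["--no-enable-output-caching"])

print(default_args.enable_output_caching)
print(enabled_args.enable_output_caching)
print(disabled_args.enable_output_caching)
```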

Accuracy Tests

None

Checklist

None

paddle-bot · 2025-10-22T03:19:56Z

Thanks for your contribution!


Copilot

Pull request overview

This pull request adds support for caching KV cache for output tokens when prefix caching is enabled in the V1 scheduler. The feature aims to improve cache efficiency by allowing the system to cache generated output tokens in addition to input prompt tokens.
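To illustrate why caching output tokens can raise the prefix-cache hit rate, here is a toy model of a multi-turn conversation in which the second request replays the first prompt and its generated answer. Everything in it is an assumption made for illustration (the block size, the chained SHA-256 keys, the helper names); it does not reproduce FastDeploy's cache manager.

```python
import hashlib

BLOCK_SIZE = 4  # illustrative; real block sizes are typically larger


def block_hashes(token_ids):
    """Chain-hash each full block so a block's key depends on its whole prefix."""
    hashes, prev = [], ""
    full = len(token_ids) - len(token_ids) % BLOCK_SIZE
    for start in range(0, full, BLOCK_SIZE):
        block = token_ids[start:start + BLOCK_SIZE]
        payload = prev + "," + ",".join(map(str, block))
        prev = hashlib.sha256(payload.encode()).hexdigest()
        hashes.append(prev)
    return hashes


def matched_blocks(cache, token_ids):
    """Count leading full blocks of token_ids that are already cached."""
    count = 0
    for h in block_hashes(token_ids):
        if h not in cache:
            break
        count += 1
    return count


prompt = list(range(8))          # turn 1 prompt: 2 full blocks
output = list(range(100, 108))   # turn 1 generated tokens: 2 full blocks
next_prompt = prompt + output + [200, 201, 202, 203]  # turn 2 replays the history

input_only_cache = set(block_hashes(prompt))
with_output_cache = set(block_hashes(prompt + output))

hits_input_only = matched_blocks(input_only_cache, next_prompt)
hits_with_output = matched_blocks(with_output_cache, next_prompt)
print(hits_input_only, hits_with_output)
```

In this toy setup the second turn reuses two blocks when only the prompt was cached and four blocks when the generated tokens were cached too.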

Key Changes:

Added enable_output_caching configuration flag to control output token caching behavior
Implemented automatic caching of output tokens at block boundaries in the token processor
Added test coverage for the new caching functionality
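The "caching at block boundaries" change listed above can be pictured with the sketch below. Only the method name `cache_output_tokens` comes from the PR summary; its signature, the `ToyResourceManager` class, the `on_token_generated` hook, and the block size are hypothetical stand-ins rather than the actual code in `token_processor.py` or `resource_manager_v1.py`.

```python
BLOCK_SIZE = 4  # illustrative value


class ToyResourceManager:
    """Hypothetical stand-in that records the token count at each caching call."""

    def __init__(self):
        self.calls = []

    def cache_output_tokens(self, num_tokens):
        # Assumed signature; a real implementation would update cache blocks here.
        self.calls.append(num_tokens)


def on_token_generated(manager, prompt_len, num_generated, enable_output_caching=True):
    """Trigger caching only when prompt plus output tokens exactly fill a block."""
    total = prompt_len + num_generated
    if enable_output_caching and total % BLOCK_SIZE == 0:
        manager.cache_output_tokens(total)


manager = ToyResourceManager()
for step in range(1, 8):  # generate 7 tokens after a 6-token prompt
    on_token_generated(manager, prompt_len=6, num_generated=step)
print(manager.calls)

disabled = ToyResourceManager()
for step in range(1, 8):
    on_token_generated(disabled, prompt_len=6, num_generated=step, enable_output_caching=False)
print(disabled.calls)
```

Checking the boundary on every generated token keeps partially filled blocks out of the cache, so only complete blocks become reusable prefixes.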

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.

Show a summary per file

| File | Description |
| --- | --- |
| `fastdeploy/engine/args_utils.py` | Added `enable_output_caching` CLI argument and configuration field (default: True) |
| `fastdeploy/config.py` | Added `enable_output_caching` field to `CacheConfig` class with documentation |
| `fastdeploy/engine/sched/resource_manager_v1.py` | Implemented `cache_output_tokens()` method to update cache blocks for output tokens |
| `fastdeploy/output/token_processor.py` | Integrated output caching logic to automatically cache tokens at block boundaries |
| `tests/v1/test_schedule_output.py` | Added `test_caching_output()` test case to verify output token caching behavior |
| `tests/output/test_process_batch_output.py` | Updated mock `CacheConfig` class to include new caching configuration fields |
| `tests/output/test_get_save_output_v1.py` | Updated mock `CacheConfig` class to include new caching configuration fields |

Important Notes:

PR Title Issue: The title contains a spelling error - "Fearture" should be "Feature"
Code Issues Found: Several bugs were identified in the implementation and test code, including inconsistent flag checking and incorrect return value handling in tests

fastdeploy/engine/sched/resource_manager_v1.py

tests/v1/test_schedule_output.py

fastdeploy/config.py

tests/v1/test_schedule_output.py

codecov-commenter · 2025-12-02T12:07:21Z

Codecov Report

❌ Patch coverage is 63.63636% with 4 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@209006e).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| fastdeploy/output/token_processor.py | 0.00% | 2 Missing and 1 partial ⚠️ |
| fastdeploy/engine/sched/resource_manager_v1.py | 75.00% | 0 Missing and 1 partial ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             develop    #4535   +/-   ##
==========================================
  Coverage           ?   58.60%           
==========================================
  Files              ?      325           
  Lines              ?    40283           
  Branches           ?     6100           
==========================================
  Hits               ?    23606           
  Misses             ?    14792           
  Partials           ?     1885

| Flag | Coverage Δ |
| --- | --- |
| GPU | `58.60% <63.63%> (?)` |
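The figures in the report can be cross-checked with a little arithmetic. The sketch below uses only numbers copied from the report. The total number of changed lines is not stated there, so the value it derives is an inference from the reported percentage and uncovered-line count.

```python
# Figures copied from the Codecov report above.
hits, misses, partials, lines = 23606, 14792, 1885, 40283
patch_pct, patch_uncovered = 63.63636, 4

# Hits, misses, and partials should add up to the total line count.
assert hits + misses + partials == lines

# Project coverage is consistent with hits / lines.
project_pct = round(100 * hits / lines, 2)

# Inferred, not reported: total changed lines implied by the patch percentage.
patch_lines = round(patch_uncovered / (1 - patch_pct / 100))
patch_covered = patch_lines - patch_uncovered

print(project_pct, patch_lines, patch_covered)
```

Under these assumptions the patch touches 11 measurable lines, 7 of which are covered, which agrees with 7 / 11 ≈ 63.636%.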


[Fearture] Support cache kv cache for output tokens

6189fea

rainyfly added 6 commits November 3, 2025 19:40

Merge branch 'develop' of https://github.com/PaddlePaddle/FastDeploy …

c2deecb

…into support_cache_output

fix bug

a937385

fix ci bug

c5e74ae

improve coverage

6ef2789

enable output caching by default

a45e16e

Merge branch 'develop' of https://github.com/PaddlePaddle/FastDeploy …

c6a1dba

…into support_cache_output

Copilot AI review requested due to automatic review settings December 2, 2025 10:52

Copilot started reviewing on behalf of rainyfly December 2, 2025 10:52

Copilot finished reviewing on behalf of rainyfly December 2, 2025 10:55

Copilot AI reviewed Dec 2, 2025


rainyfly and others added 2 commits December 3, 2025 11:44

fix ci

f6075e5

Merge branch 'develop' into support_cache_output

e68524a

Jiang-Jia-Jun approved these changes Dec 4, 2025


Jiang-Jia-Jun merged commit 3878a99 into PaddlePaddle:develop Dec 4, 2025
15 of 19 checks passed

Jiang-Jia-Jun merged 9 commits into PaddlePaddle:develop from rainyfly:support_cache_output
