[Feature] Add check health in FD #5534
Pull request overview
This PR adds health monitoring functionality for the token processor to detect and prevent hang situations. The implementation introduces a health check command that can be called externally to verify if the token processor is operating correctly.
Key changes:
- Added `healthy()` method to check token processor health status based on timestamp tracking
- Implemented timestamp monitoring before and after batch processing
- Added `check_health` command handler in the internal adapter
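
The timestamp approach in the list above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from this PR: the class name `TokenProcessorHealth`, the `process_batch` wrapper, and the injectable `clock` parameter are all hypothetical; only the `healthy()` method name and the 120-second default come from the PR summary.

```python
import time


class TokenProcessorHealth:
    """Hypothetical stand-in for the health-tracking part of a token processor."""

    def __init__(self, timeout_s=120.0, clock=time.monotonic):
        self.timeout_s = timeout_s
        self._clock = clock          # injectable so the logic can be tested
        self._batch_start = None     # stamped just before a batch is processed
        self._batch_end = None       # stamped just after the batch completes

    def process_batch(self, batch, handler):
        # Stamp the start, clear the end, run the work, then stamp the end.
        # If handler() never returns (or raises), the end stamp stays None.
        self._batch_start = self._clock()
        self._batch_end = None
        result = handler(batch)
        self._batch_end = self._clock()
        return result

    def healthy(self):
        # Idle, or the most recent batch finished: nothing can be hung.
        if self._batch_start is None or self._batch_end is not None:
            return True
        # A batch is in flight: healthy only while it is within the timeout.
        return (self._clock() - self._batch_start) <= self.timeout_s
```

In this sketch an idle processor always reports healthy, so a quiet service is not mistaken for a hung one; only a batch that started and has not finished within the timeout flips the result.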
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| fastdeploy/envs.py | Adds FD_TOKEN_PROCESSOR_HEALTH_TIMEOUT configuration with 120-second default |
| fastdeploy/output/token_processor.py | Implements health monitoring with timestamps and healthy() method to detect hung states |
| fastdeploy/splitwise/internal_adapter_utils.py | Adds check_health command handler that invokes the token processor's healthy() method |
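
As a rough illustration of the configuration row above, the timeout could be read from the environment along these lines. The variable name `FD_TOKEN_PROCESSOR_HEALTH_TIMEOUT` and the 120-second default come from the table; the helper `read_health_timeout` and its `environ` parameter are hypothetical, and the actual `fastdeploy/envs.py` may be organized differently.

```python
import os

# Default taken from the review summary: 120 seconds.
DEFAULT_HEALTH_TIMEOUT_S = 120


def read_health_timeout(environ=os.environ):
    """Return the health timeout in seconds (hypothetical helper)."""
    raw = environ.get("FD_TOKEN_PROCESSOR_HEALTH_TIMEOUT")
    if raw is None:
        return DEFAULT_HEALTH_TIMEOUT_S
    # A non-integer value raises ValueError, surfacing a bad configuration early.
    return int(raw)
```

For example, `read_health_timeout({"FD_TOKEN_PROCESSOR_HEALTH_TIMEOUT": "30"})` returns `30`, while an empty mapping falls back to the default.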
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## develop #5534 +/- ##
==========================================
Coverage ? 60.73%
==========================================
Files ? 329
Lines ? 41161
Branches ? 6274
==========================================
Hits ? 24998
Misses ? 14273
Partials ? 1890
Motivation
Guard against the token processor hanging by adding a liveness-probe interface that can detect it.
Modifications
Add a new service-level liveness check item: the token processor component.
Usage or Command
None
Accuracy Tests
None
Checklist
- PR title tags: `[FDConfig]`, `[APIServer]`, `[Engine]`, `[Scheduler]`, `[PD Disaggregation]`, `[Executor]`, `[Graph Optimization]`, `[Speculative Decoding]`, `[RL]`, `[Models]`, `[Quantization]`, `[Loader]`, `[OP]`, `[KVCache]`, `[DataProcessor]`, `[BugFix]`, `[Docs]`, `[CI]`, `[Optimization]`, `[Feature]`, `[Benchmark]`, `[Others]`, `[XPU]`, `[HPU]`, `[GCU]`, `[DCU]`, `[Iluvatar]`, `[Metax]`
- Run `pre-commit` before commit.
- If the PR targets the `release` branch, make sure the PR has been submitted to the `develop` branch, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.