[Log] Add trace log and add loggingInstrumentor tool #4692
Pull Request Overview
This PR adds comprehensive tracing and logging capabilities to the FastDeploy system for tracking request lifecycle events and improving observability.
- Introduces a new trace logging module with event-based logging across preprocessing, scheduling, inference, and postprocessing stages
- Adds OpenTelemetry logging instrumentation support to integrate with distributed tracing systems
- Implements a custom log formatter that supports structured attributes and OTEL span/trace ID injection
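The formatter behaviour summarised in the last bullet can be illustrated with a minimal, stdlib-only sketch. This is not the PR's `CustomFormatter`: the class name `AttributeFormatter`, the `attributes` record field, and the `key=value` output layout are assumptions made for illustration. The `otelTraceID` and `otelSpanID` field names are the ones OpenTelemetry's logging instrumentation adds to log records; treating `"0"` as "no active span" is likewise an assumption of this sketch.

```python
import logging


class AttributeFormatter(logging.Formatter):
    """Hypothetical formatter: appends OTel IDs and structured attributes."""

    def format(self, record):
        base = super().format(record)
        parts = []

        # Fields injected by OpenTelemetry logging instrumentation, if present.
        trace_id = getattr(record, "otelTraceID", None)
        span_id = getattr(record, "otelSpanID", None)
        if trace_id and trace_id != "0":
            parts.append(f"trace_id={trace_id}")
        if span_id and span_id != "0":
            parts.append(f"span_id={span_id}")

        # Expand a dict of structured attributes into key=value pairs.
        attributes = getattr(record, "attributes", None)
        if isinstance(attributes, dict):
            parts.extend(f"{key}={value}" for key, value in attributes.items())

        return f"{base} {' '.join(parts)}" if parts else base


def format_example():
    # Build a record by hand so the sketch needs no handler or OTel setup.
    record = logging.makeLogRecord(
        {
            "msg": "hello",
            "levelname": "INFO",
            "attributes": {"request_id": "r1"},
            "otelTraceID": "abc",
            "otelSpanID": "def",
        }
    )
    return AttributeFormatter("%(levelname)s %(message)s").format(record)
```

Reading the fields with `getattr` keeps the formatter usable whether or not tracing is enabled, since records without the OTel fields are simply formatted as plain messages.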
Reviewed Changes
Copilot reviewed 19 out of 21 changed files in this pull request and generated 19 comments.
| File | Description |
|---|---|
| fastdeploy/trace/trace_logger.py | New trace logging function that records events with request context |
| fastdeploy/trace/constants.py | Defines logging event names, stage names, and event-to-stage mappings |
| fastdeploy/logger/formatters.py | Adds CustomFormatter with attribute expansion and OTEL field support |
| fastdeploy/logger/logger.py | Implements get_trace_logger method for creating trace-specific loggers |
| fastdeploy/utils.py | Initializes global trace_logger instance |
| fastdeploy/metrics/trace_util.py | Integrates OpenTelemetry logging instrumentation |
| fastdeploy/output/token_processor.py | Adds trace logging for token generation events |
| fastdeploy/entrypoints/engine_client.py | Adds trace logging at preprocessing start |
| fastdeploy/engine/common_engine.py | Adds trace logging throughout scheduling and resource allocation |
| fastdeploy/entrypoints/openai/serving_completion.py | Adds trace logging for completion endpoint postprocessing end |
| fastdeploy/entrypoints/openai/serving_chat.py | Adds trace logging for chat endpoint postprocessing end |
| requirements*.txt | Adds opentelemetry-instrumentation-logging dependency |
| tests/trace/test_trace_logger.py | Tests for trace_print function |
| tests/trace/test_constants.py | Tests for event and stage enumerations |
| tests/output/test_token_processor_trace_print.py | Tests trace logging in token processor |
| tests/logger/test_logger.py | Tests for get_trace_logger method |
| tests/logger/test_formatters.py | Extensive tests for CustomFormatter and ColoredFormatter |
Comments suppressed due to low confidence (3)
fastdeploy/logger/formatters.py:59
- Except block directly handles BaseException.
except:
fastdeploy/logger/formatters.py:125
- Except block directly handles BaseException.
except:
fastdeploy/trace/trace_logger.py:23
- Except block directly handles BaseException.
except:
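The three suppressed comments flag the same pattern: a bare `except:` catches `BaseException`, which includes `SystemExit` and `KeyboardInterrupt`, not only ordinary errors. The sketch below shows the difference using hypothetical helper names; it is not code from the PR.

```python
def swallow_everything(action):
    """Bare except: also traps SystemExit and KeyboardInterrupt."""
    try:
        action()
    except:  # noqa: E722 (the pattern the review flags)
        return "swallowed"
    return "ok"


def swallow_errors_only(action):
    """except Exception: lets SystemExit and KeyboardInterrupt propagate."""
    try:
        action()
    except Exception:
        return "swallowed"
    return "ok"


def request_exit():
    # Stands in for code that asks the interpreter to shut down.
    raise SystemExit(1)


def exit_propagates():
    # Returns True when the exit request escapes the narrower handler.
    try:
        swallow_errors_only(request_exit)
    except SystemExit:
        return True
    return False
```

Narrowing the handler to `except Exception:` (or to the specific exception types expected) is the usual fix, so that shutdown requests are not silently discarded by logging code.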
Motivation
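The inference stage currently lacks fine-grained timing instrumentation, so the time distribution across its internal stages cannot be queried. FastDeploy inference therefore needs to be divided into finer stages, with log instrumentation points added.
In addition, the existing logging system has the following problems:
To address the inability to locate problems quickly, the OpenTelemetry LoggingInstrumentor tool is introduced to correlate logs with traces, improving the system's observability and debuggability.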
Modifications
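1. Add a Trace Logger
2. Add a custom Formatter
3. Introduce LoggingInstrumentor
4. FastDeploy stage division and instrumentation
Log instrumentation points are inserted at each major FastDeploy stage to support latency analysis and tracing.
Mapping of instrumentation events to stages:
5. Instrumentation utility implementation
To standardize and automate the recording of trace information in logs, the following core components are defined:
Core enumerations (Enums)
These enums define the standard instrumentation events and stages in the FastDeploy request-processing flow and are the basis for fine-grained tracing.
LoggingEventName:
StageName:
EVENT_TO_STAGE_MAP:
trace_logger print function (print)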
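The components named in this section can be sketched roughly as follows. The PR's actual definitions are not reproduced in this description, so everything beyond the names `LoggingEventName`, `StageName`, `EVENT_TO_STAGE_MAP`, and `trace_print` is an assumption: the enum members, the `trace_print` signature, the logger name, and the attribute keys are all hypothetical.

```python
import json
import logging
import time
from enum import Enum


class StageName(str, Enum):
    # Hypothetical members, loosely following the four stages in the overview.
    PREPROCESSING = "PREPROCESSING"
    SCHEDULE = "SCHEDULE"
    INFERENCE = "INFERENCE"
    POSTPROCESSING = "POSTPROCESSING"


class LoggingEventName(str, Enum):
    # Hypothetical events; the PR's real event list is not shown here.
    PREPROCESSING_START = "PREPROCESSING_START"
    SCHEDULE_START = "SCHEDULE_START"
    INFERENCE_START = "INFERENCE_START"
    POSTPROCESSING_END = "POSTPROCESSING_END"


# Each event belongs to exactly one stage.
EVENT_TO_STAGE_MAP = {
    LoggingEventName.PREPROCESSING_START: StageName.PREPROCESSING,
    LoggingEventName.SCHEDULE_START: StageName.SCHEDULE,
    LoggingEventName.INFERENCE_START: StageName.INFERENCE,
    LoggingEventName.POSTPROCESSING_END: StageName.POSTPROCESSING,
}

# Assumed logger name; the PR creates its logger via get_trace_logger.
trace_logger = logging.getLogger("trace_sketch")


def trace_print(event, request_id, user_id=""):
    """Log one instrumentation event with its request context (assumed signature)."""
    attributes = {
        "request_id": request_id,
        "user_id": user_id,
        "event": event.value,
        "stage": EVENT_TO_STAGE_MAP[event].value,
        "timestamp_ms": int(time.time() * 1000),
    }
    trace_logger.info("trace event: %s", json.dumps(attributes))
    # Returning the attributes is a convenience of this sketch for testing.
    return attributes
```

Deriving the stage from the event through a single map keeps call sites to one argument per event and makes it easy to check that every event has a stage.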
Usage or Command
Instrumentation example:
Accuracy Tests
Log output example (trace disabled):
Log output example (trace enabled):
Checklist
- PR title tags: [FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
- Run `pre-commit` before commit.
- For a PR to the `release` branch, make sure the PR has been submitted to the `develop` branch, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.