[Question] 聊天界面LLM输出比直接调用输出慢很多

### What happened? / 实际发生了什么？

在聊天的界面的流式输出情况下，同一个模型每秒输出的token数比直接用python的openai库调用慢很多，目测这个webui输出的速度是直接调用的一半到三分之一。为什么？

### Expected behavior / 预期行为

每秒token数和直接openai.OpenAI.chat.completions.create的流式输出速度应该相当。

### Steps to reproduce / 复现步骤

直接聊天……

### Desktop version / 桌面端版本

v4.23.1

### Installation channel / 安装来源

GitHub Release installer

### OS / 操作系统

Windows

### Architecture / 架构

amd64

### Upstream AstrBot ref used by desktop build (optional) / 桌面构建使用的上游 AstrBot Ref（可选）

_No response_

### Logs, screenshots, and additional context / 日志、截图与补充信息

无……

### Willing to submit a PR? / 是否愿意提交 PR？

- [ ] I am willing to submit a PR to fix this issue. / 我愿意提交 PR 修复此问题。

### Code of Conduct

- [x] I agree to follow the project's Code of Conduct. / 我同意遵守项目行为准则。

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Question] 聊天界面LLM输出比直接调用输出慢很多 #120

What happened? / 实际发生了什么？

Expected behavior / 预期行为

Steps to reproduce / 复现步骤

Desktop version / 桌面端版本

Installation channel / 安装来源

OS / 操作系统

Architecture / 架构

Upstream AstrBot ref used by desktop build (optional) / 桌面构建使用的上游 AstrBot Ref（可选）

Logs, screenshots, and additional context / 日志、截图与补充信息

Willing to submit a PR? / 是否愿意提交 PR？

Code of Conduct

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[Question] 聊天界面LLM输出比直接调用输出慢很多 #120

Description

What happened? / 实际发生了什么？

Expected behavior / 预期行为

Steps to reproduce / 复现步骤

Desktop version / 桌面端版本

Installation channel / 安装来源

OS / 操作系统

Architecture / 架构

Upstream AstrBot ref used by desktop build (optional) / 桌面构建使用的上游 AstrBot Ref（可选）

Logs, screenshots, and additional context / 日志、截图与补充信息

Willing to submit a PR? / 是否愿意提交 PR？

Code of Conduct

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions