功能建议：在调用大模型时支持关闭模型的思考（think）过程

### Description / 描述

功能建议：在调用大模型时支持关闭模型的思考（think）过程

问题描述
我通过 Ollama 部署并调用 qwen3.5 模型时，在模型配置中明确添加了 think: false 配置项，期望关闭模型的思考过程以减少 Token 消耗、提升响应速度。
但实际调用后发现，模型依然会花费大量时间执行思考逻辑，该配置未生效。经排查代码发现：
1.这里选择ollama为模型供应商，但发出的请求是openai格式的，所以导致think: false 配置项失效。
2.我使用脚本把openai格式的请求转化为ollama并添加think: false 配置项，可以正常交流使用，但是碰见工具调用就会卡住。

核心诉求
希望在调用大模型时，增加对 think 参数的透传支持：当用户配置 think: false 时，能让 Qwen3.5 模型跳过思考过程，直接生成最终回答，从而真正实现 Token 节省和响应提速。

复现步骤
基于 Ollama 部署 qwen3.5 :9b模型，并在配置中设置 think: false；
观察到模型仍执行思考逻辑，响应耗时无优化，Token 消耗未减少。

总结
核心问题：Ollama 的 OpenAI 兼容接口未透传 think 参数，导致 think: false 配置对 qwen3.5 模型无效；
功能诉求：在该兼容接口中增加 think 参数支持，实现关闭 qwen3.5 思考过程的能力；
预期收益：减少 Token 消耗，提升 qwen3.5 模型的响应速度。

<img width="1647" height="787" alt="Image" src="https://github.com/user-attachments/assets/525ef599-d152-4807-9a90-7c37947716f1" />

<img width="1115" height="555" alt="Image" src="https://github.com/user-attachments/assets/0c54b7ca-6d06-4c64-9c70-2f6ada9d1300" />

### Use Case / 使用场景

_No response_

### Willing to Submit PR? / 是否愿意提交PR？

- [ ] Yes, I am willing to submit a PR. / 是的，我愿意提交 PR。

### Code of Conduct

- [x] I have read and agree to abide by the project's [Code of Conduct](https://docs.github.com/zh/site-policy/github-terms/github-community-code-of-conduct). /


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

功能建议：在调用大模型时支持关闭模型的思考（think）过程 #5769

Description / 描述

Use Case / 使用场景

Willing to Submit PR? / 是否愿意提交PR？

Code of Conduct

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

功能建议：在调用大模型时支持关闭模型的思考（think）过程 #5769

Description

Description / 描述

Use Case / 使用场景

Willing to Submit PR? / 是否愿意提交PR？

Code of Conduct

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions