
Copilot AI commented Dec 9, 2025

Motivation

Cherry-pick of the fix from PR #5469. The limit_thinking kernels used AND (&&) in their early-return condition, so a thread returned only when current_limit_think_status matched the checked value and stop_flags[bid] was also set. As a result, the kernel did not return early when only one of the two conditions held.

Modifications

Changed condition logic from AND to OR in 4 CUDA kernel files:

  • custom_ops/gpu_ops/limit_thinking_content_length_v1.cu
  • custom_ops/gpu_ops/limit_thinking_content_length_v2.cu
  • custom_ops/gpu_ops/speculate_decoding/speculate_limit_thinking_content_length_v1.cu
  • custom_ops/gpu_ops/speculate_decoding/speculate_limit_thinking_content_length_v2.cu

Before:

if (current_limit_think_status == 2 && stop_flags[bid]) {
    return;  // Only returns if BOTH conditions true
}

After:

if (current_limit_think_status == 2 || stop_flags[bid]) {
    return;  // Returns if EITHER condition true
}

Updated Chinese comments from "且" (AND) to "或者" (OR) to match corrected logic.
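
To make the effect of the change concrete, the following is a minimal host-side C++ sketch, not the kernel code itself. The helper names (should_skip_before, should_skip_after, count_processed) and the constant kDoneStatus are hypothetical and introduced only for illustration; the value 2 is taken from the snippet above, and the PR does not state what that status means.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: 2 is the status value checked in the snippet above.
// Its meaning is not described in this PR.
constexpr int kDoneStatus = 2;

// Pre-fix predicate: skip only when BOTH conditions hold.
inline bool should_skip_before(int current_limit_think_status, bool stop_flag) {
    return current_limit_think_status == kDoneStatus && stop_flag;
}

// Post-fix predicate: skip when EITHER condition holds.
inline bool should_skip_after(int current_limit_think_status, bool stop_flag) {
    return current_limit_think_status == kDoneStatus || stop_flag;
}

// Hypothetical stand-in for a launch over the batch: counts the entries
// that would NOT return early, i.e. that would still be processed.
inline int count_processed(const std::vector<int>& status,
                           const std::vector<bool>& stop_flags,
                           bool use_fixed_predicate) {
    assert(status.size() == stop_flags.size());
    int processed = 0;
    for (std::size_t bid = 0; bid < status.size(); ++bid) {
        const bool skip = use_fixed_predicate
            ? should_skip_after(status[bid], stop_flags[bid])
            : should_skip_before(status[bid], stop_flags[bid]);
        if (!skip) {
            ++processed;
        }
    }
    return processed;
}
```

For a batch with statuses {0, 2, 0, 2} and stop flags {false, false, true, true}, the pre-fix predicate skips only the last entry, while the post-fix predicate skips every entry except the first. The entries with a single condition set are exactly the cases the fix targets.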

Usage or Command

No API changes. Bug fix only affects internal kernel behavior.

Accuracy Tests

No unit tests added - cherry-pick of verified fix from #5469.

Checklist

  • Add at least one tag in the PR title.
  • Format your code, run pre-commit before commit.
  • Add unit tests. Please write the reason in this PR if no unit tests.
  • Provide accuracy results.
  • If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.
Original prompt

This section details the original issue you should resolve

<issue_title>Please also submit the #5469 change to develop and to release/online/20251131, as separate submissions</issue_title>
<issue_description></issue_description>

Comments on the Issue (you are @copilot in this section)


@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@paddle-bot

paddle-bot bot commented Dec 9, 2025

Thanks for your contribution!

@paddle-bot paddle-bot bot added the contributor External developers label Dec 9, 2025
Copilot AI and others added 2 commits December 9, 2025 13:43
Co-authored-by: yuanlehome <23653004+yuanlehome@users.noreply.github.com>
Co-authored-by: yuanlehome <23653004+yuanlehome@users.noreply.github.com>
@yuanlehome yuanlehome marked this pull request as ready for review December 9, 2025 13:45
Copilot AI review requested due to automatic review settings December 9, 2025 13:45
Copilot AI changed the title [WIP] Apply changes from PR 5469 to develop and release branches [BugFix] Fix limit_thinking early return logic in CUDA kernels Dec 9, 2025
Copilot AI requested a review from yuanlehome December 9, 2025 13:46
Copilot AI left a comment

Pull request overview

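This PR applies the bug fix from PR #5469 to the develop and release branches. It fixes a condition-logic error in 4 GPU CUDA kernels by changing the AND operator to OR, and updates the corresponding Chinese comments to match the new logic.

  • Fixed the early-exit condition for the thinking-content length limit, from && (AND) to || (OR)
  • Applied the fix consistently across all 4 affected files
  • Updated the Chinese comments to accurately reflect the OR logic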

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

File: Description

  • custom_ops/gpu_ops/limit_thinking_content_length_v1.cu: Fixes the early-exit condition logic (status == 2 case)
  • custom_ops/gpu_ops/limit_thinking_content_length_v2.cu: Fixes the early-exit condition logic (status == 3 case)
  • custom_ops/gpu_ops/speculate_decoding/speculate_limit_thinking_content_length_v1.cu: Fixes the early-exit condition logic of the speculative decoding version (status == 2)
  • custom_ops/gpu_ops/speculate_decoding/speculate_limit_thinking_content_length_v2.cu: Fixes the early-exit condition logic of the speculative decoding version (status == 3)
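
According to the file descriptions above, the v1 kernels check status 2 and the v2 kernels check status 3, with the same OR structure in both. A rough C++ sketch of that shared shape is shown here, assuming the only difference is the compared value. The template early_exit and the wrappers early_exit_v1 and early_exit_v2 are hypothetical names that do not appear in the PR.

```cpp
// Hypothetical helper: the status value that triggers the early exit is a
// template parameter, so one predicate can model both kernel versions.
template <int DoneStatus>
constexpr bool early_exit(int current_limit_think_status, bool stop_flag) {
    // Post-fix logic: exit when EITHER condition holds.
    return current_limit_think_status == DoneStatus || stop_flag;
}

// Assumption taken from the review summary: v1 compares against 2.
constexpr bool early_exit_v1(int current_limit_think_status, bool stop_flag) {
    return early_exit<2>(current_limit_think_status, stop_flag);
}

// Assumption taken from the review summary: v2 compares against 3.
constexpr bool early_exit_v2(int current_limit_think_status, bool stop_flag) {
    return early_exit<3>(current_limit_think_status, stop_flag);
}

// Compile-time checks of the cases that behaved differently before the fix.
static_assert(early_exit_v1(2, false), "v1: status match alone exits");
static_assert(early_exit_v1(0, true), "v1: stop flag alone exits");
static_assert(early_exit_v2(3, false), "v2: status match alone exits");
static_assert(!early_exit_v2(2, false), "v2: status 2 without stop flag continues");
```

Keeping the compared value as a parameter makes it harder for the four files to drift apart in future edits, although whether the real kernels can share code this way depends on details not shown in this PR.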

@yuanlehome yuanlehome merged commit e38709b into develop Dec 10, 2025
26 of 38 checks passed
@yuanlehome yuanlehome deleted the copilot/update-develop-and-release-branches branch December 16, 2025 07:44
Labels

contributor External developers

Development

Successfully merging this pull request may close these issues.

Please also submit the https://github.com/PaddlePaddle/FastDeploy/pull/5469 change to develop and to release/online/20251131, as separate submissions

4 participants