@DrRyanHuang
Collaborator

@DrRyanHuang DrRyanHuang commented Oct 29, 2025

Motivation

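While converting the ERNIE45T 300B model with FP8 from dynamic graph to static graph, the custom operator per_token_quant_padding failed because it has no InferShape / InferDtype functions: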

call paddle.api : static_op_per_token_quant_padding 
terminate called after throwing an instance of 'common::enforce::EnforceNotMet'
  what():  (Unavailable) Your custom operator contains multiple outputs. 

We only allow a custom operator that contains only one input and only one output without setting the InferShapeFn/InferDtypeFn. 
At this time, the input shape/dtype will be directly set to the output shape/dtype.

Please set the InferShapeFn/InferDtypeFn of custom operator by 
	.SetInferShapeFn(PD_INFER_SHAPE(...)) / .SetInferDtypeFn(PD_INFER_DTYPE(...))

  [Hint: Expected OpMetaInfoHelper::GetOutputs(custom_op_meta).size() == 1UL, 
but received OpMetaInfoHelper::GetOutputs(custom_op_meta).size():2 != 1UL:1.] 
(at /workspace/Paddle/paddle/fluid/framework/custom_operator_utils.h:219)

Modifications

Add InferShape / InferDtype functions for per_token_quant_padding.
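
For readers unfamiliar with what such functions compute, here is a minimal standalone sketch of shape and dtype inference for a two-output quantization operator. It is not the code from this PR. The function names, the `DataType` stand-in enum, the `block_size` parameter, and the assumed output layout (an FP8 tensor with the input's shape plus one float32 scale per block) are all illustrative assumptions, and the real operator may pad or lay out its outputs differently.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for paddle::DataType so this sketch builds without Paddle.
enum class DataType { BFLOAT16, FLOAT32, FLOAT8_E4M3FN };

// ASSUMPTION: input is [token_num, hidden_size]; output 0 keeps that shape,
// output 1 holds one scale per block of `block_size` hidden elements.
// The real operator may pad or lay out the scale tensor differently.
std::vector<std::vector<int64_t>> PerTokenQuantPaddingInferShape(
    const std::vector<int64_t>& input_shape, int64_t block_size) {
  assert(input_shape.size() == 2 && block_size > 0);
  const int64_t token_num = input_shape[0];  // may be -1 when the dim is dynamic
  const int64_t hidden_size = input_shape[1];
  // Round up so a partial trailing block still gets a scale entry.
  const int64_t num_blocks = (hidden_size + block_size - 1) / block_size;
  return {{token_num, hidden_size}, {token_num, num_blocks}};
}

// ASSUMPTION: FP8 data plus float32 scales, regardless of the input dtype.
std::vector<DataType> PerTokenQuantPaddingInferDtype(DataType /*input_dtype*/) {
  return {DataType::FLOAT8_E4M3FN, DataType::FLOAT32};
}
```

In a real Paddle custom operator, functions of this kind would use Paddle's own types and be attached to the operator registration with `.SetInferShapeFn(PD_INFER_SHAPE(...))` and `.SetInferDtypeFn(PD_INFER_DTYPE(...))`, as the error message above suggests. Each function returns one entry per output, which is what the framework needs once an operator has more than one output.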

Usage or Command

NO NEED

Accuracy Tests

NO NEED

Checklist

  • Add at least one tag in the PR title.
    • Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code, run pre-commit before commit.
  • Add unit tests. Please write the reason in this PR if no unit tests.
  • Provide accuracy results.
  • If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

@paddle-bot

paddle-bot bot commented Oct 29, 2025

Thanks for your contribution!

@DrRyanHuang
Collaborator Author

DrRyanHuang commented Oct 29, 2025

📢 When reviewing this PR, it is recommended to review the two commits separately:

@SigureMo
Member

About the formatting changes: could everything be formatted at once in a separate PR? @DrRyanHuang @gongshaotian

@gongshaotian
Collaborator

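> About the formatting changes: could everything be formatted at once in a separate PR? @DrRyanHuang @gongshaotian

We tried formatting everything at once, but the GPU operators then hit compilation problems, so that PR was temporarily closed. Let's submit the formatting in batches across later PRs.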

@qingqing01 qingqing01 merged commit e25c067 into PaddlePaddle:develop Oct 30, 2025
26 of 27 checks passed
@DrRyanHuang DrRyanHuang deleted the add_InferShape_Type4per_token_quant_padding branch October 30, 2025 02:30
DrRyanHuang added a commit to cattidea/FastDeploy that referenced this pull request Oct 30, 2025
…addle#4667)

* add InferShape&InferDtype for per_token_quant_padding

* fix codestyle
Jiang-Jia-Jun pushed a commit that referenced this pull request Oct 31, 2025
…4683)

* add InferShape&InferDtype for per_token_quant_padding

* fix codestyle