Skip to content

Conversation

@DrRyanHuang
Copy link
Collaborator

@DrRyanHuang DrRyanHuang commented Nov 10, 2025

Motivation

动态图运行 Qwen2-0.5B 报错:

RuntimeError: (PreconditionNotMet) Tensor's dimension is out of bound.Tensor's dimension must be equal or less than the size of its memory.But received Tensor's dimension is 272269312, memory's size is 0.
  [Hint: Expected numel() * SizeOf(dtype()) <= memory_size(), but received numel() * SizeOf(dtype()):272269312 > memory_size():0.] (at /workspace/Paddle/paddle/phi/core/dense_tensor_impl.cc:48)
  [operator < linear > error]

定位到是 self.lm_head.weight 权重没有初始化

Qwen2-0.5B / 1.5B 等小模型为了减少总参数量,提升小模型核心部分的参数占比,开启 tie_word_embeddings=True 将输入 Embedding 参数与输出 lm_head 的参数共享

#3502 没有考虑 Qwen2-0.5B / 1.5B 这类小模型,没有像 #3255 一样给 lm_head 做初始化,故本PR添加

Modifications

给 Qwen2 组网部分添加 lm_head 的权重初始化

Usage or Command

MODEL=/workspace/MODELS/Qwen2-0.5B-Instruct
rm -rf log/*

export export FD_USE_MACHETE=0
export CUDA_VISIBLE_DEVICES=0
export PORT=39905

python -m fastdeploy.entrypoints.openai.api_server \
  --model $MODEL \
  --metrics-port 39717 \
  --port $PORT \
  --engine-worker-queue-port 39719 \
  --max-model-len 32768 \
  --max-num-seqs 256 \
  --kv-cache-ratio 0.75 \
  --tensor-parallel-size 1 \
  --graph-optimization-config '{"graph_opt_level": 0, "use_cudagraph": false, "full_cuda_graph": false}' \

Accuracy Tests

Checklist

  • Add at least a tag in the PR title.
    • Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code, run pre-commit before commit.
  • Add unit tests. Please write the reason in this PR if no unit tests.
  • Provide accuracy results.
  • If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

@paddle-bot
Copy link

paddle-bot bot commented Nov 10, 2025

Thanks for your contribution!

@DrRyanHuang DrRyanHuang changed the title [BugFix][BugFix] Add tie_word_embeddings for lmhead [BugFix][Models] Add tie_word_embeddings for lmhead Nov 10, 2025
Copy link
Collaborator

@gongshaotian gongshaotian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@EmmonsCurse EmmonsCurse merged commit 07a82af into PaddlePaddle:develop Nov 11, 2025
36 of 43 checks passed
@DrRyanHuang DrRyanHuang deleted the fix_qwen2_tie_word_embeddings branch November 11, 2025 02:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants