Pull requests: jd-opensource/xllm
refactor: remove MTP-specific function requirement from non-MTP models.
#509
opened Dec 9, 2025 by
yingxudeng
bugfix: support multiple models or multiple model version with independent instance.
#505
opened Dec 9, 2025 by
liujinguang0125
bugfix: fix the issue of missing MMData input during engine ->worker transfer via brpc format.
#501
opened Dec 8, 2025 by
magicheng0816
refactor: implement Programmatic Dependent Launch (PDL) support in Device class for cuda device.
#500
opened Dec 8, 2025 by
XuZhang99
bugfix: fix the issue of ineffective input embedding transmission.
#490
opened Dec 5, 2025 by
magicheng0816
refactor: separate the weight loading in the npu layer class.
#489
opened Dec 5, 2025 by
Clement-Wang26
feat: add wrappers for ATB and ACLNN fused operators.
#474
opened Dec 2, 2025 by
yingxudeng
refactor: separate mlu and cuda version Qwen model implementation.
cuda
#468
opened Dec 1, 2025 by
XuZhang99
refactor: optimize unique token count preparation of batch input builder.
#449
opened Nov 27, 2025 by
RobbieLeung
[WIP] feat: support loading model weights and forward overlap.
#441
opened Nov 26, 2025 by
Clement-Wang26
feat: support Qwen2-VL & GME-Qwen2-VL model on npu device.
#399
opened Nov 18, 2025 by
xanecdotex
feat: enable torch_npu graph mode for Qwen-3 dense with TP support.
#325
opened Nov 6, 2025 by
yingxudeng