feat: add qwen2vl qnn support by UbiquitousLearning · Pull Request #291 · UbiquitousLearning/mllm

UbiquitousLearning · 2025-06-09T07:35:25Z

No description provided.

commit efde6d0d014b647b8ceea59441aef1bd3ac424c0 Author: yirongjie <yirj0809@gmail.com> Date: Tue May 27 16:09:16 2025 +0000 fix: merge
commit fe7fb476717e99df2eac23ab7fd1088e03cf8b3c Merge: f52bb32e 20e94c0 Author: yirongjie <yirj0809@gmail.com> Date: Tue May 27 16:09:08 2025 +0000 Merge branch 'main' of https://github.com/yirongjie/mllm
commit f52bb32e5dbf4edcd4998d664ae071a1b5c8dbbb Author: yirongjie <yirj0809@gmail.com> Date: Tue May 27 12:25:08 2025 +0000 fix: merge from qnn-qwen2vl;
commit 6f6c2442f750363c6789e7717861ea3a216cf356 Author: yirongjie <yirj0809@gmail.com> Date: Tue May 27 12:24:17 2025 +0000 Squashed commit of the following:
commit 4862c76 Author: oreomaker <zh002919@outlook.com> Date: Thu May 15 14:59:37 2025 +0800 refact: use hvx qnn silu(faster); usable showui npu version
commit 5df1b07 Author: oreomaker <zh002919@outlook.com> Date: Wed May 14 22:10:52 2025 +0800 feat: qnn dequantize_add hvx op
commit c813f55 Author: oreomaker <zh002919@outlook.com> Date: Tue May 13 09:50:06 2025 +0800 chore: format qnn op package code
commit ea215f0 Author: oreomaker <zh002919@outlook.com> Date: Mon May 12 11:34:38 2025 +0800 feat: free act tensors after qnn vit embedding
commit e4f5011 Author: oreomaker <zh002919@outlook.com> Date: Mon May 12 11:14:30 2025 +0800 chore: remove save data in modeling qwen2vlnpu
commit 2dcb677 Author: oreomaker <zh002919@outlook.com> Date: Mon May 12 10:48:34 2025 +0800 fix: seperate weights for embedding-lmhead when using rotated qwen2vl/showui
commit 4847318 Author: oreomaker <zh002919@outlook.com> Date: Sun May 11 21:16:59 2025 +0800 fix: cpu tensor free bug(todo: handle tensor free)
commit 799b673 Author: xudaliang <xudaliang@pku.edu.cn> Date: Sat May 10 22:51:11 2025 +0800 feat : new qwen2_vl model.
commit dd1817d Author: xudaliang <xudaliang@pku.edu.cn> Date: Sat May 10 22:50:35 2025 +0800 feat : support qwen2-vl rotation model with fp bias.
commit 305dc5c Author: oreomaker <zh002919@outlook.com> Date: Thu May 8 21:37:35 2025 +0800 feat: runnable qwen2vl qnn showui(2*256)
commit 8e14815 Author: oreomaker <zh002919@outlook.com> Date: Thu May 8 21:36:33 2025 +0800 fix: pre processing of qwen2vl
commit e041296 Author: oreomaker <zh002919@outlook.com> Date: Thu May 8 21:34:07 2025 +0800 refact: qwen vl npu modeling using closetFactor view(64->8x8) feat: get_position_id padding in Qwen2VL_ImagePatchAndEmbedding
commit 5b17204 Author: oreomaker <zh002919@outlook.com> Date: Thu May 8 21:29:13 2025 +0800 feat: vit(visual_xx) tensor reuse for qnn (noted as: QNN VLM trick)
commit 7c42658 Author: oreomaker <zh002919@outlook.com> Date: Thu May 8 21:26:49 2025 +0800 feat: finish cpu pipeline mrope
commit 0962c00 Author: oreomaker <zh002919@outlook.com> Date: Tue May 6 11:39:29 2025 +0800 feat: pipeline multimodal rope
commit 5317933 Author: oreomaker <zh002919@outlook.com> Date: Tue May 6 11:38:10 2025 +0800 refactor: use old&fast qnn silu
commit 5bd14de Author: oreomaker <zh002919@outlook.com> Date: Mon Apr 28 21:10:48 2025 +0800 feat: runnable qwen 2 vl npu
commit 1df6eed Author: oreomaker <zh002919@outlook.com> Date: Sun Apr 27 10:13:44 2025 +0800 refactor: tensor.to(QNN)
commit d3d29c4 Author: oreomaker <zh002919@outlook.com> Date: Sat Apr 26 21:22:52 2025 +0800 chore: remove saveData in qwen2vl modeling
commit c40e0c0 Author: oreomaker <zh002919@outlook.com> Date: Sat Apr 26 20:51:16 2025 +0800 feat: add qnn retrieve context info log
commit 175d3a2 Author: oreomaker <zh002919@outlook.com> Date: Sat Apr 26 20:46:14 2025 +0800 fix: qwen 2 vl npu input tensor backend(correct version)
commit 871e920 Author: oreomaker <zh002919@outlook.com> Date: Fri Apr 25 09:50:05 2025 +0800 fix: quantize i16 arm neon macro
commit a2b802c Author: xudaliang <xudaliang@pku.edu.cn> Date: Wed Apr 23 18:33:26 2025 +0800 fix : Qwen2-VL prefill bugs: 1.FP32 KVCache. 2.LMHead does not execute.
commit 8c66604 Author: oreomaker <zh002919@outlook.com> Date: Fri Apr 18 15:35:03 2025 +0800 fix: restore qwen2.5 modeling
commit f138beb Author: oreomaker <zh002919@outlook.com> Date: Fri Apr 18 15:28:35 2025 +0800 fix: restore debug change
commit 09e12ce Merge: d725942 9b271a9 Author: oreomaker <zh002919@outlook.com> Date: Fri Apr 18 13:39:10 2025 +0800 Merge branch 'debug-qwen2.5' of github.com:liang1232018/mllm into debug-qwen2.5
commit d725942 Author: oreomaker <zh002919@outlook.com> Date: Fri Apr 18 13:39:04 2025 +0800 dev: qnn sigmoid version silu feat: qnn backend f16 type input
commit 9b271a9 Author: xudaliang <xudaliang@pku.edu.cn> Date: Fri Apr 18 13:24:52 2025 +0800 fix : linear W8A8 bias uint8 type bug
commit 793a6c6 Author: xudaliang <xudaliang@pku.edu.cn> Date: Fri Apr 18 13:23:49 2025 +0800 fix : Shadow linear triger condition.
commit 4e24bca Author: oreomaker <zh002919@outlook.com> Date: Wed Apr 16 20:53:07 2025 +0800 qwen 2.5 debug
commit 4d74756 Author: oreomaker <zh002919@outlook.com> Date: Wed Apr 16 20:52:33 2025 +0800 fix: shadow linear
commit 5866e2b Author: oreomaker <zh002919@outlook.com> Date: Tue Apr 15 22:17:12 2025 +0800 qwen 2.5 debug
commit 29e9b92 Author: oreomaker <zh002919@outlook.com> Date: Mon Apr 14 09:28:45 2025 +0800 fix: remove shadow linear if(round_value) logic
commit a61e837 Author: oreomaker <zh002919@outlook.com> Date: Sun Apr 13 22:03:45 2025 +0800 feat: int16 qkv for qwen2.5 vl npu
commit 566f21d Author: xudaliang <xudaliang@pku.edu.cn> Date: Sun Apr 13 18:45:06 2025 +0800 fix : modeling input quantize to I8, but dequantize with I16 bug.
commit 60639d0 Author: xudaliang <xudaliang@pku.edu.cn> Date: Sun Apr 13 18:44:18 2025 +0800 fix : LLaMADequantize INT16 to FP32 shuffle order bugs.
commit a5cc652 Author: xudaliang <xudaliang@pku.edu.cn> Date: Sun Apr 13 17:31:10 2025 +0800 fix : LLaMAQuantize FP32 to INT16 round scale error.
commit f139822 Author: oreomaker <zh002919@outlook.com> Date: Sat Apr 12 22:24:30 2025 +0800 fix: qnn int 16 linear bias(use int8 bias scale)
commit 8831811 Author: oreomaker <zh002919@outlook.com> Date: Sat Apr 12 15:03:40 2025 +0800 debug: qnn int16 linear
commit 088fe09 Author: xudaliang <xudaliang@pku.edu.cn> Date: Fri Apr 11 23:22:41 2025 +0800 feat : support INT16 dequantize and quantize.
commit 73ebe87 Merge: b73c1c3 6007443 Author: liang1232018 <40791416+liang1232018@users.noreply.github.com> Date: Wed Apr 9 14:50:25 2025 +0800 Merge pull request UbiquitousLearning#12 from liang1232018/develop-zh Develop zh
commit 6007443 Merge: 1c8647e b73c1c3 Author: liang1232018 <40791416+liang1232018@users.noreply.github.com> Date: Wed Apr 9 14:50:07 2025 +0800 Merge branch 'develop-xdl' into develop-zh
commit 1c8647e Author: oreomaker <zh002919@outlook.com> Date: Tue Apr 8 21:39:56 2025 +0800 fix: qnn quant scale pow(2,bit) -> pow(2,bit-1)
commit cc760ae Author: oreomaker <zh002919@outlook.com> Date: Tue Apr 8 17:03:17 2025 +0800 fix: op create param type->dtype
commit 6afa80c Author: oreomaker <zh002919@outlook.com> Date: Mon Apr 7 15:25:21 2025 +0800 feat: Tensor::saveData only do when STATIC_READY
commit 2ebded3 Author: oreomaker <zh002919@outlook.com> Date: Mon Apr 7 15:24:11 2025 +0800 feat: add qnn int16 layer param & op todo: qnn llama package implement
commit 4faeca8 Author: oreomaker <zh002919@outlook.com> Date: Mon Mar 24 15:52:54 2025 +0800 dev: runnable qwen2vl npu (buggy)
commit ebf110e Author: oreomaker <zh002919@outlook.com> Date: Mon Mar 24 15:46:23 2025 +0800 feat: add qwen vl export tool (todo: simulate infer and profile tools)
commit bde9a92 Author: oreomaker <zh002919@outlook.com> Date: Mon Mar 24 15:44:25 2025 +0800 dev: a just working version of qwen 2.5 npu
commit 126c283 Merge: 25de8c3 9d33aaf Author: oreomaker <zh002919@outlook.com> Date: Mon Mar 24 15:43:30 2025 +0800 Merge branch 'fix-qnn-python' into develop-zh
commit 9d33aaf Author: oreomaker <zh002919@outlook.com> Date: Fri Mar 21 16:01:23 2025 +0800 fix: qnn profile quant bugs
commit 25de8c3 Author: oreomaker <zh002919@outlook.com> Date: Thu Mar 20 16:00:19 2025 +0800 refactor: add graph split layer for QNN, change the modeling note: xnnpack is affected, should not merge
commit 690a24e Author: oreomaker <zh002919@outlook.com> Date: Mon Mar 17 17:45:34 2025 +0800 feat: QNN load cache execute
commit 4f28330 Author: oreomaker <zh002919@outlook.com> Date: Sun Mar 9 22:33:21 2025 +0800 dev: QNN graph merging execute
commit b73c1c3 Author: xudaliang <xudaliang@pku.edu.cn> Date: Tue Nov 12 23:28:12 2024 +0800 feat : support decoding model configuration.
commit ec3d4e5 Author: xudaliang <xudaliang@pku.edu.cn> Date: Tue Nov 12 20:31:45 2024 +0800 feat : support Qwen2.5 npu.
commit 7246d53 Author: yirongjie <yirj0809@gmail.com> Date: Tue May 27 07:12:53 2025 +0000 feat: set run in Backends
commit 1150241 Author: yirongjie <yirj0809@gmail.com> Date: Sat May 24 07:57:09 2025 +0000 fix: getFunc
commit 24db241 Author: yirongjie <yirj0809@gmail.com> Date: Fri May 23 05:16:41 2025 +0000 fix: tensor function <Tensor *> to shared_ptr<Tensor>
commit 0ecce75 Author: yirongjie <yirj0809@gmail.com> Date: Thu May 22 14:05:11 2025 +0000 feat：eager cpu
commit 9835db5 Author: yirongjie <yirj0809@gmail.com> Date: Fri Apr 18 14:57:21 2025 +0000 fix: vtp
commit 30c3046 Author: yirongjie <yirj0809@gmail.com> Date: Wed Apr 16 06:49:46 2025 +0000 fix: vtp
commit b416268 Author: yirongjie <yirj0809@gmail.com> Date: Tue Apr 15 08:40:22 2025 +0000 fix: vtp
commit 6430ca8 Author: yirongjie <yirj0809@gmail.com> Date: Mon Apr 14 12:53:58 2025 +0000 feat: vtp
commit f86bff6 Author: yirongjie <yirj0809@gmail.com> Date: Sun Mar 23 09:41:14 2025 +0000 ref: add ShowUI
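
For readers unfamiliar with the quantization fixes in the log, commit 1c8647e ("fix: qnn quant scale pow(2,bit) -> pow(2,bit-1)") is consistent with the usual rule for symmetric signed quantization: a signed `bits`-wide integer spans roughly ±2^(bits-1), so the scale divides by 2^(bits-1) rather than 2^bits. The sketch below illustrates that rule only. The function names `symmetric_scale`, `quantize_i16`, and `dequantize_i16` are hypothetical and are not taken from the mllm source, and how the real QNN code derives or applies its scale is an assumption here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper (not from mllm): scale for symmetric signed quantization.
// A signed `bits`-wide integer covers about +/- 2^(bits-1), so dividing the
// absolute maximum by 2^bits would leave half of the integer range unused.
inline float symmetric_scale(float abs_max, int bits) {
    assert(bits >= 2 && bits <= 16);
    return abs_max / std::pow(2.0f, static_cast<float>(bits - 1));
}

// Quantize a float to int16: round to nearest, then clamp to the int16 range.
inline int16_t quantize_i16(float x, float scale) {
    float q = std::round(x / scale);
    q = std::min(32767.0f, std::max(-32768.0f, q));
    return static_cast<int16_t>(q);
}

// Map an int16 code back to its float value.
inline float dequantize_i16(int16_t q, float scale) {
    return static_cast<float>(q) * scale;
}
```

With `abs_max = 1.0f` and `bits = 16`, the value `0.5f` maps to the code 16384 and back to `0.5f`, while `1.0f` itself is clamped to 32767 because +32768 is not representable.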

* Squashed commit of the following: (commit list identical to the log above)
* feat: add FlashAttention2 && fix: MULTIMODELROPE
* remove broken submodule

Co-authored-by: yirongjie <yirj0809@gmail.com>
Co-authored-by: yi <yi@U-21T7VPF4-1903.local>
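
Commit e041296 in the log mentions a "closetFactor view(64->8x8)", which suggests splitting one dimension into the two factors nearest its square root before reshaping. As a rough sketch of that idea, and not the code from the PR, the hypothetical helper `closest_factors` below returns the factor pair of `n` whose members are as close together as possible; the name and signature are assumptions made for illustration.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper (not from mllm): returns (a, b) with a * b == n and
// a <= b, choosing the largest divisor a that does not exceed sqrt(n).
inline std::pair<int, int> closest_factors(int n) {
    assert(n > 0);
    int a = 1;
    // Every divisor i with i * i <= n pairs with n / i >= i; keep the largest.
    for (int i = 1; static_cast<long long>(i) * i <= n; ++i) {
        if (n % i == 0) a = i;
    }
    return {a, n / a};
}
```

For `n = 64` this yields `(8, 8)`, matching the 64 -> 8x8 example in the commit message; a prime such as 7 degenerates to `(1, 7)`.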


yirongjie added 2 commits May 27, 2025 16:14

feat: add FlashAttention2 && fix: MULTIMODELROPE

695ddac

yirongjie self-requested a review June 9, 2025 07:41

yirongjie approved these changes Jun 9, 2025

remove broken submodule

d9cbd46

yirongjie merged commit d11a8e6 into UbiquitousLearning:vlm Jun 9, 2025
1 check passed

yirongjie merged 3 commits into UbiquitousLearning:vlm from yirongjie:main

2 participants