feat: add qwen2vl qnn support #291
Merged
yirongjie merged 3 commits into UbiquitousLearning:vlm on Jun 9, 2025
commit efde6d0d014b647b8ceea59441aef1bd3ac424c0
Author: yirongjie <yirj0809@gmail.com>
Date: Tue May 27 16:09:16 2025 +0000
fix: merge
commit fe7fb476717e99df2eac23ab7fd1088e03cf8b3c
Merge: f52bb32e 20e94c0
Author: yirongjie <yirj0809@gmail.com>
Date: Tue May 27 16:09:08 2025 +0000
Merge branch 'main' of https://github.com/yirongjie/mllm
commit f52bb32e5dbf4edcd4998d664ae071a1b5c8dbbb
Author: yirongjie <yirj0809@gmail.com>
Date: Tue May 27 12:25:08 2025 +0000
fix: merge from qnn-qwen2vl;
commit 6f6c2442f750363c6789e7717861ea3a216cf356
Author: yirongjie <yirj0809@gmail.com>
Date: Tue May 27 12:24:17 2025 +0000
Squashed commit of the following:
commit 4862c76
Author: oreomaker <zh002919@outlook.com>
Date: Thu May 15 14:59:37 2025 +0800
refact: use hvx qnn silu(faster); usable showui npu version
commit 5df1b07
Author: oreomaker <zh002919@outlook.com>
Date: Wed May 14 22:10:52 2025 +0800
feat: qnn dequantize_add hvx op
commit c813f55
Author: oreomaker <zh002919@outlook.com>
Date: Tue May 13 09:50:06 2025 +0800
chore: format qnn op package code
commit ea215f0
Author: oreomaker <zh002919@outlook.com>
Date: Mon May 12 11:34:38 2025 +0800
feat: free act tensors after qnn vit embedding
commit e4f5011
Author: oreomaker <zh002919@outlook.com>
Date: Mon May 12 11:14:30 2025 +0800
chore: remove save data in modeling qwen2vlnpu
commit 2dcb677
Author: oreomaker <zh002919@outlook.com>
Date: Mon May 12 10:48:34 2025 +0800
fix: separate weights for embedding-lmhead when using rotated qwen2vl/showui
commit 4847318
Author: oreomaker <zh002919@outlook.com>
Date: Sun May 11 21:16:59 2025 +0800
fix: cpu tensor free bug(todo: handle tensor free)
commit 799b673
Author: xudaliang <xudaliang@pku.edu.cn>
Date: Sat May 10 22:51:11 2025 +0800
feat : new qwen2_vl model.
commit dd1817d
Author: xudaliang <xudaliang@pku.edu.cn>
Date: Sat May 10 22:50:35 2025 +0800
feat : support qwen2-vl rotation model with fp bias.
commit 305dc5c
Author: oreomaker <zh002919@outlook.com>
Date: Thu May 8 21:37:35 2025 +0800
feat: runnable qwen2vl qnn showui(2*256)
commit 8e14815
Author: oreomaker <zh002919@outlook.com>
Date: Thu May 8 21:36:33 2025 +0800
fix: pre processing of qwen2vl
commit e041296
Author: oreomaker <zh002919@outlook.com>
Date: Thu May 8 21:34:07 2025 +0800
refact: qwen vl npu modeling using closetFactor view(64->8x8)
feat: get_position_id padding in Qwen2VL_ImagePatchAndEmbedding
commit 5b17204
Author: oreomaker <zh002919@outlook.com>
Date: Thu May 8 21:29:13 2025 +0800
feat: vit(visual_xx) tensor reuse for qnn (noted as: QNN VLM trick)
commit 7c42658
Author: oreomaker <zh002919@outlook.com>
Date: Thu May 8 21:26:49 2025 +0800
feat: finish cpu pipeline mrope
commit 0962c00
Author: oreomaker <zh002919@outlook.com>
Date: Tue May 6 11:39:29 2025 +0800
feat: pipeline multimodal rope
commit 5317933
Author: oreomaker <zh002919@outlook.com>
Date: Tue May 6 11:38:10 2025 +0800
refactor: use old&fast qnn silu
commit 5bd14de
Author: oreomaker <zh002919@outlook.com>
Date: Mon Apr 28 21:10:48 2025 +0800
feat: runnable qwen 2 vl npu
commit 1df6eed
Author: oreomaker <zh002919@outlook.com>
Date: Sun Apr 27 10:13:44 2025 +0800
refactor: tensor.to(QNN)
commit d3d29c4
Author: oreomaker <zh002919@outlook.com>
Date: Sat Apr 26 21:22:52 2025 +0800
chore: remove saveData in qwen2vl modeling
commit c40e0c0
Author: oreomaker <zh002919@outlook.com>
Date: Sat Apr 26 20:51:16 2025 +0800
feat: add qnn retrieve context info log
commit 175d3a2
Author: oreomaker <zh002919@outlook.com>
Date: Sat Apr 26 20:46:14 2025 +0800
fix: qwen 2 vl npu input tensor backend(correct version)
commit 871e920
Author: oreomaker <zh002919@outlook.com>
Date: Fri Apr 25 09:50:05 2025 +0800
fix: quantize i16 arm neon macro
commit a2b802c
Author: xudaliang <xudaliang@pku.edu.cn>
Date: Wed Apr 23 18:33:26 2025 +0800
fix : Qwen2-VL prefill bugs: 1.FP32 KVCache. 2.LMHead does not execute.
commit 8c66604
Author: oreomaker <zh002919@outlook.com>
Date: Fri Apr 18 15:35:03 2025 +0800
fix: restore qwen2.5 modeling
commit f138beb
Author: oreomaker <zh002919@outlook.com>
Date: Fri Apr 18 15:28:35 2025 +0800
fix: restore debug change
commit 09e12ce
Merge: d725942 9b271a9
Author: oreomaker <zh002919@outlook.com>
Date: Fri Apr 18 13:39:10 2025 +0800
Merge branch 'debug-qwen2.5' of github.com:liang1232018/mllm into debug-qwen2.5
commit d725942
Author: oreomaker <zh002919@outlook.com>
Date: Fri Apr 18 13:39:04 2025 +0800
dev: qnn sigmoid version silu
feat: qnn backend f16 type input
commit 9b271a9
Author: xudaliang <xudaliang@pku.edu.cn>
Date: Fri Apr 18 13:24:52 2025 +0800
fix : linear W8A8 bias uint8 type bug
commit 793a6c6
Author: xudaliang <xudaliang@pku.edu.cn>
Date: Fri Apr 18 13:23:49 2025 +0800
fix : Shadow linear trigger condition.
commit 4e24bca
Author: oreomaker <zh002919@outlook.com>
Date: Wed Apr 16 20:53:07 2025 +0800
qwen 2.5 debug
commit 4d74756
Author: oreomaker <zh002919@outlook.com>
Date: Wed Apr 16 20:52:33 2025 +0800
fix: shadow linear
commit 5866e2b
Author: oreomaker <zh002919@outlook.com>
Date: Tue Apr 15 22:17:12 2025 +0800
qwen 2.5 debug
commit 29e9b92
Author: oreomaker <zh002919@outlook.com>
Date: Mon Apr 14 09:28:45 2025 +0800
fix: remove shadow linear if(round_value) logic
commit a61e837
Author: oreomaker <zh002919@outlook.com>
Date: Sun Apr 13 22:03:45 2025 +0800
feat: int16 qkv for qwen2.5 vl npu
commit 566f21d
Author: xudaliang <xudaliang@pku.edu.cn>
Date: Sun Apr 13 18:45:06 2025 +0800
fix : modeling input quantize to I8, but dequantize with I16 bug.
commit 60639d0
Author: xudaliang <xudaliang@pku.edu.cn>
Date: Sun Apr 13 18:44:18 2025 +0800
fix : LLaMADequantize INT16 to FP32 shuffle order bugs.
commit a5cc652
Author: xudaliang <xudaliang@pku.edu.cn>
Date: Sun Apr 13 17:31:10 2025 +0800
fix : LLaMAQuantize FP32 to INT16 round scale error.
commit f139822
Author: oreomaker <zh002919@outlook.com>
Date: Sat Apr 12 22:24:30 2025 +0800
fix: qnn int 16 linear bias(use int8 bias scale)
commit 8831811
Author: oreomaker <zh002919@outlook.com>
Date: Sat Apr 12 15:03:40 2025 +0800
debug: qnn int16 linear
commit 088fe09
Author: xudaliang <xudaliang@pku.edu.cn>
Date: Fri Apr 11 23:22:41 2025 +0800
feat : support INT16 dequantize and quantize.
commit 73ebe87
Merge: b73c1c3 6007443
Author: liang1232018 <40791416+liang1232018@users.noreply.github.com>
Date: Wed Apr 9 14:50:25 2025 +0800
Merge pull request UbiquitousLearning#12 from liang1232018/develop-zh
Develop zh
commit 6007443
Merge: 1c8647e b73c1c3
Author: liang1232018 <40791416+liang1232018@users.noreply.github.com>
Date: Wed Apr 9 14:50:07 2025 +0800
Merge branch 'develop-xdl' into develop-zh
commit 1c8647e
Author: oreomaker <zh002919@outlook.com>
Date: Tue Apr 8 21:39:56 2025 +0800
fix: qnn quant scale pow(2,bit) -> pow(2,bit-1)
commit cc760ae
Author: oreomaker <zh002919@outlook.com>
Date: Tue Apr 8 17:03:17 2025 +0800
fix: op create param type->dtype
commit 6afa80c
Author: oreomaker <zh002919@outlook.com>
Date: Mon Apr 7 15:25:21 2025 +0800
feat: Tensor::saveData only do when STATIC_READY
commit 2ebded3
Author: oreomaker <zh002919@outlook.com>
Date: Mon Apr 7 15:24:11 2025 +0800
feat: add qnn int16 layer param & op
todo: qnn llama package implement
commit 4faeca8
Author: oreomaker <zh002919@outlook.com>
Date: Mon Mar 24 15:52:54 2025 +0800
dev: runnable qwen2vl npu (buggy)
commit ebf110e
Author: oreomaker <zh002919@outlook.com>
Date: Mon Mar 24 15:46:23 2025 +0800
feat: add qwen vl export tool (todo: simulate infer and profile tools)
commit bde9a92
Author: oreomaker <zh002919@outlook.com>
Date: Mon Mar 24 15:44:25 2025 +0800
dev: a just working version of qwen 2.5 npu
commit 126c283
Merge: 25de8c3 9d33aaf
Author: oreomaker <zh002919@outlook.com>
Date: Mon Mar 24 15:43:30 2025 +0800
Merge branch 'fix-qnn-python' into develop-zh
commit 9d33aaf
Author: oreomaker <zh002919@outlook.com>
Date: Fri Mar 21 16:01:23 2025 +0800
fix: qnn profile quant bugs
commit 25de8c3
Author: oreomaker <zh002919@outlook.com>
Date: Thu Mar 20 16:00:19 2025 +0800
refactor: add graph split layer for QNN, change the modeling
note: xnnpack is affected, should not merge
commit 690a24e
Author: oreomaker <zh002919@outlook.com>
Date: Mon Mar 17 17:45:34 2025 +0800
feat: QNN load cache execute
commit 4f28330
Author: oreomaker <zh002919@outlook.com>
Date: Sun Mar 9 22:33:21 2025 +0800
dev: QNN graph merging execute
commit b73c1c3
Author: xudaliang <xudaliang@pku.edu.cn>
Date: Tue Nov 12 23:28:12 2024 +0800
feat : support decoding model configuration.
commit ec3d4e5
Author: xudaliang <xudaliang@pku.edu.cn>
Date: Tue Nov 12 20:31:45 2024 +0800
feat : support Qwen2.5 npu.
commit 7246d53
Author: yirongjie <yirj0809@gmail.com>
Date: Tue May 27 07:12:53 2025 +0000
feat: set run in Backends
commit 1150241
Author: yirongjie <yirj0809@gmail.com>
Date: Sat May 24 07:57:09 2025 +0000
fix: getFunc
commit 24db241
Author: yirongjie <yirj0809@gmail.com>
Date: Fri May 23 05:16:41 2025 +0000
fix: tensor function <Tensor *> to shared_ptr<Tensor>
commit 0ecce75
Author: yirongjie <yirj0809@gmail.com>
Date: Thu May 22 14:05:11 2025 +0000
feat: eager cpu
commit 9835db5
Author: yirongjie <yirj0809@gmail.com>
Date: Fri Apr 18 14:57:21 2025 +0000
fix: vtp
commit 30c3046
Author: yirongjie <yirj0809@gmail.com>
Date: Wed Apr 16 06:49:46 2025 +0000
fix: vtp
commit b416268
Author: yirongjie <yirj0809@gmail.com>
Date: Tue Apr 15 08:40:22 2025 +0000
fix: vtp
commit 6430ca8
Author: yirongjie <yirj0809@gmail.com>
Date: Mon Apr 14 12:53:58 2025 +0000
feat: vtp
commit f86bff6
Author: yirongjie <yirj0809@gmail.com>
Date: Sun Mar 23 09:41:14 2025 +0000
ref: add ShowUI
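Commit 1c8647e in the log above changes the QNN quantization scale divisor from pow(2,bit) to pow(2,bit-1). The following is a minimal sketch of why that matters for signed symmetric quantization, not the code from the PR. The function names (`symmetric_scale`, `quantize`, `dequantize`) and the choice of `2**(bits-1)` as the exact divisor are assumptions for illustration; the actual mllm implementation may round or clamp differently.

```python
def symmetric_scale(max_abs: float, bits: int) -> float:
    """Scale that maps [-max_abs, max_abs] onto a signed `bits`-bit grid.

    Hypothetical helper: a signed b-bit integer has 2**(b-1) levels on
    each side of zero, so the divisor is 2**(bits-1), not 2**bits.
    """
    return max_abs / (2 ** (bits - 1))


def quantize(x: float, scale: float, bits: int) -> int:
    """Round x / scale to the nearest integer and clamp to the signed range."""
    qmin = -(2 ** (bits - 1))
    qmax = 2 ** (bits - 1) - 1
    q = round(x / scale)
    return max(qmin, min(qmax, q))


def dequantize(q: int, scale: float) -> float:
    """Map a quantized integer back to a floating-point value."""
    return q * scale


if __name__ == "__main__":
    scale = symmetric_scale(8.0, 8)
    print(scale)                                      # 0.0625
    print(quantize(1.0, scale, 8))                    # 16
    print(dequantize(quantize(1.0, scale, 8), scale)) # 1.0
```

With a divisor of `2**bits` the scale would be half as large, so in this sketch any input above roughly `max_abs / 2` would saturate at the top of the integer range.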
yirongjie approved these changes on Jun 9, 2025
yirongjie added a commit that referenced this pull request on Jun 10, 2025
* Squashed commit of the following: efde6d0d014b647b8ceea59441aef1bd3ac424c0 and its ancestors (the same commit log listed above)
* feat: add FlashAttention2 && fix: MULTIMODELROPE
* remove broken submodule
---------
Co-authored-by: yirongjie <yirj0809@gmail.com>
Co-authored-by: yi <yi@U-21T7VPF4-1903.local>
yirongjie added a commit to yirongjie/mllm that referenced this pull request on Jun 14, 2025
* Squashed commit of the following: efde6d0d014b647b8ceea59441aef1bd3ac424c0 and its ancestors (the same commit log listed above)
* feat: add FlashAttention2 && fix: MULTIMODELROPE
* remove broken submodule
---------
Co-authored-by: yirongjie <yirj0809@gmail.com>
Co-authored-by: yi <yi@U-21T7VPF4-1903.local>
No description provided.