feat: add qwen2vl qnn support #291

Merged
yirongjie merged 3 commits into UbiquitousLearning:vlm from yirongjie:main
Jun 9, 2025

Conversation

@UbiquitousLearning
Owner

No description provided.

yirongjie added 2 commits May 27, 2025 16:14
commit efde6d0d014b647b8ceea59441aef1bd3ac424c0
Author: yirongjie <yirj0809@gmail.com>
Date:   Tue May 27 16:09:16 2025 +0000

    fix: merge

commit fe7fb476717e99df2eac23ab7fd1088e03cf8b3c
Merge: f52bb32e 20e94c0
Author: yirongjie <yirj0809@gmail.com>
Date:   Tue May 27 16:09:08 2025 +0000

    Merge branch 'main' of https://github.com/yirongjie/mllm

commit f52bb32e5dbf4edcd4998d664ae071a1b5c8dbbb
Author: yirongjie <yirj0809@gmail.com>
Date:   Tue May 27 12:25:08 2025 +0000

    fix: merge from qnn-qwen2vl;

commit 6f6c2442f750363c6789e7717861ea3a216cf356
Author: yirongjie <yirj0809@gmail.com>
Date:   Tue May 27 12:24:17 2025 +0000

    Squashed commit of the following:

    commit 4862c76
    Author: oreomaker <zh002919@outlook.com>
    Date:   Thu May 15 14:59:37 2025 +0800

        refact: use hvx qnn silu(faster); usable showui npu version

    commit 5df1b07
    Author: oreomaker <zh002919@outlook.com>
    Date:   Wed May 14 22:10:52 2025 +0800

        feat: qnn dequantize_add hvx op

    commit c813f55
    Author: oreomaker <zh002919@outlook.com>
    Date:   Tue May 13 09:50:06 2025 +0800

        chore: format qnn op package code

    commit ea215f0
    Author: oreomaker <zh002919@outlook.com>
    Date:   Mon May 12 11:34:38 2025 +0800

        feat: free act tensors after qnn vit embedding

    commit e4f5011
    Author: oreomaker <zh002919@outlook.com>
    Date:   Mon May 12 11:14:30 2025 +0800

        chore: remove save data in modeling qwen2vlnpu

    commit 2dcb677
    Author: oreomaker <zh002919@outlook.com>
    Date:   Mon May 12 10:48:34 2025 +0800

        fix: separate weights for embedding-lmhead when using rotated qwen2vl/showui

    commit 4847318
    Author: oreomaker <zh002919@outlook.com>
    Date:   Sun May 11 21:16:59 2025 +0800

        fix: cpu tensor free bug(todo: handle tensor free)

    commit 799b673
    Author: xudaliang <xudaliang@pku.edu.cn>
    Date:   Sat May 10 22:51:11 2025 +0800

        feat : new qwen2_vl model.

    commit dd1817d
    Author: xudaliang <xudaliang@pku.edu.cn>
    Date:   Sat May 10 22:50:35 2025 +0800

        feat : support qwen2-vl rotation model with fp bias.

    commit 305dc5c
    Author: oreomaker <zh002919@outlook.com>
    Date:   Thu May 8 21:37:35 2025 +0800

        feat: runnable qwen2vl qnn showui(2*256)

    commit 8e14815
    Author: oreomaker <zh002919@outlook.com>
    Date:   Thu May 8 21:36:33 2025 +0800

        fix: pre processing of qwen2vl

    commit e041296
    Author: oreomaker <zh002919@outlook.com>
    Date:   Thu May 8 21:34:07 2025 +0800

        refact: qwen vl npu modeling using closetFactor view(64->8x8)
        feat: get_position_id padding in Qwen2VL_ImagePatchAndEmbedding

    commit 5b17204
    Author: oreomaker <zh002919@outlook.com>
    Date:   Thu May 8 21:29:13 2025 +0800

        feat: vit(visual_xx) tensor reuse for qnn (noted as: QNN VLM trick)

    commit 7c42658
    Author: oreomaker <zh002919@outlook.com>
    Date:   Thu May 8 21:26:49 2025 +0800

        feat: finish cpu pipeline mrope

    commit 0962c00
    Author: oreomaker <zh002919@outlook.com>
    Date:   Tue May 6 11:39:29 2025 +0800

        feat: pipeline multimodal rope

    commit 5317933
    Author: oreomaker <zh002919@outlook.com>
    Date:   Tue May 6 11:38:10 2025 +0800

        refactor: use old&fast qnn silu

    commit 5bd14de
    Author: oreomaker <zh002919@outlook.com>
    Date:   Mon Apr 28 21:10:48 2025 +0800

        feat: runnable qwen 2 vl npu

    commit 1df6eed
    Author: oreomaker <zh002919@outlook.com>
    Date:   Sun Apr 27 10:13:44 2025 +0800

        refactor: tensor.to(QNN)

    commit d3d29c4
    Author: oreomaker <zh002919@outlook.com>
    Date:   Sat Apr 26 21:22:52 2025 +0800

        chore: remove saveData in qwen2vl modeling

    commit c40e0c0
    Author: oreomaker <zh002919@outlook.com>
    Date:   Sat Apr 26 20:51:16 2025 +0800

        feat: add qnn retrieve context info log

    commit 175d3a2
    Author: oreomaker <zh002919@outlook.com>
    Date:   Sat Apr 26 20:46:14 2025 +0800

        fix: qwen 2 vl npu input tensor backend(correct version)

    commit 871e920
    Author: oreomaker <zh002919@outlook.com>
    Date:   Fri Apr 25 09:50:05 2025 +0800

        fix: quantize i16 arm neon macro

    commit a2b802c
    Author: xudaliang <xudaliang@pku.edu.cn>
    Date:   Wed Apr 23 18:33:26 2025 +0800

        fix : Qwen2-VL prefill bugs: 1.FP32 KVCache. 2.LMHead does not execute.

    commit 8c66604
    Author: oreomaker <zh002919@outlook.com>
    Date:   Fri Apr 18 15:35:03 2025 +0800

        fix: restore qwen2.5 modeling

    commit f138beb
    Author: oreomaker <zh002919@outlook.com>
    Date:   Fri Apr 18 15:28:35 2025 +0800

        fix: restore debug change

    commit 09e12ce
    Merge: d725942 9b271a9
    Author: oreomaker <zh002919@outlook.com>
    Date:   Fri Apr 18 13:39:10 2025 +0800

        Merge branch 'debug-qwen2.5' of github.com:liang1232018/mllm into debug-qwen2.5

    commit d725942
    Author: oreomaker <zh002919@outlook.com>
    Date:   Fri Apr 18 13:39:04 2025 +0800

        dev: qnn sigmoid version silu
        feat: qnn backend f16 type input

    commit 9b271a9
    Author: xudaliang <xudaliang@pku.edu.cn>
    Date:   Fri Apr 18 13:24:52 2025 +0800

        fix : linear W8A8 bias uint8 type bug

    commit 793a6c6
    Author: xudaliang <xudaliang@pku.edu.cn>
    Date:   Fri Apr 18 13:23:49 2025 +0800

        fix : Shadow linear trigger condition.

    commit 4e24bca
    Author: oreomaker <zh002919@outlook.com>
    Date:   Wed Apr 16 20:53:07 2025 +0800

        qwen 2.5 debug

    commit 4d74756
    Author: oreomaker <zh002919@outlook.com>
    Date:   Wed Apr 16 20:52:33 2025 +0800

        fix: shadow linear

    commit 5866e2b
    Author: oreomaker <zh002919@outlook.com>
    Date:   Tue Apr 15 22:17:12 2025 +0800

        qwen 2.5 debug

    commit 29e9b92
    Author: oreomaker <zh002919@outlook.com>
    Date:   Mon Apr 14 09:28:45 2025 +0800

        fix: remove shadow linear if(round_value) logic

    commit a61e837
    Author: oreomaker <zh002919@outlook.com>
    Date:   Sun Apr 13 22:03:45 2025 +0800

        feat: int16 qkv for qwen2.5 vl npu

    commit 566f21d
    Author: xudaliang <xudaliang@pku.edu.cn>
    Date:   Sun Apr 13 18:45:06 2025 +0800

        fix : modeling input quantize to I8, but dequantize with I16 bug.

    commit 60639d0
    Author: xudaliang <xudaliang@pku.edu.cn>
    Date:   Sun Apr 13 18:44:18 2025 +0800

        fix : LLaMADequantize INT16 to FP32 shuffle order bugs.

    commit a5cc652
    Author: xudaliang <xudaliang@pku.edu.cn>
    Date:   Sun Apr 13 17:31:10 2025 +0800

        fix : LLaMAQuantize FP32 to INT16 round scale error.

    commit f139822
    Author: oreomaker <zh002919@outlook.com>
    Date:   Sat Apr 12 22:24:30 2025 +0800

        fix: qnn int 16 linear bias(use int8 bias scale)

    commit 8831811
    Author: oreomaker <zh002919@outlook.com>
    Date:   Sat Apr 12 15:03:40 2025 +0800

        debug: qnn int16 linear

    commit 088fe09
    Author: xudaliang <xudaliang@pku.edu.cn>
    Date:   Fri Apr 11 23:22:41 2025 +0800

        feat : support INT16 dequantize and quantize.

    commit 73ebe87
    Merge: b73c1c3 6007443
    Author: liang1232018 <40791416+liang1232018@users.noreply.github.com>
    Date:   Wed Apr 9 14:50:25 2025 +0800

        Merge pull request UbiquitousLearning#12 from liang1232018/develop-zh

        Develop zh

    commit 6007443
    Merge: 1c8647e b73c1c3
    Author: liang1232018 <40791416+liang1232018@users.noreply.github.com>
    Date:   Wed Apr 9 14:50:07 2025 +0800

        Merge branch 'develop-xdl' into develop-zh

    commit 1c8647e
    Author: oreomaker <zh002919@outlook.com>
    Date:   Tue Apr 8 21:39:56 2025 +0800

        fix: qnn quant scale pow(2,bit) -> pow(2,bit-1)

    commit cc760ae
    Author: oreomaker <zh002919@outlook.com>
    Date:   Tue Apr 8 17:03:17 2025 +0800

        fix: op create param type->dtype

    commit 6afa80c
    Author: oreomaker <zh002919@outlook.com>
    Date:   Mon Apr 7 15:25:21 2025 +0800

        feat: Tensor::saveData only do when STATIC_READY

    commit 2ebded3
    Author: oreomaker <zh002919@outlook.com>
    Date:   Mon Apr 7 15:24:11 2025 +0800

        feat: add qnn int16 layer param & op
        todo: qnn llama package implement

    commit 4faeca8
    Author: oreomaker <zh002919@outlook.com>
    Date:   Mon Mar 24 15:52:54 2025 +0800

        dev: runnable qwen2vl npu (buggy)

    commit ebf110e
    Author: oreomaker <zh002919@outlook.com>
    Date:   Mon Mar 24 15:46:23 2025 +0800

        feat: add qwen vl export tool (todo: simulate infer and profile tools)

    commit bde9a92
    Author: oreomaker <zh002919@outlook.com>
    Date:   Mon Mar 24 15:44:25 2025 +0800

        dev: a just working version of qwen 2.5 npu

    commit 126c283
    Merge: 25de8c3 9d33aaf
    Author: oreomaker <zh002919@outlook.com>
    Date:   Mon Mar 24 15:43:30 2025 +0800

        Merge branch 'fix-qnn-python' into develop-zh

    commit 9d33aaf
    Author: oreomaker <zh002919@outlook.com>
    Date:   Fri Mar 21 16:01:23 2025 +0800

        fix: qnn profile quant bugs

    commit 25de8c3
    Author: oreomaker <zh002919@outlook.com>
    Date:   Thu Mar 20 16:00:19 2025 +0800

        refactor: add graph split layer for QNN, change the modeling
        note: xnnpack is affected, should not merge

    commit 690a24e
    Author: oreomaker <zh002919@outlook.com>
    Date:   Mon Mar 17 17:45:34 2025 +0800

        feat: QNN load cache execute

    commit 4f28330
    Author: oreomaker <zh002919@outlook.com>
    Date:   Sun Mar 9 22:33:21 2025 +0800

        dev: QNN graph merging execute

    commit b73c1c3
    Author: xudaliang <xudaliang@pku.edu.cn>
    Date:   Tue Nov 12 23:28:12 2024 +0800

        feat : support decoding model configuration.

    commit ec3d4e5
    Author: xudaliang <xudaliang@pku.edu.cn>
    Date:   Tue Nov 12 20:31:45 2024 +0800

        feat : support Qwen2.5 npu.

commit 7246d53
Author: yirongjie <yirj0809@gmail.com>
Date:   Tue May 27 07:12:53 2025 +0000

    feat: set run in Backends

commit 1150241
Author: yirongjie <yirj0809@gmail.com>
Date:   Sat May 24 07:57:09 2025 +0000

    fix: getFunc

commit 24db241
Author: yirongjie <yirj0809@gmail.com>
Date:   Fri May 23 05:16:41 2025 +0000

    fix: tensor function <Tensor *> to shared_ptr<Tensor>

commit 0ecce75
Author: yirongjie <yirj0809@gmail.com>
Date:   Thu May 22 14:05:11 2025 +0000

    feat:eager cpu

commit 9835db5
Author: yirongjie <yirj0809@gmail.com>
Date:   Fri Apr 18 14:57:21 2025 +0000

    fix: vtp

commit 30c3046
Author: yirongjie <yirj0809@gmail.com>
Date:   Wed Apr 16 06:49:46 2025 +0000

    fix: vtp

commit b416268
Author: yirongjie <yirj0809@gmail.com>
Date:   Tue Apr 15 08:40:22 2025 +0000

    fix: vtp

commit 6430ca8
Author: yirongjie <yirj0809@gmail.com>
Date:   Mon Apr 14 12:53:58 2025 +0000

    feat: vtp

commit f86bff6
Author: yirongjie <yirj0809@gmail.com>
Date:   Sun Mar 23 09:41:14 2025 +0000

    ref: add ShowUI
@yirongjie yirongjie self-requested a review June 9, 2025 07:41
@yirongjie yirongjie merged commit d11a8e6 into UbiquitousLearning:vlm Jun 9, 2025
1 check passed
yirongjie added a commit that referenced this pull request Jun 10, 2025
* Squashed commit of the following: (the same commit list as shown above, from efde6d0 down to f86bff6)

* feat: add FlashAttention2 && fix: MULTIMODELROPE

* remove broken submodule

---------

Co-authored-by: yirongjie <yirj0809@gmail.com>
Co-authored-by: yi <yi@U-21T7VPF4-1903.local>
yirongjie added a commit to yirongjie/mllm that referenced this pull request Jun 14, 2025
* Squashed commit of the following: (the same commit list as shown above, from efde6d0 down to f86bff6)

* feat: add FlashAttention2 && fix: MULTIMODELROPE

* remove broken submodule

---------

Co-authored-by: yirongjie <yirj0809@gmail.com>
Co-authored-by: yi <yi@U-21T7VPF4-1903.local>
2 participants