Thank you for sharing such insightful work! However, when I tried to run the MASQuant code, I encountered several issues:
- The conda environment is very difficult to configure. It would be helpful if the exact versions of the dependency packages could be clearly specified.
- The codebase appears somewhat messy, and there seems to be a lot of code that is unrelated to multimodal models.
- In the provided examples, I did not find any logic related to the use of LoRA, which seems inconsistent with the description in the paper.
- Regarding the actual inference speed, it seems that the CUDA kernel implementation has not been provided.