
Conversation

@yzh119 yzh119 commented Jul 10, 2023

Currently, mlc-llm produces meaningless results if we compile models with dlight and set the quantization mode to q4f16_0 (which transposes the weight matrix).
This is due to a typo in decode-gemv; thanks to @Hzfengsy for finding this bug.

cc @junrushao @Hzfengsy
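
To illustrate the class of bug being fixed (not the actual TIR kernel, which lives in dlight's decode-gemv rule): for a q4f16_0-style layout the weight matrix is stored transposed, so a GEMV computing y = W @ x must index the stored buffer as Wt[k][j], not Wt[j][k]. A minimal pure-Python sketch, with hypothetical names:

```python
# Hypothetical sketch of a GEMV over a transposed weight layout.
# Wt is stored (in_features, out_features), i.e. Wt[k][j] == W[j][k].

def gemv_over_transposed(Wt, x):
    """y[j] = sum_k Wt[k][j] * x[k]; swapping the indices is the bug."""
    n_in, n_out = len(Wt), len(Wt[0])
    return [sum(Wt[k][j] * x[k] for k in range(n_in)) for j in range(n_out)]

# W is (out=2, in=3); its transposed storage is (in=3, out=2).
W = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
Wt = [list(col) for col in zip(*W)]
x = [1.0, 0.5, 2.0]

print(gemv_over_transposed(Wt, x))  # [8.0, 18.5], matching W @ x
```

With the indices swapped (reading Wt[j][k]), the kernel would either read out of bounds or silently contract over the wrong axis, which is consistent with the "meaningless results" described above.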


tvm-bot commented Jul 10, 2023

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

  • No users to tag found in teams: bugfix, dlight. See #10317 for details.

Generated by tvm-bot

@github-actions github-actions bot requested review from Hzfengsy and junrushao July 10, 2023 11:40
@Hzfengsy

It would be great to add a regression test.
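
A regression test in this spirit could compare the kernel under test against a plain reference GEMV, so a swapped index is caught immediately. A hypothetical sketch, with `decode_gemv` standing in for the scheduled kernel (the real test would build and run the dlight-scheduled module instead):

```python
# Hypothetical regression-test sketch: check a decode-gemv style kernel
# over the transposed weight layout against a straightforward reference.
import random

def reference_gemv(W, x):
    # y = W @ x with W in the natural (out_features, in_features) layout
    return [sum(w * xv for w, xv in zip(row, x)) for row in W]

def decode_gemv(Wt, x):
    # stand-in for the kernel under test; consumes the transposed layout
    n_in, n_out = len(Wt), len(Wt[0])
    return [sum(Wt[k][j] * x[k] for k in range(n_in)) for j in range(n_out)]

def test_transposed_gemv(n_out=8, n_in=16, seed=0):
    rng = random.Random(seed)
    W = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    x = [rng.uniform(-1, 1) for _ in range(n_in)]
    Wt = [list(col) for col in zip(*W)]  # the q4f16_0-style storage
    got, expected = decode_gemv(Wt, x), reference_gemv(W, x)
    assert all(abs(g - e) < 1e-9 for g, e in zip(got, expected))

test_transposed_gemv()
print("ok")
```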

@yzh119 yzh119 marked this pull request as draft July 10, 2023 12:08

yzh119 commented Jul 10, 2023

It turns out this PR alone does not solve the issue.
The currently failing cases include:

  1. q3f16_0
  2. q4f16_0
  3. q4f32_0


tqchen commented Jul 10, 2023

This still looks like the right direction. Let us add a test and merge this in first, then iterate.

@MasterJH5574

Closed since the regression is covered by #15330.

