@jiqing-feng (Contributor) commented Mar 17, 2025

This PR makes two changes:

1. Fixes a dequantization value error: the default value "fp4" was checked first, so it was always used, even when quant_state specified a different one. The fix is to check quant_state first.
2. Uses the IPEX dequant_4bit kernel for 4-bit dequantization on XPU. This can bring a 1.5x speed-up on QLoRA.

The dequant_4bit kernel itself is at least 2x faster.
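
For reviewers, the ordering bug described above can be illustrated with a minimal sketch. This is not the bitsandbytes code: `QuantState`, `resolve_buggy`, and `resolve_fixed` are hypothetical names, and the sketch only assumes that the quantization type arrives both as a keyword argument defaulting to "fp4" and as a field on the quant state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QuantState:
    """Hypothetical stand-in for a quantization state object."""
    quant_type: str


def resolve_buggy(quant_state: Optional[QuantState], quant_type: str = "fp4") -> str:
    # The keyword default is a non-empty string, so this branch always wins
    # and the type stored on quant_state is never consulted.
    if quant_type:
        return quant_type
    return quant_state.quant_type


def resolve_fixed(quant_state: Optional[QuantState], quant_type: str = "fp4") -> str:
    # Consult quant_state first; fall back to the keyword default only
    # when no state was supplied.
    if quant_state is not None:
        return quant_state.quant_type
    return quant_type


state = QuantState(quant_type="nf4")
buggy = resolve_buggy(state)
fixed = resolve_fixed(state)
fallback = resolve_fixed(None)
```

With an "nf4" state, the first variant still resolves to the "fp4" default, while the second one picks up the type from the state and only falls back to the default when no state is passed.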

@jiqing-feng marked this pull request as ready for review March 17, 2025 08:17
@jiqing-feng
Contributor Author

Hi @Titus-von-Koeller @matthewdouglas. This PR is ready to be merged; please review it. Thanks!

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
@Titus-von-Koeller merged commit 8fe6325 into bitsandbytes-foundation:multi-backend-refactor Mar 18, 2025
@jiqing-feng deleted the 4bit branch March 31, 2025 03:14