supports HPU double dequantization #8

rsshaik1 · 2025-05-05T12:08:07Z

This PR adds support for double dequantization on HPU.

* fix 4bit XPU dequant 4bit Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix default value Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix ipex linear set Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix ipex linear set to false when calling state dict Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix Int8Param device patch Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix xpu to cpu Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix xpu cpu data device Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix intel cpu/xpu warning Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix error log Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix lib Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* rm return Nonr Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* error log only without ipex Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix import eerror Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix format Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* enable xpu 8bit optim
* add deqaunt_blockwise
* dequantize_blockwise
* add bakcend synchronize
* refine code
* ipex dep
* ipex dep too
* ipex version check

---------

Co-authored-by: jiqing-feng <jiqing.feng@intel.com>

Authored by: Chetan Kumar Verma <chetan.kumar.verma@intel.com>
Co-authored-by: Ruheena Suhani Shaik <ruheena.suhani.shaik@intel.com>
Co-authored-by: Bhargav Eede <bhargav.eede@intel.com>
Co-authored-by: Vivek Goel <vivek.goel@intel.com>
Co-authored-by: Ruheena Suhani Shaik <rsshaik@habana.ai>

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

vivekgoe · 2025-05-06T05:39:50Z

bitsandbytes/backends/hpu.py

    ) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor, Optional[torch.Tensor]]:
-        assert_on_hpu([A, col_stats, row_stats,  out_col, out_row])
-        return double_quant_impl(A, col_stats, row_stats,  out_col, out_row, threshold)
+        assert_on_hpu([A, col_stats, row_stats, out_col, out_row])


I don't think this function is used anywhere on HPU. Please check, and if that is the case, let's add an assert.
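One way to read the suggestion above, as a minimal sketch: the method keeps its signature but fails immediately if anything on HPU ever calls it. `HPUBackendSketch` is a hypothetical stand-in class, not the real backend in `bitsandbytes/backends/hpu.py`, and the parameter names are simply taken from the diff hunk. The sketch raises `AssertionError` explicitly rather than using a bare `assert` statement, on the assumption that the check should survive `python -O`.

```python
class HPUBackendSketch:
    """Hypothetical stand-in for the HPU backend; not the real bitsandbytes class."""

    def double_quant(self, A, col_stats=None, row_stats=None,
                     out_col=None, out_row=None, threshold=0.0):
        # Assumption: no HPU code path reaches this method, so fail loudly
        # instead of silently running an unused implementation.
        raise AssertionError("double_quant is not expected to be called on HPU")
```

If a caller does turn up later, the error message points straight at the unsupported path.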

vivekgoe · 2025-05-06T05:43:09Z

bitsandbytes/autograd/_functions.py

 ):
    assert quant_state is not None
-    if A.device.type in ("cpu", "xpu") and A.requires_grad == False:
+    if A.device.type in ("cpu", "xpu", "hpu") and A.requires_grad == False:


Let's do this differently. Remove the "hpu" check from here and add a condition so that the `elif` branch is never entered for "hpu"; HPU will then always fall through to the `else` case.

After making this change, if no path reaches gemv_4bit on HPU, remove the body of that function and add an assert there.
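The branching being requested can be sketched in plain Python. This is an illustration only: `choose_4bit_path`, its parameters, and the returned labels are hypothetical, and the `elif` condition (a single-vector input that does not require gradients) is an assumption about the surrounding code, which the diff hunk does not show.

```python
def choose_4bit_path(device_type: str, requires_grad: bool, is_single_vector: bool) -> str:
    """Return a label for the branch a 4-bit matmul call would take (illustrative)."""
    if device_type in ("cpu", "xpu") and not requires_grad:
        # Unchanged first branch: "hpu" is no longer listed here.
        return "cpu_xpu"
    elif device_type != "hpu" and is_single_vector and not requires_grad:
        # Assumed fast path that ends in gemv_4bit; explicitly skipped for HPU.
        return "gemv_4bit"
    else:
        # Generic path; HPU always lands here.
        return "generic"
```

With this shape, `choose_4bit_path("hpu", False, True)` returns `"generic"`, so nothing on HPU reaches the `gemv_4bit` branch, which is the precondition for replacing that function's body with an assert.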

jiqing-feng and others added 8 commits March 18, 2025 10:43

Fix xpu to cpu (bitsandbytes-foundation#1570)

d3658c5

* fix xpu to cpu Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix xpu cpu data device Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

fix double compress 8bit precision (bitsandbytes-foundation#1582)

d180d8e

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

XPU backend support 8bit optimizer (bitsandbytes-foundation#1565)

5c48b33

* enable xpu 8bit optim
* add deqaunt_blockwise
* dequantize_blockwise
* add bakcend synchronize
* refine code
* ipex dep
* ipex dep too
* ipex version check

---------

Co-authored-by: jiqing-feng <jiqing.feng@intel.com>

fix log (bitsandbytes-foundation#1604)

5027e64

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

fix xpu ipex linear in torch2.7 (bitsandbytes-foundation#1618)

263179a

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

rsshaik1 requested review from bhargaveede and ckvermaAI May 5, 2025 12:08

rsshaik1 force-pushed the double_quant branch 2 times, most recently from b862ecc to a35e7e1 Compare May 5, 2025 12:44

update compute_type_is_set attr (bitsandbytes-foundation#1623)

5e267f5

vivekgoe reviewed May 6, 2025

rsshaik1 added 2 commits May 9, 2025 06:28

supports HPU double dequantization

af475df

added hpu specific changes

76a072e

rsshaik1 force-pushed the double_quant branch from a35e7e1 to 76a072e Compare May 9, 2025 04:19

rsshaik1 closed this May 9, 2025
