supports HPU double dequantization #8
* fix 4bit XPU dequant 4bit
* fix default value
* fix ipex linear set
* fix ipex linear set to false when calling state dict
* fix Int8Param device patch
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix xpu to cpu
* fix xpu cpu data device
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix intel cpu/xpu warning
* fix error log
* fix lib
* rm return Nonr
* error log only without ipex
* fix import eerror
* fix format
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* enable xpu 8bit optim
* add deqaunt_blockwise
* dequantize_blockwise
* add bakcend synchronize
* refine code
* ipex dep
* ipex dep too
* ipex version check
---------
Co-authored-by: jiqing-feng <jiqing.feng@intel.com>
Authored by: Chetan Kumar Verma <chetan.kumar.verma@intel.com>
Co-authored-by: Ruheena Suhani Shaik <ruheena.suhani.shaik@intel.com>
Co-authored-by: Bhargav Eede <bhargav.eede@intel.com>
Co-authored-by: Vivek Goel <vivek.goel@intel.com>
Co-authored-by: Ruheena Suhani Shaik <rsshaik@habana.ai>
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
b862ecc to a35e7e1
bitsandbytes/backends/hpu.py
Outdated
) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor, Optional[torch.Tensor]]:
assert_on_hpu([A, col_stats, row_stats, out_col, out_row])
return double_quant_impl(A, col_stats, row_stats, out_col, out_row, threshold)
assert_on_hpu([A, col_stats, row_stats, out_col, out_row])
I don't think this function is used anywhere on HPU. Please check, and if that is the case, let's add an assert.
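One way the suggested guard might look, as a minimal sketch only. `HPUBackendSketch` is a hypothetical stand-in for the backend class in `bitsandbytes/backends/hpu.py`, and the parameter list is copied from the call shown in the diff rather than from the real signature. It raises `AssertionError` explicitly instead of using a bare `assert False`, since `assert` statements are stripped when Python runs with `-O`.

```python
class HPUBackendSketch:
    """Hypothetical stand-in for the HPU backend; not the real bitsandbytes class."""

    def double_quant(self, A, col_stats=None, row_stats=None,
                     out_col=None, out_row=None, threshold=0.0):
        # Assumption: no HPU code path is expected to call this function,
        # so reaching it indicates a dispatch bug and should fail loudly.
        raise AssertionError("double_quant is not expected to be called on HPU")
```

With this in place, any caller that unexpectedly reaches `double_quant` on HPU fails immediately with a clear message rather than running an untested path.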
bitsandbytes/autograd/_functions.py
Outdated
  ):
  assert quant_state is not None
- if A.device.type in ("cpu", "xpu") and A.requires_grad == False:
+ if A.device.type in ("cpu", "xpu", "hpu") and A.requires_grad == False:
Let's do this a different way. Remove the "hpu" check from here and add a check so that we never enter the elif branch for "hpu"; HPU will then always fall through to the else case.
After making this change, if there is no remaining path that reaches gemv_4bit on HPU, remove the body of that function and add an assert there.
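The branching requested above could be sketched as follows. This is an illustration only: `choose_4bit_path` and its string return values are hypothetical, and `takes_gemv_fast_path` stands in for whatever condition the real elif tests, which is not shown in this diff.

```python
def choose_4bit_path(device_type: str, requires_grad: bool,
                     takes_gemv_fast_path: bool) -> str:
    """Return a label for the branch a 4-bit matmul call would take (sketch)."""
    if device_type in ("cpu", "xpu") and not requires_grad:
        # Unchanged CPU/XPU inference branch; "hpu" is no longer listed here.
        return "cpu_xpu_inference"
    elif device_type != "hpu" and takes_gemv_fast_path and not requires_grad:
        # The gemv_4bit fast path, now explicitly skipped for HPU.
        return "gemv_4bit"
    else:
        # Generic path; HPU always ends up here.
        return "generic_matmul"


# HPU never reaches the gemv_4bit branch, even when the fast-path condition holds.
print(choose_4bit_path("hpu", False, True))
```

If HPU can never reach the fast path under this arrangement, the HPU implementation of `gemv_4bit` becomes dead code, which is why the comment asks for its body to be replaced with an assert.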
This PR adds support for double dequantization on HPU.