Update diagnostic functions for ROCm #1333
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -3,11 +3,12 @@ |
| | | |
| | | import torch |
| | | |
| | | + from bitsandbytes.cextension import BNB_BACKEND, HIP_ENVIRONMENT |
| | | from bitsandbytes.consts import PACKAGE_GITHUB_URL |
| | | from bitsandbytes.cuda_specs import get_cuda_specs |
| | | from bitsandbytes.diagnostics.cuda import ( |
| | | - print_cuda_diagnostics, |
| | | - print_cuda_runtime_diagnostics, |
| | | + print_diagnostics, |
| | | + print_runtime_diagnostics, |
| | | ) |
| | | from bitsandbytes.diagnostics.utils import print_dedented, print_header |
| | | |
| | | @@ -16,12 +17,13 @@ def sanity_check(): |
| | | from bitsandbytes.cextension import lib |
| | | |
| | | if lib is None: |
| | | + compute_backend = "cuda" if not HIP_ENVIRONMENT else "hip" |
| | | print_dedented( |
| | | - """ |
| | | + f""" |
| | | Couldn't load the bitsandbytes library, likely due to missing binaries. |
| | | Please ensure bitsandbytes is properly installed. |
| | | |
| | | - For source installations, compile the binaries with `cmake -DCOMPUTE_BACKEND=cuda -S .`. |
| | | + For source installations, compile the binaries with `cmake -DCOMPUTE_BACKEND={compute_backend} -S .`. |
| | | See the documentation for more details if needed. |
| | | |
| | | Trying a simple check anyway, but this will likely fail... |
| | | @@ -49,19 +51,24 @@ def main(): |
| | | |
| | | print_header("OTHER") |
| | | cuda_specs = get_cuda_specs() |
| | | - print("CUDA specs:", cuda_specs) |
| | | + if HIP_ENVIRONMENT: |
| | | + rocm_specs = f" rocm_version_string='{cuda_specs.cuda_version_string}'," |
| | | + rocm_specs += f" rocm_version_tuple={cuda_specs.cuda_version_tuple}" |
| | | + print(f"{BNB_BACKEND} specs:{rocm_specs}") |
| | | + else: |
| | | + print(f"{BNB_BACKEND} specs:{cuda_specs}") |
| | | if not torch.cuda.is_available(): |
| | | - print("Torch says CUDA is not available. Possible reasons:") |
| | | - print("1. CUDA driver not installed") |
| | | - print("2. CUDA not installed") |
| | | - print("3. You have multiple conflicting CUDA libraries") |
| | | + print(f"Torch says {BNB_BACKEND} is not available. Possible reasons:") |
| | | + print(f"1. {BNB_BACKEND} driver not installed") |
| | | + print(f"2. {BNB_BACKEND} not installed") |
| | | + print(f"3. You have multiple conflicting {BNB_BACKEND} libraries") |
| | | if cuda_specs: |
| | | - print_cuda_diagnostics(cuda_specs) |
| | | - print_cuda_runtime_diagnostics() |
| | | + print_diagnostics(cuda_specs) |
| | | + print_runtime_diagnostics() |
| | | print_header("") |
| | | print_header("DEBUG INFO END") |
| | | print_header("") |
| | | - print("Checking that the library is importable and CUDA is callable...") |
| | | + print(f"Checking that the library is importable and {BNB_BACKEND} is callable...") |
| | | try: |
| | | sanity_check() |
| | | print("SUCCESS!") |

Comment on lines +54 to +59
Contributor
Yeah, this smells even more like
Contributor (Author)
This requires updates to cextensions and additional testing. Could we use this workaround for now and make the change after alpha release?
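
The backend-dependent strings this diff introduces can be tried in isolation. The sketch below is not the PR's code: `Specs` is a hypothetical stand-in for whatever `get_cuda_specs()` returns (only the two attributes read in the diff are modelled), `build_hint` and `format_specs` are illustrative helper names, and the backend label is passed in as a plain argument instead of being imported as `BNB_BACKEND`.

```python
from dataclasses import dataclass
from textwrap import dedent


@dataclass(frozen=True)
class Specs:
    # Hypothetical stand-in for the object returned by get_cuda_specs();
    # only the two attributes the diff reads are modelled here.
    cuda_version_string: str
    cuda_version_tuple: tuple


def build_hint(hip_environment: bool) -> str:
    """Return the source-install hint with the matching COMPUTE_BACKEND value."""
    compute_backend = "cuda" if not hip_environment else "hip"
    return dedent(
        f"""
        For source installations, compile the binaries with
        `cmake -DCOMPUTE_BACKEND={compute_backend} -S .`.
        """
    ).strip()


def format_specs(specs: Specs, hip_environment: bool, backend: str) -> str:
    """Build the 'specs' line following the if/else added to main()."""
    if hip_environment:
        # On HIP the CUDA-named fields are relabelled as ROCm fields.
        rocm_specs = f" rocm_version_string='{specs.cuda_version_string}',"
        rocm_specs += f" rocm_version_tuple={specs.cuda_version_tuple}"
        return f"{backend} specs:{rocm_specs}"
    return f"{backend} specs:{specs}"


if __name__ == "__main__":
    example = Specs(cuda_version_string="61", cuda_version_tuple=(6, 1))
    # The "ROCm" label is an assumed value for the backend string.
    print(format_specs(example, hip_environment=True, backend="ROCm"))
    print(build_hint(hip_environment=True))
```

The relabelling in the HIP branch is the workaround discussed in the comment thread above: the spec object keeps its CUDA-named attributes and only the printed text changes.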
Why is this cleaning needed in the first place? I've never seen anything like it in any GitHub Actions workflow I've come across 🤔
The GitHub runner runs into disk space issues during `docker pull`. Those applications are not used, so I deleted them to clear some space.