Summary
- Converting a model to iq3_s or iq3_xxs fails with a fatal error and aborts.
Error
sd -M convert -m realDream_sdxl6.safetensors --type iq3_s
[INFO ] model.cpp:908 - load realDream_sdxl6.safetensors using safetensors format
[INFO ] model.cpp:1985 - model tensors mem size: 2183.06MB
|=> | 55/2641 - 0.00it/s
Oops: found point 103 not on grid: 103 0 0 0
/usr/src/debug/stable-diffusion.cpp-vulkan-git/stable-diffusion.cpp/ggml/src/ggml-quants.c:3929: fatal error
ptrace: Operation not permitted.
No stack.
The program is not being run.
Aborted (core dumped)
Command to test quants:
sd -M convert -m realDream_sdxl6.safetensors --type q4_0
Test model: realDream_sdxl6 ( SDXL | F16 | 6.46GB )
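To repeat the test across several quant types in one run, a loop along these lines could be used. This is only a sketch: `try_quants`, `SD_BIN` and `LOG_DIR` are hypothetical names introduced here, and the only `sd` flags used are the ones from the command above. It assumes that a crashed conversion returns a non-zero exit status.

```shell
# Sketch: try each quant type in turn and report whether the conversion succeeded.
# SD_BIN, LOG_DIR and try_quants are hypothetical names, not part of sd itself.
SD_BIN="${SD_BIN:-sd}"

try_quants() {
    local model="$1"
    shift
    local q
    for q in "$@"; do
        # Same invocation as in the report; output goes to one log file per type.
        if "$SD_BIN" -M convert -m "$model" --type "$q" \
            >"${LOG_DIR:-.}/convert_${q}.log" 2>&1; then
            echo "$q ok"
        else
            # An abort such as "Aborted (core dumped)" ends up here.
            echo "$q FAILED (exit $?)"
        fi
    done
}

# Example (uncomment to run):
# try_quants realDream_sdxl6.safetensors q4_0 iq3_xxs iq3_s
```

Each run uses the default output path, so this only checks which types convert without crashing; it does not keep the converted files apart.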
Speed to convert quants (almost all of them)
| quant | model tensor mem size | it/s |
| --- | --- | --- |
| tq1_0 | 1565.20MB | 14.49 |
| tq2_0 | 1697.60MB | 13.51 |
| q2_K | 1896.20MB | 4.52 |
| iq3_xxs | 2050.66MB | no (fatal error) |
| iq3_s | 2183.06MB | no (fatal error) |
| q3_K | 2183.06MB | 8.26 |
| iq4_xs | 2469.92MB | 1.61 |
| iq4_nl | 2479.52MB | 1.78 |
| q4_0 | 2479.52MB | 10.00 |
| q4_K | 2558.19MB | 5.46 |
| q4_1 | 2659.47MB | 5.56 |
| q5_0 | 2839.42MB | 9.80 |
| q5_K | 2911.25MB | 5.26 |
| q5_1 | 3019.38MB | 5.52 |
| q6_K | 3286.38MB | 5.95 |
| q8_0 | 3919.13MB | 13.70 |
Time to finish conversion:
- q8_0: 4m16s
- iq4_xs: 19m52s (very very slow)
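For reference, the ratio between these two wall-clock times can be checked with a few lines of shell. `to_seconds` and `speedup` are hypothetical helpers written for this note, and they assume durations in the `XmYs` form used above.

```shell
# to_seconds: convert a duration such as "19m52s" to seconds (hypothetical helper).
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ print $1 * 60 + $2 }'
}

# speedup: how many times longer the first duration is than the second.
speedup() {
    awk -v slow="$(to_seconds "$1")" -v fast="$(to_seconds "$2")" \
        'BEGIN { printf "%.2fx\n", slow / fast }'
}

speedup 19m52s 4m16s   # 1192 s / 256 s, prints 4.66x
```

The exact ratio is 4.65625, so it reads as 4.65x when truncated and 4.66x when rounded.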
Conclusions
- Fatal error with iq3_xxs and iq3_s, maybe with other types too.
- Conversion appears to use only one CPU core; could it be multithreaded? q8_0 finishes about 4.66x faster than iq4_xs (4m16s vs 19m52s).
- q8_0 converts faster than q4_0 (13.70 vs 10.00 it/s).
System:
OS: Arch Linux x86_64
Kernel: Linux 6.12.24-1-lts
Shell: bash 5.2.37
WM: dwm (X11)
Terminal: tmux 3.5a
CPU: Intel(R) Core(TM) i7-4790 (8) @ 3.60 GHz
GPU: NVIDIA GeForce GTX 1660 SUPER [Discrete] (6GB)
Memory: 2.47 GiB / 15.56 GiB (16%)
Locale: en_US.UTF-8