The image-encoder conversion script fails with size mismatches when loading the state dict. See the log below.
Traceback (most recent call last):
  File "/home/moe/llama.cpp/./examples/minicpmv/minicpmv-convert-image-encoder-to-gguf.py", line 156, in <module>
    model.load_state_dict(torch.load(os.path.join(dir_model, "minicpmv.clip")))
  File "/home/moe/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2189, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Idefics2VisionTransformer:
size mismatch for encoder.layers.0.self_attn.k_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.0.self_attn.v_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.0.self_attn.q_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.0.mlp.fc1.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([4304, 1152]).
size mismatch for encoder.layers.0.mlp.fc2.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([1152, 4304]).
size mismatch for encoder.layers.1.self_attn.k_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.1.self_attn.v_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.1.self_attn.q_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.1.mlp.fc1.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([4304, 1152]).
size mismatch for encoder.layers.1.mlp.fc2.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([1152, 4304]).
size mismatch for encoder.layers.2.self_attn.k_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.2.self_attn.v_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.2.self_attn.q_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.2.mlp.fc1.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([4304, 1152]).
size mismatch for encoder.layers.2.mlp.fc2.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([1152, 4304]).
size mismatch for encoder.layers.3.self_attn.k_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.3.self_attn.v_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.3.self_attn.q_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.3.mlp.fc1.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([4304, 1152]).
size mismatch for encoder.layers.3.mlp.fc2.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([1152, 4304]).
size mismatch for encoder.layers.4.self_attn.k_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.4.self_attn.v_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.4.self_attn.q_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.4.mlp.fc1.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([4304, 1152]).
size mismatch for encoder.layers.4.mlp.fc2.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([1152, 4304]).
size mismatch for encoder.layers.5.self_attn.k_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.5.self_attn.v_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.5.self_attn.q_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.5.mlp.fc1.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([4304, 1152]).
size mismatch for encoder.layers.5.mlp.fc2.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([1152, 4304]).
size mismatch for encoder.layers.6.self_attn.k_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.6.self_attn.v_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.6.self_attn.q_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.6.mlp.fc1.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([4304, 1152]).
size mismatch for encoder.layers.6.mlp.fc2.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([1152, 4304]).
size mismatch for encoder.layers.7.self_attn.k_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.7.self_attn.v_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.7.self_attn.q_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.7.mlp.fc1.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([4304, 1152]).
size mismatch for encoder.layers.7.mlp.fc2.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([1152, 4304]).
size mismatch for encoder.layers.8.self_attn.k_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.8.self_attn.v_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.8.self_attn.q_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.8.mlp.fc1.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([4304, 1152]).
size mismatch for encoder.layers.8.mlp.fc2.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([1152, 4304]).
size mismatch for encoder.layers.9.self_attn.k_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.9.self_attn.v_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.9.self_attn.q_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.9.mlp.fc1.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([4304, 1152]).
size mismatch for encoder.layers.9.mlp.fc2.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([1152, 4304]).
size mismatch for encoder.layers.10.self_attn.k_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.10.self_attn.v_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.10.self_attn.q_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.10.mlp.fc1.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([4304, 1152]).
size mismatch for encoder.layers.10.mlp.fc2.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([1152, 4304]).
size mismatch for encoder.layers.11.self_attn.k_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.11.self_attn.v_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.11.self_attn.q_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.11.mlp.fc1.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([4304, 1152]).
size mismatch for encoder.layers.11.mlp.fc2.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([1152, 4304]).
size mismatch for encoder.layers.12.self_attn.k_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.12.self_attn.v_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.12.self_attn.q_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.12.mlp.fc1.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([4304, 1152]).
size mismatch for encoder.layers.12.mlp.fc2.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([1152, 4304]).
size mismatch for encoder.layers.13.self_attn.k_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.13.self_attn.v_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.13.self_attn.q_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.13.mlp.fc1.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([4304, 1152]).
size mismatch for encoder.layers.13.mlp.fc2.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([1152, 4304]).
size mismatch for encoder.layers.14.self_attn.k_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.14.self_attn.v_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.14.self_attn.q_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.14.mlp.fc1.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([4304, 1152]).
size mismatch for encoder.layers.14.mlp.fc2.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([1152, 4304]).
size mismatch for encoder.layers.15.self_attn.k_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.15.self_attn.v_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.15.self_attn.q_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.15.mlp.fc1.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([4304, 1152]).
size mismatch for encoder.layers.15.mlp.fc2.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([1152, 4304]).
size mismatch for encoder.layers.16.self_attn.k_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.16.self_attn.v_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.16.self_attn.q_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.16.mlp.fc1.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([4304, 1152]).
size mismatch for encoder.layers.16.mlp.fc2.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([1152, 4304]).
size mismatch for encoder.layers.17.self_attn.k_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.17.self_attn.v_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.17.self_attn.q_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.17.mlp.fc1.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([4304, 1152]).
size mismatch for encoder.layers.17.mlp.fc2.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([1152, 4304]).
size mismatch for encoder.layers.18.self_attn.k_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.18.self_attn.v_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.18.self_attn.q_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.18.mlp.fc1.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([4304, 1152]).
size mismatch for encoder.layers.18.mlp.fc2.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([1152, 4304]).
size mismatch for encoder.layers.19.self_attn.k_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.19.self_attn.v_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.19.self_attn.q_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.19.mlp.fc1.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([4304, 1152]).
size mismatch for encoder.layers.19.mlp.fc2.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([1152, 4304]).
size mismatch for encoder.layers.20.self_attn.k_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.20.self_attn.v_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.20.self_attn.q_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.20.mlp.fc1.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([4304, 1152]).
size mismatch for encoder.layers.20.mlp.fc2.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([1152, 4304]).
size mismatch for encoder.layers.21.self_attn.k_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.21.self_attn.v_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.21.self_attn.q_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.21.mlp.fc1.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([4304, 1152]).
size mismatch for encoder.layers.21.mlp.fc2.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([1152, 4304]).
size mismatch for encoder.layers.22.self_attn.k_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.22.self_attn.v_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.22.self_attn.q_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.22.mlp.fc1.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([4304, 1152]).
size mismatch for encoder.layers.22.mlp.fc2.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([1152, 4304]).
size mismatch for encoder.layers.23.self_attn.k_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.23.self_attn.v_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.23.self_attn.q_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.23.mlp.fc1.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([4304, 1152]).
size mismatch for encoder.layers.23.mlp.fc2.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([1152, 4304]).
size mismatch for encoder.layers.24.self_attn.k_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.24.self_attn.v_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.24.self_attn.q_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.24.mlp.fc1.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([4304, 1152]).
size mismatch for encoder.layers.24.mlp.fc2.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([1152, 4304]).
size mismatch for encoder.layers.25.self_attn.k_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.25.self_attn.v_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.25.self_attn.q_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.25.mlp.fc1.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([4304, 1152]).
size mismatch for encoder.layers.25.mlp.fc2.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([1152, 4304]).
size mismatch for encoder.layers.26.self_attn.k_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.26.self_attn.v_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.26.self_attn.q_proj.weight: copying a param with shape torch.Size([663552, 1]) from checkpoint, the shape in current model is torch.Size([1152, 1152]).
size mismatch for encoder.layers.26.mlp.fc1.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([4304, 1152]).
size mismatch for encoder.layers.26.mlp.fc2.weight: copying a param with shape torch.Size([2479104, 1]) from checkpoint, the shape in current model is torch.Size([1152, 4304]).
What happened?
Failed to load a 4-bit model with
It errors out with size mismatches on the q/k/v projection and MLP weights of each encoder layer (0 through 26). See the traceback above.
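One thing I noticed in the shapes, as a rough check rather than a diagnosis: every mismatched checkpoint tensor seems to hold exactly half as many elements as the model expects. That would be consistent with two 4-bit values being packed into each stored byte, which is how bitsandbytes-style 4-bit checkpoints are usually laid out. Whether this file was actually produced that way is an assumption on my part; the log does not confirm it. The names `mismatches` and `packing_ratio` below are made up for illustration.

```python
import math

# Shapes copied from the traceback: (checkpoint shape, shape the model expects).
mismatches = {
    "self_attn.{q,k,v}_proj.weight": ((663552, 1), (1152, 1152)),
    "mlp.fc1.weight": ((2479104, 1), (4304, 1152)),
    "mlp.fc2.weight": ((2479104, 1), (1152, 4304)),
}


def packing_ratio(checkpoint_shape, model_shape):
    """Expected element count divided by the stored element count."""
    return math.prod(model_shape) / math.prod(checkpoint_shape)


ratios = {
    name: packing_ratio(ckpt, model)
    for name, (ckpt, model) in mismatches.items()
}

for name, ratio in ratios.items():
    # A ratio of 2.0 means the checkpoint stores half as many elements.
    print(f"{name}: {ratio}")
```

If the ratio is 2.0 for all three, the checkpoint probably contains packed quantized weights rather than full-precision ones, so `load_state_dict` cannot copy them into the unquantized `Idefics2VisionTransformer` as-is.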
Name and Version
Commit 6366d62
What operating system are you seeing the problem on?
Linux
Relevant log output