fix: nemotron_flash_1b_squad checkpoint robustness (tied weights, remote-code import recursion, layer_types) #1945
+133 −6