mech-interp suite for Granite4 models that use the Mamba-2 architecture
Updated Feb 21, 2026 - Python
Early steps toward a long-term vision for Mamba-2 state interpretability.
Systematic study of LoRA fine-tuning strategies for IBM Granite 4.0-H-Micro (Mamba-2 + Transformer hybrid). Demonstrates the impact of architecture-aware target selection and SSM core parameter co-training, including analysis of PEFT serialization behavior. Reports up to 37% relative improvement over LoRA-only baselines.
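The description above points at a concrete recipe: apply LoRA to both the attention and Mamba-2 projection layers, and unfreeze the SSM core parameters so they train alongside the adapters. Below is a minimal sketch of that setup with Hugging Face PEFT; the model ID and all module/parameter names are assumptions drawn from common Mamba-2 implementations, not taken from the repository itself.

```python
# Hedged sketch of architecture-aware LoRA targeting plus SSM core co-training,
# using Hugging Face PEFT. Module and parameter names (q_proj, in_proj, A_log,
# dt_bias, D) are assumptions based on common Mamba-2 implementations and may
# differ in the actual Granite 4.0-H-Micro checkpoint.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("ibm-granite/granite-4.0-h-micro")

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    # Architecture-aware selection: cover the attention projections of the
    # Transformer layers and the in/out projections of the Mamba-2 layers.
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "in_proj", "out_proj"],
    task_type="CAUSAL_LM",
)
peft_model = get_peft_model(model, lora_config)

# Co-train the small SSM core parameters (state decay, step-size bias, skip
# term) alongside the LoRA adapters. These are plain nn.Parameters rather than
# modules, so they are unfrozen directly; note that PEFT's save_pretrained()
# only serializes adapter weights and modules_to_save, which is the kind of
# serialization pitfall the study above analyzes.
for name, param in peft_model.named_parameters():
    if any(key in name for key in ("A_log", "dt_bias", ".D")):
        param.requires_grad = True

peft_model.print_trainable_parameters()
```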