[NeurIPS 2025 Oral] Official Code for Exploring Diffusion Transformer Designs via Grafting
Updated Jan 9, 2026 · Jupyter Notebook
Run IBM Granite 4.0 locally on a Raspberry Pi 5 with Ollama. This is a privacy-first setup: the model runs 100% locally, so your data never leaves your device. There are no cloud uploads and no third-party tracking.
Mechanistic interpretability (mech-interp) suite for Granite 4 models that use the Mamba-2 architecture.
Early steps toward a long-term goal: making Mamba-2's internal state interpretable.
Systematic study of LoRA fine-tuning strategies for IBM Granite 4.0-H-Micro (Mamba-2 + Transformer hybrid). Demonstrates the impact of architecture-aware target selection and SSM core parameter co-training, including analysis of PEFT serialization behavior. Reports up to 37% relative improvement over LoRA-only baselines.