A student collective learning the hard way — by breaking things, rebuilding them, and figuring it out ourselves.
- Reinforcement Learning: Group Relative Policy Optimization (GRPO), AlphaZero-style training, and model exploration.
- Advanced ML Optimization: Matrix-aware optimizers (Muon), speculative decoding, and energy-aware inference.
- Compiler & System Engineering: CRDT state management, compiler optimization, and lock-free data structures.
- Formal Verification & Safety: Hybrid AI frameworks, constraint-based validation, and autonomous code generation.
- Distributed Systems: Distributed training, distributed queue servers, and high-performance inter-process communication.
- alpha-stack — LLM-powered dev agent with project scaffolding, automated iteration, and multi-stack support.
- FrugalSOT — Adaptive model selection for efficient on-device NLP inference with dynamic routing.
- SpecQuant — Adaptive LLM serving with prompt complexity classification and quantized draft model selection.
- AlphaDesign — RL and genetic algorithms for F1 front wing aerodynamic optimization.
- Phydra — 3D bin packing cargo management system with A*/Dijkstra pathfinding in C++.
- DGAT — Instant LLM-annotated dependency graph generation for any codebase.
Only `muon_exps` and `reinforcement_learning_llms` have seen meaningful progress. The rest are currently stuck; we could use mentorship and guidance on these.
- AlphaD-RL — Multi-teacher MCTS for code generation with execution-based rewards from unit tests.
- elastic_continual_learning — Adaptive optimization framework to mitigate catastrophic forgetting in neural networks.
- energy_throttling_llms — Energy-aware DDPG RL for dynamic LLM speculative decoding under thermal constraints.
- muon_exps — CUDA implementation of the Muon optimizer with Newton-Schulz orthogonalization, benchmarked on an A100 with Llama 3.1 8B.
- reinforcement_learning_llms — Experiments with RL objectives (PPO, GRPO) and MaxRL variants for LLM training.
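
For readers new to GRPO, the core idea behind the group-relative objective mentioned above can be sketched in a few lines: each sampled completion is scored against the other completions drawn for the same prompt, instead of against a learned value baseline. This is a minimal illustration, not code from the `reinforcement_learning_llms` repo. The function name `group_relative_advantages`, the `eps` parameter, and the use of the population standard deviation are assumptions made for the example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group.

    `rewards` holds one scalar reward per completion sampled for a
    single prompt. Hypothetical helper; population std is an assumption.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps keeps the division finite when every completion gets the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four completions for one prompt, rewarded 1.0 if the answer was correct
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Completions that beat the group mean get a positive advantage and the rest get a negative one. If every completion in a group receives the same reward, all advantages are zero and that group contributes no gradient signal.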