
HyperKuvid Labs

HyperKuvid-Labs is a student collective focused on research-level experiments in reinforcement learning, CUDA programming, and core ML research. We learn the hard way: by breaking things, rebuilding them, and figuring it out ourselves.

Focus Areas

  • Reinforcement Learning: Group Relative Policy Optimization (GRPO), AlphaZero-style training, and model exploration.
  • Advanced ML Optimization: Matrix-aware optimizers (Muon), speculative decoding, and energy-aware inference.
  • Compiler & System Engineering: CRDT state management, compiler optimization, and lock-free data structures.
  • Formal Verification & Safety: Hybrid AI frameworks, constraint-based validation, and autonomous code generation.
  • Distributed Systems: Distributed training, distributed queue servers, and high-performance inter-process communication.
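
GRPO, mentioned under Reinforcement Learning above, scores each sampled completion relative to its sampling group instead of using a learned value function. A minimal sketch of that normalization step (a generic illustration, not code from any of our repos):

```python
import statistics

def group_relative_advantages(rewards):
    """GRPO-style advantage: normalize each completion's reward
    against the mean and std of its sampling group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    if std == 0:
        return [0.0] * len(rewards)
    return [(r - mean) / std for r in rewards]

# Four completions sampled for one prompt, scored by a reward model:
advs = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
# advantages sum to ~0; above-mean completions get positive weight
```

These advantages then weight the per-token policy-gradient loss, so no critic network is needed.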

Top Repositories

  • alpha-stack — LLM-powered dev agent with project scaffolding, automated iteration, and multi-stack support.
  • FrugalSOT — Adaptive model selection for efficient on-device NLP inference with dynamic routing.
  • SpecQuant — Adaptive LLM serving with prompt complexity classification and quantized draft model selection.
  • AlphaDesign — RL and genetic algorithms for F1 front wing aerodynamic optimization.
  • Phydra — 3D bin packing cargo management system with A*/Dijkstra pathfinding in C++.
  • DGAT — Instant LLM-annotated dependency graph generation for any codebase.
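
Bin packing, the problem family behind Phydra, is NP-hard even in one dimension, so heuristics do the heavy lifting. As a toy illustration, a first-fit decreasing sketch (a deliberate 1D simplification in Python, not Phydra's C++ implementation):

```python
def first_fit_decreasing(items, capacity):
    """First-fit decreasing heuristic for 1D bin packing: place each
    item (largest first) into the first bin with room, else open a bin."""
    bins = []  # list of (remaining_capacity, packed_items)
    for item in sorted(items, reverse=True):
        for i, (rem, packed) in enumerate(bins):
            if item <= rem:
                bins[i] = (rem - item, packed + [item])
                break
        else:
            bins.append((capacity - item, [item]))
    return [packed for _, packed in bins]

# Six items into bins of capacity 6 -> packs into 3 bins
bins = first_fit_decreasing([4, 3, 3, 2, 2, 2], capacity=6)
```

The 3D version adds orientation and stability constraints, which is where the A*/Dijkstra pathfinding comes in.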

Top Experimental Repositories

Only muon_exps and reinforcement_learning_llms have seen meaningful progress. The rest are currently stuck — we could use mentorship and guidance on these.

  • AlphaD-RL — Multi-teacher MCTS for code generation with execution-based rewards from unit tests.
  • elastic_continual_learning — Adaptive optimization framework to mitigate catastrophic forgetting in neural networks.
  • energy_throttling_llms — Energy-aware DDPG RL for dynamic LLM speculative decoding under thermal constraints.
  • muon_exps — CUDA implementation of Muon optimizer with Newton-Schulz decomposition, benchmarked on A100 with Llama 3.1 8B.
  • reinforcement_learning_llms — Experiments with RL objectives (PPO, GRPO) and MaxRL variants for LLM training.
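
The Newton-Schulz step in muon_exps orthogonalizes a gradient matrix using only matrix multiplies, which is why it maps well to CUDA. A sketch of the classic cubic iteration (Muon itself uses tuned higher-order coefficients; this is our simplified stand-in, not the repo's CUDA kernel):

```python
import numpy as np

def newton_schulz_orthogonalize(G, steps=12):
    """Drive the singular values of G toward 1 using only matmuls
    (cubic Newton-Schulz iteration). Frobenius normalization keeps
    every singular value below 1, inside the basin of convergence."""
    X = G / (np.linalg.norm(G) + 1e-12)  # matrix norm defaults to Frobenius
    for _ in range(steps):
        X = 1.5 * X - 0.5 * (X @ X.T) @ X
    return X

# A non-orthogonal 2x2 matrix becomes (numerically) orthogonal:
X = newton_schulz_orthogonalize(np.array([[2.0, 1.0], [0.0, 1.0]]))
```

The result keeps the singular vectors of the input while replacing its singular values with 1, i.e. it approximates the nearest orthogonal matrix without an explicit SVD.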

Pinned Repositories

  1. alphazero_llm_trainer (Python): AlphaZero-style RL training for LLMs using MCTS on mathematical reasoning tasks (GSM8K). A student model explores reasoning paths guided by teacher ensembles and reward signals.

  2. FrugalSOT (TypeScript): An adaptive model selection system for efficient on-device NLP inference, enhancing speed, privacy, and resource use on edge devices.

  3. AlphaDesign (Python): Hybrid AI framework combining reinforcement learning and genetic algorithms to optimize Formula 1 front wing aerodynamic designs. Features neural network-guided optimization, CFD analysis, structur…

  4. energy_throttling_llms (Python): Energy-aware DDPG RL framework that dynamically optimizes LLM speculative decoding parameters based on real-time hardware metrics (CPU/GPU temps, battery). Maintains 95-98% energy utilization to ma…

  5. alpha-stack (Python): Intelligent agent that converts natural language prompts into production-ready multi-file codebases with automatic dependency resolution, Docker validation, and iterative error correction.

  6. SpecQuant (Python): Scalable framework for adaptive LLM serving: classify prompt complexity → select quantized drafts → verify with FP16 target, no model retraining required.
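
The draft-then-verify loop behind SpecQuant follows the standard speculative decoding pattern: a cheap draft model proposes several tokens at once, and the FP16 target keeps only the prefix it agrees with. A generic greedy-verification sketch (our illustration of the pattern, not the repo's code):

```python
def greedy_verify(draft_tokens, target_argmax):
    """Accept the longest prefix of the draft's proposal that matches
    the target model's own argmax tokens; on the first mismatch,
    emit the target's correction and stop."""
    accepted = []
    for d, t in zip(draft_tokens, target_argmax):
        if d == t:
            accepted.append(d)
        else:
            accepted.append(t)  # the target's token replaces the mismatch
            break
    return accepted
```

Because the target scores all draft positions in one forward pass, every accepted token is a decoding step saved, and the output is identical to what greedy decoding with the target alone would produce.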
