    Repositories list

    • LLMRouter

      Public
      LLMRouter: An Open-Source Library for LLM Routing
      Python
      65000 Updated Jan 1, 2026
    • DistCA

      Public
      Efficient Long-context Language Model Training by Core Attention Disaggregation
      Python
      4000 Updated Dec 20, 2025
    • cornserve

      Public
      Easy, Fast, and Scalable Multimodal AI
      Python
      6000 Updated Dec 17, 2025
    • C++
      5000 Updated Dec 9, 2025
    • fastrl

      Public
      Efficient Reinforcement Learning for Language Models
      Python
      8000 Updated Nov 21, 2025
    • miles

      Public
      Python
      69000 Updated Nov 20, 2025
    • NexRL

      Public
      NexRL is an ultra-loosely-coupled LLM post-training framework.
      Python
      4000 Updated Nov 18, 2025
    • Nex Venus Communication Library
      C++
      6000 Updated Nov 17, 2025
    • An implementation of Paper "Empowering Agentic Video Analytics Systems with Video Language Models"
      Python
      3100 Updated Nov 5, 2025
    • DynaPipe

      Public
      Python
      1000 Updated Oct 23, 2025
    • Python
      2000 Updated Oct 15, 2025
    • StreamingVLM: Real-Time Understanding for Infinite Video Streams
      Python
      51000 Updated Oct 15, 2025
    • StreamingVLM: Real-Time Understanding for Infinite Video Streams
      Python
      51000 Updated Oct 13, 2025
    • LongLive

      Public
      LongLive: Real-time Interactive Long Video Generation
      Python
      65000 Updated Oct 13, 2025
    • Python
      3000 Updated Oct 11, 2025
    • Python
      6000 Updated Oct 1, 2025
    • Python
      3000 Updated Sep 26, 2025
    • BurstEngine is an efficient framework designed to train LLMs on long-sequence data.
      Python
      3000 Updated Sep 25, 2025
    • Boosting GPU utilization for LLM serving via dynamic spatial-temporal prefill & decode orchestration
      Python
      2000 Updated Sep 24, 2025
    • LoRAFusion: Efficient LoRA Fine-Tuning for LLMs
      Python
      2000 Updated Sep 23, 2025
    • The official implementation of OSDI'25 paper BlitzScale
      Rust
      2000 Updated Sep 20, 2025
    • The official implementation of OSDI'25 paper BlitzScale
      Rust
      2100 Updated Sep 20, 2025
    • 25SC-gLLM

      Public
      gLLM: Global Balanced Pipeline Parallelism System for Distributed LLM Serving with Token Throttling
      Python
      4000 Updated Sep 15, 2025
    • Rust
      11000 Updated Sep 10, 2025
    • The RTOS components for the CHERIoT research platform
      C++
      59000 Updated Sep 8, 2025
    • Jupyter Notebook
      3000 Updated Sep 6, 2025
    • Gorgeous

      Public
      C++
      3000 Updated Sep 2, 2025
    • C++
      2000 Updated Sep 2, 2025
    • Artifact for SOSP 25 paper: Scalable Far Memory: Balancing Faults and Evictions
      C
      1000 Updated Aug 31, 2025
    • Python
      1000 Updated Aug 27, 2025