Popular repositories
- llama-turboquant (Public, forked from animehacker/llama-turboquant)
  TurboQuant for GGML: 4.57x KV cache compression with 72K+ context for Llama-3.3-70B on consumer GPUs (see the back-of-envelope check after this list).
  Language: C++
- OpenArc (Public, forked from SearchSavior/OpenArc)
  Inference engine for Intel devices. Serves LLMs, VLMs, Whisper, Kokoro-TTS, embedding, and rerank models over OpenAI-compatible endpoints (a usage sketch follows this list).
  Language: Python
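
The 4.57x and 72K+ figures in the llama-turboquant description can be sanity-checked with rough arithmetic. The sketch below assumes Llama-3.3-70B's published architecture (80 decoder layers, 8 KV heads under grouped-query attention, head dimension 128) and an fp16 baseline cache; only the compression ratio is taken from the description, everything else is an assumption for illustration.

```python
# Back-of-envelope KV-cache sizing for Llama-3.3-70B.
# Architecture numbers follow the published Llama-3.3-70B config;
# the 4.57x ratio comes from the repo description above.

N_LAYERS = 80       # decoder layers
N_KV_HEADS = 8      # KV heads (grouped-query attention)
HEAD_DIM = 128      # per-head dimension
BYTES_FP16 = 2      # baseline cache element size in bytes
COMPRESSION = 4.57  # claimed TurboQuant compression ratio

def kv_cache_bytes(n_tokens: int, bytes_per_elem: float = BYTES_FP16) -> float:
    """Total KV-cache size: K and V tensors across all layers and KV heads."""
    elems_per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM  # 2 = K + V
    return n_tokens * elems_per_token * bytes_per_elem

ctx = 72 * 1024  # 72K tokens
fp16_gib = kv_cache_bytes(ctx) / 2**30
compressed_gib = fp16_gib / COMPRESSION

print(f"fp16 KV cache @ 72K tokens: {fp16_gib:.1f} GiB")       # ~22.5 GiB
print(f"after 4.57x compression:    {compressed_gib:.1f} GiB")  # ~4.9 GiB
```

At roughly 4.9 GiB the compressed cache fits within a consumer card's memory budget, whereas the uncompressed ~22.5 GiB would not, which is what makes the 72K context claim plausible.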
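
Because OpenArc serves models over OpenAI-compatible endpoints (per its description), any standard OpenAI client should be able to talk to it. Below is a minimal sketch using the official openai Python package; the base URL, API key, and model name are hypothetical placeholders, not values taken from the OpenArc documentation.

```python
# Minimal sketch: querying an OpenAI-compatible server such as OpenArc.
# Base URL, API key, and model name are illustrative assumptions only.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # hypothetical local OpenArc address
    api_key="not-needed-for-local",       # local servers often ignore the key
)

response = client.chat.completions.create(
    model="llama-3.3-70b",  # placeholder; use whatever model OpenArc is serving
    messages=[{"role": "user", "content": "Summarize grouped-query attention."}],
)
print(response.choices[0].message.content)
```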