eljrte

3 followers · 6 following

Popular repositories

multi_turn_LLM_Inference Public

Shell 1
attention-is-all-you-need-pytorch Public

Forked from jadore801120/attention-is-all-you-need-pytorch

A PyTorch implementation of the Transformer model in "Attention is All You Need".

Python
server Public

Forked from triton-inference-server/server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.

Python
vllm Public

Forked from vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python
specinfer-ae Public

Forked from goliaro/specinfer-ae

Shell
PowerInfer Public

Forked from Tiiny-AI/PowerInfer

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

C++