
FireRedLID vLLM support #46

@PatchouliTIS

Description

Summary

I have adapted FireRedLID for vLLM inference and submitted a PR to the vLLM project:
vllm-project/vllm#39290

The converted model weights are available on Hugging Face:
https://huggingface.co/PatchyTisa/FireRedLID-vllm

Architecture

FireRedLID in vLLM follows the Whisper-style encoder-decoder pattern:

  • Encoder: ConformerEncoder (shared architecture with FireRedASR2)
  • Decoder: TransformerDecoder (6-layer cross-attention)
  • Vocabulary: 120 LID tokens (dict.txt)
  • Output: Up to 2 tokens per utterance (e.g. "en", "zh mandarin")
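As a toy illustration of the "up to 2 tokens per utterance" output contract, a greedy decode loop over an LID vocabulary might look like the sketch below. The vocabulary, scores, and end-of-sequence token here are made up for illustration; the real model decodes over the 120 entries in dict.txt.

```python
# Toy greedy decode over a small LID vocabulary, showing how an
# utterance can yield one token ("en") or two ("zh mandarin").
# VOCAB and the <eos> convention are hypothetical stand-ins for dict.txt.
VOCAB = ["<eos>", "en", "zh", "mandarin", "fr"]

def greedy_lid_decode(step_scores, max_tokens=2):
    """step_scores: one list of per-token scores per decode step, indexed like VOCAB."""
    out = []
    for scores in step_scores[:max_tokens]:
        # Pick the highest-scoring token at this step (greedy decoding).
        tok = VOCAB[max(range(len(scores)), key=scores.__getitem__)]
        if tok == "<eos>":
            break
        out.append(tok)
    return " ".join(out)

# A Mandarin utterance emitting two tokens:
print(greedy_lid_decode([[0.0, 0.1, 0.8, 0.0, 0.1],
                         [0.1, 0.0, 0.0, 0.9, 0.0]]))  # → zh mandarin
```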

Usage

Server:

vllm serve PatchyTisa/FireRedLID-vllm -tp=1 --dtype=float32

Client:

python examples/online_serving/openai_lid_client.py \
    --audio_paths audio_en.wav audio_zh.wav audio_fr.wav
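For reference, openai_lid_client.py in the PR is the authoritative client. The sketch below only illustrates the shape of a request against the OpenAI-compatible audio endpoint that `vllm serve` exposes; the `/v1/audio/transcriptions` route and multipart field names follow the OpenAI transcription API, and whether FireRedLID is wired to that exact route is an assumption. The request is built but not sent, so no server is needed to run it.

```python
# Sketch: build (but do not send) a multipart POST to the
# OpenAI-compatible transcription endpoint, assuming the server
# from `vllm serve PatchyTisa/FireRedLID-vllm` is on localhost:8000.
import io
import uuid
import urllib.request

def build_lid_request(audio_bytes, base_url="http://localhost:8000"):
    boundary = uuid.uuid4().hex
    body = io.BytesIO()

    def part(headers, payload):
        # One multipart/form-data part: headers, blank line, payload.
        body.write(f"--{boundary}\r\n{headers}\r\n\r\n".encode())
        body.write(payload)
        body.write(b"\r\n")

    part('Content-Disposition: form-data; name="model"',
         b"PatchyTisa/FireRedLID-vllm")
    part('Content-Disposition: form-data; name="file"; filename="audio.wav"\r\n'
         'Content-Type: audio/wav',
         audio_bytes)
    body.write(f"--{boundary}--\r\n".encode())

    return urllib.request.Request(
        f"{base_url}/v1/audio/transcriptions",
        data=body.getvalue(),
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )

req = build_lid_request(b"\x00" * 16)
print(req.full_url)  # → http://localhost:8000/v1/audio/transcriptions
```

Sending the prepared request with `urllib.request.urlopen(req)` against a running server would return the decoded LID tokens for the uploaded audio.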
