Support for LFM2 architecture (Liquid Foundation Model 2) #22287

@burkun

Description

Feature Request: Add support for LFM2 architecture

Background

LFM2 (Liquid Foundation Model 2) is a novel architecture developed by Liquid AI that combines convolution and attention mechanisms in a hybrid design. The model achieves competitive quality with efficient inference characteristics.

Architecture Overview

LFM2 introduces a unique hybrid structure:

- Conv Blocks: 10 layers, O(n) complexity for local features
- Attention Blocks: 6 layers, O(n²) complexity for global dependencies
- Total: 16 layers (mixed)
- SwiGLU FFN with parallel residual connections
- GQA: 16 query heads, 8 KV heads
- RoPE with theta=1M
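The conv/attention split above can be expressed as a simple per-layer dispatch table. A minimal sketch (function and label names are my own, not the actual LFM2 code):

```python
def build_layer_types(num_layers=16, conv_layer_ids=range(10)):
    """Label each layer 'conv' or 'attn' per the hybrid layout described above."""
    conv_ids = set(conv_layer_ids)
    return ["conv" if i in conv_ids else "attn" for i in range(num_layers)]

layer_types = build_layer_types()
# 10 conv layers (0-9) followed by 6 attention layers (10-15)
```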

Key Innovation: Custom Conv Block

The Conv block replaces self-attention in certain layers:

x → in_proj (3x expansion) 
  → chunk into 3 parts
  → shift each part (0, 1, 2 positions) 
  → sum (causal convolution equivalent)
  → Depthwise Conv1d (kernel=3)
  → out_proj

This provides O(n) complexity for local patterns while attention handles global context.
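A tiny numeric illustration of the shift-and-sum step, in plain Python with made-up values (one channel, no framework):

```python
def causal_shift_sum(chunks):
    """Shift chunk i right by i timesteps (zero-padded at the front) and sum.

    Mirrors the (0, 1, 2)-position shifts in the pipeline above: output
    position t only mixes inputs at t, t-1, t-2, so the operation is causal
    and O(n) in sequence length.
    """
    T = len(chunks[0])
    out = [0.0] * T
    for i, c in enumerate(chunks):
        for t in range(T):
            out[t] += c[t - i] if t - i >= 0 else 0.0
    return out

mixed = causal_shift_sum([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]])
# mixed == [1, 3, 6, 9]: each position sums the current and up to two previous values
```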

Why Support This Architecture?

  1. Novel hybrid design - Combines best of both worlds (Conv + Attention)
  2. Efficient inference - Conv blocks are faster than attention
  3. Growing adoption - Liquid AI is actively developing this architecture
  4. Open weights available - LFM2.5-350M is publicly available on HuggingFace

Available Models

Model         Parameters  Link
LFM2.5-350M   350M        https://huggingface.co/LiquidAI/LFM2.5-350M
LFM2.5-1.2B   1.2B        https://huggingface.co/LiquidAI/LFM2.5-1.2B
LFM2.5-3.2B   3.2B        https://huggingface.co/LiquidAI/LFM2.5-3.2B

GGUF versions are also available: https://huggingface.co/LiquidAI/LFM2.5-350M-GGUF

Technical Details

Model config example (LFM2.5-350M):

{
  "hidden_size": 1024,
  "intermediate_size": 4608,
  "num_attention_heads": 16,
  "num_key_value_heads": 8,
  "num_hidden_layers": 16,
  "conv_layers": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],  // Conv blocks
  "attention_layers": [10, 11, 12, 13, 14, 15],  // Attention blocks
  "conv_kernel": 3,
  "rms_norm_eps": 1e-5,
  "rope_theta": 1000000.0
}
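From the config above, a couple of quantities that conversion code would need can be derived directly. A plain-Python sketch (field names follow the config; the derived variable names are mine):

```python
# Subset of the LFM2.5-350M config shown above.
config = {
    "hidden_size": 1024,
    "num_attention_heads": 16,
    "num_key_value_heads": 8,
    "num_hidden_layers": 16,
    "conv_layers": list(range(10)),
    "attention_layers": list(range(10, 16)),
}

head_dim = config["hidden_size"] // config["num_attention_heads"]           # 64
gqa_group = config["num_attention_heads"] // config["num_key_value_heads"]  # 2 query heads per KV head

# Sanity check: the two layer lists must partition all layers exactly once.
assert sorted(config["conv_layers"] + config["attention_layers"]) == \
    list(range(config["num_hidden_layers"]))
```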

Conv Block Implementation Reference:

import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvModule(nn.Module):
    def __init__(self, hidden_dim, kernel_size=3, L=3):
        super().__init__()
        self.in_proj = nn.Linear(hidden_dim, L * hidden_dim)
        self.conv = nn.Conv1d(hidden_dim, hidden_dim,
                              kernel_size=kernel_size,
                              groups=hidden_dim,  # depthwise
                              padding=kernel_size - 1)
        self.out_proj = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, x):  # x: (batch, seq, hidden)
        seq_len = x.size(1)
        proj = self.in_proj(x)
        chunks = proj.chunk(3, dim=-1)
        # Causal shift along the sequence dimension: chunk i is delayed i steps.
        shifted = [F.pad(c[:, :-i, :], (0, 0, i, 0)) if i > 0 else c
                   for i, c in enumerate(chunks)]
        x = sum(shifted)
        # Conv1d expects (batch, channels, seq); trim the trailing
        # kernel_size - 1 positions so the convolution stays causal.
        x = self.conv(x.transpose(1, 2))[..., :seq_len].transpose(1, 2)
        return self.out_proj(x)
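For intuition, the depthwise causal convolution with kernel=3 reduces, per channel, to a three-tap FIR filter over the current and two previous timesteps. A minimal plain-Python sketch (no framework; the weights are made up for illustration):

```python
def causal_depthwise_tap(x, w):
    """One channel of a causal depthwise Conv1d: y[t] = sum_j w[j] * x[t - (k-1) + j]."""
    k = len(w)
    padded = [0.0] * (k - 1) + list(x)  # left-pad so no future values are used
    return [sum(w[j] * padded[t + j] for j in range(k)) for t in range(len(x))]

# With w = [0, 0, 1] the filter is the identity; with w = [1, 0, 0] it is a 2-step delay.
```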

Current Status

Attempting to convert an LFM2 checkpoint to GGUF currently fails:

ERROR: unknown model architecture: 'lfm2'

Request

Would it be possible to add support for this architecture? I'm happy to contribute code if given some guidance on the implementation approach.

Thanks for considering this! The llama.cpp project is amazing and we'd love to see LFM2 support added.
