[Refactor] Reduce per-model code duplication in modelconfig package #587

@pallasathena92

Summary

The pkg/hfutil/modelconfig package has 36 separate .go model files (6,187 lines) that each define a struct and implement the same HuggingFaceModel interface with near-identical methods. No external code depends on specific model types — all consumers use the interface only.

Key Findings

  • GetModelSizeBytes() and GetQuantizationType() are character-for-character identical across 30+ files
  • GetParameterCount() follows the exact same 3-phase pattern (safetensors → hardcoded lookup → estimation) in every file
  • GetContextLength() differs meaningfully in only 4 models
  • HasVision() returns false in 30+ files; IsEmbedding() returns false in 34 files
  • 9 different parameter estimation functions scattered across 6 files
  • Zero type assertions to specific model configs in production code
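The 3-phase GetParameterCount() pattern described above can be sketched roughly as follows. This is a minimal illustration, not the package's code: `paramSource`, `knownParamCounts`, `parameterCount`, and `estimate` are hypothetical names, and the estimate uses the common textbook approximation of about 12·h² parameters per dense transformer layer rather than any formula from the repository.

```go
package main

import "fmt"

// paramSource is a hypothetical stand-in for the fields a model config exposes.
// None of these names come from the modelconfig package.
type paramSource struct {
	ModelType        string
	SafetensorsTotal int64 // 0 when no safetensors metadata was found
	HiddenSize       int64
	NumLayers        int64
	VocabSize        int64
}

// knownParamCounts is an illustrative lookup table keyed by a made-up label.
var knownParamCounts = map[string]int64{
	"example-7b": 7_000_000_000,
}

// parameterCount walks the three phases in order:
// safetensors metadata, hardcoded lookup, architecture-based estimate.
func parameterCount(s paramSource) int64 {
	if s.SafetensorsTotal > 0 {
		return s.SafetensorsTotal
	}
	if n, ok := knownParamCounts[s.ModelType]; ok {
		return n
	}
	return estimate(s)
}

// estimate is a rough dense-transformer approximation:
// token embeddings plus about 12*h*h weights per layer.
func estimate(s paramSource) int64 {
	return s.VocabSize*s.HiddenSize + s.NumLayers*12*s.HiddenSize*s.HiddenSize
}

func main() {
	s := paramSource{ModelType: "unknown", HiddenSize: 4096, NumLayers: 32, VocabSize: 32000}
	fmt.Println(parameterCount(s))
}
```

If every model follows this shape, only the lookup table and the estimator vary per model, which is what makes a shared default implementation plausible.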

Bugs

  • mistral.go:144: IsEmbedding() returns true, which is wrong for an LLM
  • phi.go, llama4.go: shadow ConfigPath field already in BaseModelConfig
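Why a shadowed field is a hazard can be shown with a small sketch. It assumes a struct that embeds BaseModelConfig and redeclares ConfigPath; the types below are simplified stand-ins, and `shadowingConfig` and `Path` are hypothetical names, not code from phi.go or llama4.go.

```go
package main

import "fmt"

// BaseModelConfig is a simplified stand-in for the real base struct.
type BaseModelConfig struct {
	ConfigPath string
}

// Path is promoted to any struct embedding BaseModelConfig,
// but it only ever reads the embedded field.
func (b *BaseModelConfig) Path() string { return b.ConfigPath }

// shadowingConfig redeclares ConfigPath, hiding the embedded one.
type shadowingConfig struct {
	BaseModelConfig
	ConfigPath string
}

func main() {
	c := shadowingConfig{}
	c.ConfigPath = "/models/example/config.json" // sets the outer field only
	fmt.Printf("outer=%q embedded=%q method=%q\n",
		c.ConfigPath, c.BaseModelConfig.ConfigPath, c.Path())
}
```

Any helper defined on the base struct sees an empty path here, so removing the duplicate field is a correctness fix and not only cleanup.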

Refactoring Plan

Phase 1: Infrastructure (no behavior change)

Step 1.1 — Consolidate estimation functions in interface.go

9 estimation functions scattered across 6 files:

| Function | Location | Action |
| --- | --- | --- |
| estimateModelParams | phi3_v.go:111 | Move to interface.go as canonical EstimateModelParams() |
| estimateGenericParams | interface.go:265 | Replace with more accurate phi3_v version |
| estimateParamsFromArchitecture | llama.go:141 | Replace with call to EstimateModelParams |
| estimateTextParams | mllama.go:132 | Move to interface.go as shared helper |
| estimateVisionParams | mllama.go:138 | Deduplicate (identical to estimateTextParams) |
| estimateMoEParamCount | llama4.go:258 | Consolidate into shared EstimateMoEParams |
| estimateMoEParams | deepseek_vl.go:204 | Consolidate into shared EstimateMoEParams |
| estimateQwen3VLMoEParams | qwen3_vl.go:169 | Deduplicate with shared MoE estimator |
| estimateQwen3VLVisionParams | qwen3_vl.go:190 | Move to interface.go as shared helper |

Step 1.2 — Fix mistral.go IsEmbedding bug

Step 1.3 — Add StandardModelConfig to interface.go

New struct embedding BaseModelConfig with common transformer fields. Provides default implementations for GetParameterCount(), GetContextLength(), GetModelSizeBytes(), GetQuantizationType() with a per-model paramLookupTable.
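One way the embedding could look, as a sketch only: the interface below is a trimmed, assumed subset of HuggingFaceModel, the struct fields are guesses, and `wrapperConfig`, `overrideConfig`, and the `seq_length` field are hypothetical illustrations of a thin wrapper and a model with a custom context length.

```go
package main

import "fmt"

// HuggingFaceModel is a trimmed, assumed subset of the real interface.
type HuggingFaceModel interface {
	GetContextLength() int
	HasVision() bool
	IsEmbedding() bool
}

// BaseModelConfig is a simplified stand-in for the existing base struct.
type BaseModelConfig struct {
	ConfigPath string
}

// StandardModelConfig carries common transformer fields and default methods.
type StandardModelConfig struct {
	BaseModelConfig
	MaxPositionEmbeddings int `json:"max_position_embeddings"`
}

func (c *StandardModelConfig) GetContextLength() int { return c.MaxPositionEmbeddings }
func (c *StandardModelConfig) HasVision() bool       { return false }
func (c *StandardModelConfig) IsEmbedding() bool     { return false }

// wrapperConfig is a thin wrapper: every method is promoted from the embedded struct.
type wrapperConfig struct {
	StandardModelConfig
}

// overrideConfig overrides only the method that differs (hypothetical field).
type overrideConfig struct {
	StandardModelConfig
	SeqLength int `json:"seq_length"`
}

func (c *overrideConfig) GetContextLength() int {
	if c.SeqLength > 0 {
		return c.SeqLength
	}
	return c.StandardModelConfig.GetContextLength()
}

// Compile-time checks that both wrappers satisfy the interface.
var (
	_ HuggingFaceModel = (*wrapperConfig)(nil)
	_ HuggingFaceModel = (*overrideConfig)(nil)
)

func main() {
	models := []HuggingFaceModel{
		&wrapperConfig{StandardModelConfig{MaxPositionEmbeddings: 4096}},
		&overrideConfig{StandardModelConfig{MaxPositionEmbeddings: 4096}, 8192},
	}
	for _, m := range models {
		fmt.Println(m.GetContextLength(), m.HasVision(), m.IsEmbedding())
	}
}
```

Because consumers only use the interface, swapping per-model method bodies for promoted defaults should be invisible to callers, provided the defaults match the current behavior.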

Phase 2: Consolidate simple text models (~20 files → 1 file)

Create models_text.go with thin wrapper structs embedding StandardModelConfig. Use a generic loader factory (Go 1.25).

Models: llama, mistral, gemma, gemma2, gemma3_text, phi3, phi3small, exaone, command_r, internlm, internlm2, stablelm, xverse, minicpm, minicpm3
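A generic loader factory for these wrappers might look like the sketch below. All names (`modelPtr`, `newLoader`, `loaders`, `llamaLike`) are hypothetical, the interface is an assumed subset, and the loader takes raw JSON bytes instead of a file path to keep the example self-contained.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// HuggingFaceModel is a trimmed, assumed subset of the real interface.
type HuggingFaceModel interface {
	GetContextLength() int
}

// llamaLike is a hypothetical thin wrapper config.
type llamaLike struct {
	MaxPositionEmbeddings int `json:"max_position_embeddings"`
}

func (c *llamaLike) GetContextLength() int { return c.MaxPositionEmbeddings }

// modelPtr constrains PT to be *T and to implement HuggingFaceModel.
type modelPtr[T any] interface {
	*T
	HuggingFaceModel
}

// newLoader returns a function that unmarshals config JSON into a fresh T.
func newLoader[T any, PT modelPtr[T]]() func([]byte) (HuggingFaceModel, error) {
	return func(data []byte) (HuggingFaceModel, error) {
		cfg := PT(new(T))
		if err := json.Unmarshal(data, cfg); err != nil {
			return nil, fmt.Errorf("parse config: %w", err)
		}
		return cfg, nil
	}
}

// loaders maps a model_type string to its loader; one line per model.
var loaders = map[string]func([]byte) (HuggingFaceModel, error){
	"llama": newLoader[llamaLike](),
}

func main() {
	m, err := loaders["llama"]([]byte(`{"max_position_embeddings": 4096}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(m.GetContextLength())
}
```

The second type parameter is inferred from the first, so registering a new simple model costs one struct declaration and one map entry.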

Create models_text_special.go for models with custom GetContextLength(): Qwen family (qwen, qwen2, qwen3), Baichuan.

Phase 3: Consolidate MoE models (~5 files → 1 file)

Create models_moe.go with MoEModelConfig embedding StandardModelConfig + MoE fields. Override GetParameterCount() with MoE-aware estimation.

Models: mixtral, phimoe, qwen3_moe, gpt_oss, kimi_k2, deepseek_v3
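The override could be sketched as below. The field names and both formulas are assumptions for illustration: the estimates are rough textbook approximations (four h×h attention projections, a three-matrix gated MLP, one router matrix per layer) that ignore grouped-query attention, norms, biases, and shared experts, and they are not the estimators currently in the repository.

```go
package main

import "fmt"

// StandardModelConfig is a simplified stand-in with assumed fields.
type StandardModelConfig struct {
	HiddenSize       int64
	NumHiddenLayers  int64
	IntermediateSize int64
	VocabSize        int64
}

// GetParameterCount is a rough dense estimate: embeddings plus,
// per layer, attention projections and a gated MLP.
func (c *StandardModelConfig) GetParameterCount() int64 {
	attn := 4 * c.HiddenSize * c.HiddenSize
	mlp := 3 * c.HiddenSize * c.IntermediateSize
	return c.VocabSize*c.HiddenSize + c.NumHiddenLayers*(attn+mlp)
}

// MoEModelConfig adds MoE fields on top of the standard config.
type MoEModelConfig struct {
	StandardModelConfig
	NumLocalExperts int64
}

// GetParameterCount hides the promoted dense version and counts
// every expert's MLP plus a router matrix per layer.
func (c *MoEModelConfig) GetParameterCount() int64 {
	attn := 4 * c.HiddenSize * c.HiddenSize
	expert := 3 * c.HiddenSize * c.IntermediateSize
	router := c.HiddenSize * c.NumLocalExperts
	perLayer := attn + c.NumLocalExperts*expert + router
	return c.VocabSize*c.HiddenSize + c.NumHiddenLayers*perLayer
}

func main() {
	m := &MoEModelConfig{
		StandardModelConfig: StandardModelConfig{
			HiddenSize: 4096, NumHiddenLayers: 32, IntermediateSize: 14336, VocabSize: 32000,
		},
		NumLocalExperts: 8,
	}
	fmt.Println("moe:", m.GetParameterCount())
	fmt.Println("dense baseline:", m.StandardModelConfig.GetParameterCount())
}
```

The dense method stays reachable through the embedded struct, which is handy for comparing total against per-token active parameters if that is ever needed.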

Phase 4: Clean up vision models (keep separate, reduce duplication)

Vision models keep individual files because their nested configs are genuinely unique, with two changes:

  • Remove re-implemented methods that BaseModelConfig already provides
  • Use shared estimation helpers from Phase 1

Phase 5: Clean up standalone special models

Keep as individual files due to non-standard JSON field names:

  • chatglm.go: num_layers, ffn_hidden_size, padded_vocab_size
  • dbrx.go: d_model, n_heads, n_layers
  • bert.go: IsEmbedding() = true, BERT-specific estimation
  • phi.go: doesn't embed BaseModelConfig

Target File Structure

modelconfig/
  interface.go              # Interface, BaseModelConfig, StandardModelConfig, utilities
  safetensors.go            # Safetensors parsing (unchanged)
  diffusion.go              # Diffusion pipelines (unchanged)
  models_text.go            # ~15 simple text models (consolidated)
  models_text_special.go    # Qwen/Baichuan with custom GetContextLength
  models_moe.go             # MoE models (consolidated)
  chatglm.go                # Standalone (non-standard fields)
  dbrx.go                   # Standalone (non-standard fields)
  bert.go                   # Standalone (embedding model)
  phi.go                    # Standalone (non-standard structure)
  mllama.go                 # Vision (standalone)
  llava.go                  # Vision (standalone)
  gemma3.go                 # Vision (standalone)
  qwen2_vl.go               # Vision (standalone)
  qwen3_vl.go               # Vision (standalone)
  phi3_v.go                 # Vision (standalone)
  deepseek_vl.go            # Vision (standalone)
  llama4.go                 # Vision + MoE (standalone)

Result: 36 model files → ~18 files, significant code deduplication

🤖 Generated with Claude Code
