Auto-extract context size from GGUF and Safetensors models at package time #64

Draft
Copilot wants to merge 4 commits into main from
copilot/improve-model-configuration-visibility

Conversation

Copilot AI commented Nov 12, 2025

Context size is only visible after running inference via docker model requests | jq '.[0].config'. This change extracts it directly from model files during packaging, making it immediately available in docker model ls.

Changes

GGUF models (pkg/distribution/internal/gguf/create.go)

  • Extract llama.context_length from metadata and populate Config.ContextSize
  • Reuse metadata extraction to avoid duplicate parsing

Safetensors models (pkg/distribution/internal/safetensors/create.go)

  • Parse config.json for common context size fields: max_position_embeddings, n_positions, max_length, n_ctx
  • Gracefully handle missing/invalid configs

Tests (pkg/distribution/internal/safetensors/context_test.go)

  • Coverage for all supported field names and error cases

Example

// GGUF: extract once and reuse
metadata := extractGGUFMetadata(&gguf.Header)
if contextLengthStr, ok := metadata["llama.context_length"]; ok {
    if parsed, err := strconv.ParseUint(contextLengthStr, 10, 64); err == nil {
        cfg.ContextSize = &parsed
    }
}

// Safetensors: parse config.json if present
configPath := filepath.Join(filepath.Dir(paths[0]), "config.json")
if contextSize := extractContextSizeFromConfig(configPath); contextSize > 0 {
    cfg.ContextSize = &contextSize
}

Backward compatible: models without context size metadata continue to work as before.

Original prompt

This section details the original issue to resolve.

<issue_title>Make it easy to see/verify a model configuration</issue_title>
<issue_description>Currently, unless the model has the context size embedded in the OCI artifact so it's displayed in docker model ls, you can only see the full config after running at least one inference using:

docker model requests ai/gemma3-qat | jq '.[0].config'
{
  "context-size": 1024
}

This is not friendly and should be improved.</issue_description>

Comments on the Issue (you are @copilot in this section)

@ericcurtin if it's not present parse the gguf or safetensors file for the value I guess, to be implemented.

Might make sense to solve this at model package time. If no value present make package parse the value and add it anyway.

Both techniques work for us I guess.


Copilot AI and others added 3 commits November 12, 2025 15:27
Co-authored-by: ericcurtin <1694275+ericcurtin@users.noreply.github.com>
Co-authored-by: ericcurtin <1694275+ericcurtin@users.noreply.github.com>
… fix comment

Co-authored-by: ericcurtin <1694275+ericcurtin@users.noreply.github.com>
Copilot AI changed the title [WIP] Enhance visibility of model configuration details Auto-extract context size from GGUF and Safetensors models at package time Nov 12, 2025
Copilot AI requested a review from ericcurtin November 12, 2025 15:44


Development

Successfully merging this pull request may close these issues.

Make it easy to see/verify a model configuration