
Implement Anthropic Messages API support#75

Draft
Copilot wants to merge 4 commits into main from copilot/implement-anthropic-messages-api


Copilot AI commented Dec 3, 2025

Adds an Anthropic Messages API compatibility layer, making llama-server's native Anthropic support (merged in llama.cpp #17570) accessible through model-runner.

Changes

  • New pkg/anthropic handler: Proxies requests to scheduler → llama.cpp backend

    • POST /anthropic/v1/messages - Chat completions with streaming
    • POST /anthropic/v1/messages/count_tokens - Token counting
    • Anthropic-format error responses
  • Scheduler routing: Added Anthropic routes to routeHandlers(), recognized /v1/messages paths in backendModeForRequest()

  • Request tracking: Added OriginAnthropicMessages origin constant for metrics
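A minimal sketch of two pieces from the list above: the Anthropic-format error envelope and the path check for `/v1/messages`. The names `errorBody` and `isMessagesPath` are hypothetical and are not taken from this PR; only the JSON shape (`{"type":"error","error":{...}}`) follows the public Anthropic Messages API.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// anthropicError mirrors the error envelope of the Anthropic Messages API:
// {"type":"error","error":{"type":"...","message":"..."}}.
type anthropicError struct {
	Type  string      `json:"type"`
	Error errorDetail `json:"error"`
}

type errorDetail struct {
	Type    string `json:"type"`
	Message string `json:"message"`
}

// errorBody is a hypothetical helper (not from the PR) that renders an
// Anthropic-style error body for the given error type and message.
func errorBody(errType, message string) ([]byte, error) {
	return json.Marshal(anthropicError{
		Type:  "error",
		Error: errorDetail{Type: errType, Message: message},
	})
}

// isMessagesPath is a hypothetical stand-in for the kind of check that
// backendModeForRequest() could use to recognize Anthropic message routes.
func isMessagesPath(path string) bool {
	return strings.HasSuffix(path, "/v1/messages") ||
		strings.HasSuffix(path, "/v1/messages/count_tokens")
}

func main() {
	body, err := errorBody("invalid_request_error", "model is required")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
	fmt.Println(isMessagesPath("/anthropic/v1/messages"))
}
```

The actual handler in `pkg/anthropic` may structure this differently; the sketch only shows the response shape a client would expect.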

Usage

curl http://localhost:8080/anthropic/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ai/qwen3-coder",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello"}]
  }'
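When the request body also sets `"stream": true`, the Anthropic format delivers the reply as server-sent events. The following is a rough client-side sketch, assuming the backend emits standard `content_block_delta` events with `text_delta` payloads; `collectText` and `collectTextFromString` are illustrative names, and the sample stream is hand-written rather than captured from model-runner.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// streamEvent holds only the fields this sketch needs from a stream event.
type streamEvent struct {
	Type  string `json:"type"`
	Delta struct {
		Type string `json:"type"`
		Text string `json:"text"`
	} `json:"delta"`
}

// collectText reads an SSE stream and concatenates the text of every
// content_block_delta event that carries a text_delta.
// Note: bufio.Scanner's default line limit (64 KiB) applies.
func collectText(r io.Reader) (string, error) {
	var out strings.Builder
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		line := sc.Text()
		if !strings.HasPrefix(line, "data:") {
			continue // skip "event:" lines and blank separators
		}
		payload := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
		var ev streamEvent
		if err := json.Unmarshal([]byte(payload), &ev); err != nil {
			return "", fmt.Errorf("decode event: %w", err)
		}
		if ev.Type == "content_block_delta" && ev.Delta.Type == "text_delta" {
			out.WriteString(ev.Delta.Text)
		}
	}
	if err := sc.Err(); err != nil {
		return "", err
	}
	return out.String(), nil
}

// collectTextFromString is a convenience wrapper for in-memory streams.
func collectTextFromString(s string) (string, error) {
	return collectText(strings.NewReader(s))
}

// sample is a hand-written stream used for illustration only.
const sample = `event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hel"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"lo"}}

event: message_stop
data: {"type":"message_stop"}
`

func main() {
	text, err := collectTextFromString(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(text)
}
```

In a real client, `collectText` would be given the `Body` of the HTTP response from `POST /anthropic/v1/messages`.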

The architecture mirrors the existing Ollama compatibility layer: a thin proxy that forwards requests to llama.cpp, which handles the format conversion internally.
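The thin-proxy idea can be sketched with the standard library's `httputil.ReverseProxy`. This is an illustration under assumptions, not the code in this PR: the prefix stripping in `backendPath`, the `newProxy` helper, and the in-process test servers are all hypothetical, and the real handler goes through the scheduler rather than a fixed backend URL.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
	"strings"
)

// backendPath is a hypothetical mapping from the public route to the path
// the backend serves: it drops a leading "/anthropic" segment if present.
func backendPath(path string) string {
	if trimmed := strings.TrimPrefix(path, "/anthropic"); strings.HasPrefix(trimmed, "/") {
		return trimmed
	}
	return path
}

// newProxy returns a reverse proxy that rewrites the path and forwards the
// request unchanged otherwise; no format conversion happens here.
func newProxy(backend *url.URL) *httputil.ReverseProxy {
	proxy := httputil.NewSingleHostReverseProxy(backend)
	director := proxy.Director
	proxy.Director = func(r *http.Request) {
		r.URL.Path = backendPath(r.URL.Path)
		r.URL.RawPath = ""
		director(r) // default director sets scheme and host of the target
	}
	return proxy
}

func main() {
	// Stand-in backend that echoes the path it received.
	backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, r.URL.Path)
	}))
	defer backend.Close()

	target, err := url.Parse(backend.URL)
	if err != nil {
		panic(err)
	}
	front := httptest.NewServer(newProxy(target))
	defer front.Close()

	resp, err := http.Post(front.URL+"/anthropic/v1/messages", "application/json", strings.NewReader("{}"))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body)) // path as seen by the stand-in backend
}
```

Keeping the proxy free of request or response translation is what keeps this layer thin: the backend owns the Anthropic format.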

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

  • docs.anthropic.com
    • Triggering command: /usr/bin/curl curl -s REDACTED (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Original prompt

This section details the original issue you should resolve

<issue_title>Implement and test Anthropic Messages API</issue_title>
<issue_description>Qwen 3 Coder on Docker Hub would be a good model to test this with

ggml-org/llama.cpp#17570

The max context size a MacBook Pro with 36GB of VRAM can handle is:

llama-server -hf unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF:Q4_K_M -c 65536</issue_description>

Comments on the Issue (you are @copilot in this section)



Copilot AI and others added 3 commits December 3, 2025 19:23
Co-authored-by: ericcurtin <1694275+ericcurtin@users.noreply.github.com>
Copilot AI changed the title [WIP] Implement and test Anthropic Messages API Implement Anthropic Messages API support Dec 3, 2025
Copilot AI requested a review from ericcurtin December 3, 2025 19:31

Development

Successfully merging this pull request may close these issues.

Implement and test Anthropic Messages API

2 participants