
fix: Add Databricks max_tokens support #2128

Draft
juanmichelini wants to merge 4 commits into main from fix/databricks-max-output-tokens-cap

Conversation


juanmichelini (Collaborator) commented Feb 19, 2026

Summary

This PR adds proper Databricks model support to the SDK by ensuring that:

  1. The max_completion_tokens parameter is converted to max_tokens (which Databricks requires)
  2. The value is capped at 25000 (Databricks API hard limit)

Problem

Databricks models were failing because:

  • The SDK was sending max_completion_tokens but Databricks expects max_tokens
  • Values exceeding 25000 would cause API rejection

Solution

Added Databricks-specific handling in chat_options.py (similar to existing Azure handling):

  • Convert max_completion_tokens to max_tokens
  • Cap at 25000 to comply with API limits

This is a minimal fix that ensures compatibility.
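A rough sketch of the handling described above (the function and parameter names here are illustrative assumptions, not the SDK's actual API in chat_options.py):

```python
# Databricks API hard limit on output tokens, per this PR
DATABRICKS_MAX_TOKENS = 25000


def apply_databricks_options(model: str, options: dict) -> dict:
    """Convert max_completion_tokens to max_tokens and cap at the Databricks limit.

    Non-Databricks models are returned unchanged. Names are hypothetical.
    """
    if not model.startswith("databricks/"):
        return options
    opts = dict(options)
    # Databricks expects max_tokens, not max_completion_tokens
    if "max_completion_tokens" in opts:
        opts["max_tokens"] = opts.pop("max_completion_tokens")
    # Values above the hard limit would be rejected by the API
    if opts.get("max_tokens") is not None and opts["max_tokens"] > DATABRICKS_MAX_TOKENS:
        opts["max_tokens"] = DATABRICKS_MAX_TOKENS
    return opts
```

For example, `apply_databricks_options("databricks/dbrx", {"max_completion_tokens": 64000})` would yield `{"max_tokens": 25000}`, while a non-Databricks model passes through untouched.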

Testing

✅ Verified with a test script in OpenHands:

  • Parameter conversion works correctly
  • Values > 25000 are properly capped
  • Existing Azure handling still works (regression test)

Related

  • fix: Cap Databricks max_tokens at 25000 and prevent condenser crash OpenHands#12925 (applied fix there, but the proper location is in the SDK)
  • This fix will be picked up by OpenHands once the SDK dependency is updated, replacing the workaround there

Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant Architectures Base Image Docs / Tags
java amd64, arm64 eclipse-temurin:17-jdk Link
python amd64, arm64 nikolaik/python-nodejs:python3.12-nodejs22 Link
golang amd64, arm64 golang:1.21-bookworm Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:9272ad3-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-9272ad3-python \
  ghcr.io/openhands/agent-server:9272ad3-python

All tags pushed for this build

ghcr.io/openhands/agent-server:9272ad3-golang-amd64
ghcr.io/openhands/agent-server:9272ad3-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:9272ad3-golang-arm64
ghcr.io/openhands/agent-server:9272ad3-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:9272ad3-java-amd64
ghcr.io/openhands/agent-server:9272ad3-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:9272ad3-java-arm64
ghcr.io/openhands/agent-server:9272ad3-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:9272ad3-python-amd64
ghcr.io/openhands/agent-server:9272ad3-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:9272ad3-python-arm64
ghcr.io/openhands/agent-server:9272ad3-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:9272ad3-golang
ghcr.io/openhands/agent-server:9272ad3-java
ghcr.io/openhands/agent-server:9272ad3-python

About Multi-Architecture Support

  • Each variant tag (e.g., 9272ad3-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 9272ad3-python-amd64) are also available if needed

juanmichelini and others added 4 commits February 19, 2026 01:09
This adds proper Databricks model support by capping max_output_tokens at 25000,
which is the API limit enforced by Databricks.

The fix is implemented in the LLM class _coerce_inputs model_validator, which runs
before model instantiation. This ensures that all Databricks models (model names
starting with 'databricks/') automatically get the correct max_output_tokens value.

If max_output_tokens is not set or exceeds 25000, it will be capped at 25000.
This prevents API rejections from Databricks while still allowing users to set
lower values if desired.

This is the proper fix location compared to the OpenHands workaround, as it
handles the limitation at the SDK level where all model-specific configurations
should be centralized.

Related PR: OpenHands/OpenHands#12925

Co-authored-by: openhands <openhands@all-hands.dev>
Databricks API requires max_tokens parameter (not max_completion_tokens)
and has a hard limit of 25000 tokens. This fix:

1. Converts max_completion_tokens to max_tokens for Databricks models
2. Caps the value at 25000 to comply with API limits

This is similar to the existing Azure handling and ensures Databricks
models work correctly with the SDK.

Related: OpenHands/OpenHands#12925
Signed-off-by: Juan Michelini <juan@juan.com.uy>
- Fix line length issue in comment
- Improve logic to match working OpenHands PR #12925
- Handle both max_tokens and max_completion_tokens correctly
- Cap at 25000 with proper precedence
Databricks API doesn't support the metadata parameter that LiteLLM can pass.
This causes BadRequestError: json: unknown field "metadata"

Added metadata removal for Databricks models, similar to how we handle
max_completion_tokens -> max_tokens conversion.
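The metadata removal from this commit could be sketched as follows (function name is an illustrative assumption):

```python
def strip_unsupported_params(model: str, options: dict) -> dict:
    """Drop parameters Databricks rejects (hypothetical helper).

    Databricks fails with BadRequestError: json: unknown field "metadata"
    when LiteLLM passes a metadata parameter through.
    """
    if model.startswith("databricks/"):
        return {k: v for k, v in options.items() if k != "metadata"}
    return options
```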