Fix: Add claude-sonnet-4-6 to EXTENDED_THINKING_MODELS #2138

Merged
juanmichelini merged 1 commit into main from openhands/fix-claude-sonnet-4-6-temperature-top-p on Feb 19, 2026
@juanmichelini juanmichelini commented Feb 19, 2026

Summary

This PR fixes issue #2137 where Claude Sonnet 4.6 rejects API requests when both temperature AND top_p parameters are specified together.

Problem

Claude Sonnet 4.6 evaluations were failing with a 100% error rate because:

  1. SDK sets top_p=1.0 by default
  2. Benchmarks override with temperature=0.1
  3. Both parameters sent to Anthropic API → request rejected

Anthropic's API documentation states:

You cannot specify both temperature and top_p in the same request.
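
The three steps above can be sketched as a simple dictionary merge. This is a minimal illustration, not the SDK's actual code: `SDK_DEFAULTS`, `build_request_params`, and `violates_sampling_rule` are hypothetical names, and the merge order is an assumption based on the description in this PR.

```python
# Hypothetical sketch: names and merge logic are assumptions, not the SDK's code.
SDK_DEFAULTS = {"top_p": 1.0}  # step 1: the SDK-level default described above


def build_request_params(overrides: dict) -> dict:
    """Merge caller overrides on top of the defaults (step 2)."""
    return {**SDK_DEFAULTS, **overrides}


def violates_sampling_rule(params: dict) -> bool:
    """Return True when both temperature and top_p are set (step 3)."""
    return params.get("temperature") is not None and params.get("top_p") is not None


# The benchmark only sets temperature, but top_p rides along from the defaults.
params = build_request_params({"temperature": 0.1})
print(params)                          # {'top_p': 1.0, 'temperature': 0.1}
print(violates_sampling_rule(params))  # True
```

Neither side sets both values on purpose; the conflict only appears after the merge, which is why every benchmark request failed.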

Solution

Added claude-sonnet-4-6 to the EXTENDED_THINKING_MODELS list, following the same pattern as claude-sonnet-4-5 and claude-haiku-4-5. This ensures that when the model is used, both temperature and top_p are automatically stripped from the API call.

Additionally:

  • Added claude-sonnet-4-5 and claude-sonnet-4-6 to PROMPT_CACHE_MODELS for consistency with other Claude 4.x models that support prompt caching.
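
As a rough sketch of the pattern described above, membership in the list can gate the removal of both sampling parameters. The list entries are the three names mentioned in this PR; the substring matching rule and the helper names `supports_extended_thinking` and `strip_sampling_params` are assumptions for illustration and may differ from what model_features.py actually does.

```python
# Illustrative only: the matching rule and helper names are assumptions.
EXTENDED_THINKING_MODELS = [
    "claude-sonnet-4-5",
    "claude-haiku-4-5",
    "claude-sonnet-4-6",  # the entry added by this PR
]


def supports_extended_thinking(model: str) -> bool:
    """Substring match so provider-prefixed names also match (assumed rule)."""
    name = model.lower()
    return any(pattern in name for pattern in EXTENDED_THINKING_MODELS)


def strip_sampling_params(model: str, options: dict) -> dict:
    """Return a copy of options without temperature/top_p for listed models."""
    cleaned = dict(options)
    if supports_extended_thinking(model):
        cleaned.pop("temperature", None)
        cleaned.pop("top_p", None)
    return cleaned


opts = {"temperature": 0.1, "top_p": 1.0, "max_tokens": 1024}
print(strip_sampling_params("anthropic/claude-sonnet-4-6", opts))
# {'max_tokens': 1024}
```

Because the check is list-driven, supporting a new model is a one-line change, which matches the size of this fix.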

Changes Made

  • ✅ Added claude-sonnet-4-6 to EXTENDED_THINKING_MODELS list in model_features.py
  • ✅ Added claude-sonnet-4-5 and claude-sonnet-4-6 to PROMPT_CACHE_MODELS list
  • ✅ Added comprehensive tests for extended thinking support in test_model_features.py
  • ✅ Added regression test for temperature/top_p stripping behavior in test_chat_options.py

Testing

All tests pass successfully:

  • New test test_extended_thinking_support verifies that claude-sonnet-4-6 is correctly recognized
  • New test test_claude_sonnet_4_6_strips_temp_and_top_p verifies the fix addresses the specific issue
  • Updated test_prompt_cache_support to include claude-sonnet-4-6
  • All 154 existing tests continue to pass
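
A regression test along these lines might look like the sketch below. Only the test name `test_claude_sonnet_4_6_strips_temp_and_top_p` comes from this PR; its body, the stand-in helper `strip_for_model`, and the second test are assumptions written to be runnable on their own, not copies of test_chat_options.py.

```python
# Stand-in for the SDK's option-selection step; hypothetical, for illustration.
THINKING_MODELS = ("claude-sonnet-4-5", "claude-haiku-4-5", "claude-sonnet-4-6")


def strip_for_model(model: str, options: dict) -> dict:
    """Drop temperature and top_p when the model name matches the list."""
    if any(name in model for name in THINKING_MODELS):
        return {k: v for k, v in options.items() if k not in ("temperature", "top_p")}
    return dict(options)


def test_claude_sonnet_4_6_strips_temp_and_top_p():
    # Reproduce the failing combination: benchmark temperature plus default top_p.
    sent = strip_for_model("claude-sonnet-4-6", {"temperature": 0.1, "top_p": 1.0})
    assert "temperature" not in sent
    assert "top_p" not in sent


def test_other_models_keep_sampling_params():
    # Hypothetical companion check: unlisted models are left untouched.
    options = {"temperature": 0.1, "top_p": 1.0}
    assert strip_for_model("some-other-model", options) == options


if __name__ == "__main__":
    test_claude_sonnet_4_6_strips_temp_and_top_p()
    test_other_models_keep_sampling_params()
    print("ok")
```

Both functions follow pytest's `test_` naming convention, so they can be collected by pytest or run directly as a script.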

Closes #2137

Checklist

  • If the PR is changing/adding functionality, are there tests to reflect this?
    • Yes, added tests for extended thinking support and temperature/top_p stripping
  • If there is an example, have you run the example to make sure that it works?
    • N/A - this is a bug fix, not a new example
  • If there are instructions on how to run the code, have you followed the instructions and made sure that it works?
    • Yes, all tests pass
  • If the feature is significant enough to require documentation, is there a PR open on the OpenHands/docs repository with the same branch name?
    • N/A - this is a bug fix for model configuration, no user-facing documentation needed
  • Is the github CI passing?
    • Waiting for CI to run



Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant  Architectures  Base Image                                  Docs / Tags
java     amd64, arm64   eclipse-temurin:17-jdk                      Link
python   amd64, arm64   nikolaik/python-nodejs:python3.12-nodejs22  Link
golang   amd64, arm64   golang:1.21-bookworm                        Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:6f9fa7e-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-6f9fa7e-python \
  ghcr.io/openhands/agent-server:6f9fa7e-python

All tags pushed for this build

ghcr.io/openhands/agent-server:6f9fa7e-golang-amd64
ghcr.io/openhands/agent-server:6f9fa7e-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:6f9fa7e-golang-arm64
ghcr.io/openhands/agent-server:6f9fa7e-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:6f9fa7e-java-amd64
ghcr.io/openhands/agent-server:6f9fa7e-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:6f9fa7e-java-arm64
ghcr.io/openhands/agent-server:6f9fa7e-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:6f9fa7e-python-amd64
ghcr.io/openhands/agent-server:6f9fa7e-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:6f9fa7e-python-arm64
ghcr.io/openhands/agent-server:6f9fa7e-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:6f9fa7e-golang
ghcr.io/openhands/agent-server:6f9fa7e-java
ghcr.io/openhands/agent-server:6f9fa7e-python

About Multi-Architecture Support

  • Each variant tag (e.g., 6f9fa7e-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 6f9fa7e-python-amd64) are also available if needed

This fix resolves issue #2137 where Claude Sonnet 4.6 rejects API
requests with both temperature AND top_p parameters specified.

By adding claude-sonnet-4-6 to EXTENDED_THINKING_MODELS, the SDK
now automatically strips both temperature and top_p parameters
before sending requests to Anthropic's API, preventing the error.

Also added claude-sonnet-4-5 and claude-sonnet-4-6 to
PROMPT_CACHE_MODELS for consistency with other Claude 4.x models.

Changes:
- Added claude-sonnet-4-6 to EXTENDED_THINKING_MODELS list
- Added claude-sonnet-4-5 and claude-sonnet-4-6 to PROMPT_CACHE_MODELS
- Added comprehensive tests to verify extended thinking support
- Added regression test for temperature/top_p stripping behavior

Co-authored-by: openhands <openhands@all-hands.dev>

@all-hands-bot all-hands-bot left a comment

🟢 Good taste - Clean fix following existing patterns.

Solves a real production problem (100% eval failure) with a simple, pragmatic solution. Tests cover the actual behavior. The addition of both 4-5 and 4-6 to PROMPT_CACHE_MODELS is good housekeeping.

LGTM! 👍

@juanmichelini juanmichelini enabled auto-merge (squash) February 19, 2026 22:22
@github-actions

Coverage

Coverage Report

File                                    Stmts  Miss  Cover  Missing
openhands-sdk/openhands/sdk/llm/utils
   model_features.py                       46     1    97%  32
TOTAL                                   18314  5564    69%

@juanmichelini juanmichelini merged commit 4d0a8b4 into main Feb 19, 2026
23 checks passed
@juanmichelini juanmichelini deleted the openhands/fix-claude-sonnet-4-6-temperature-top-p branch February 19, 2026 22:23
Development

Successfully merging this pull request may close these issues.

Fix: Claude Sonnet 4.6 rejects requests with both temperature and top_p

3 participants
