Closed
Labels
bug (Something isn't working)
Description
Problem
Claude Sonnet 4.6 (and other Anthropic models) reject API requests when both temperature AND top_p are specified together.
Evidence from Production Failures
Multiple evaluation jobs failing with 100% failure rate:
- `eval-22184270397-claude-son` (failed after 7h32m, swebench)
- `eval-22184282669-claude-son` (failed after 7h32m, commit0)
- `eval-22112470055-claude-son` (error state, 21h47m, swtbench)
Root Cause
- SDK default: `top_p=1.0` (set in `openhands/sdk/llm/llm.py`)
- Benchmarks override: `temperature=0.1` (set in evaluation configs)
- Result: Both parameters sent to Anthropic API → request rejected
From Anthropic's API docs:
> You cannot specify both `temperature` and `top_p` in the same request.
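
To make the root cause concrete, here is a minimal sketch of how a default layered under an override ends up sending both parameters. The names `SDK_DEFAULTS`, `BENCHMARK_OVERRIDES`, `merge_sampling_params`, and `has_sampling_conflict` are hypothetical and only illustrate the mechanism; they are not taken from the SDK.

```python
# Hypothetical stand-ins for the SDK default and a benchmark config.
# These dictionaries are illustrative only, not actual SDK structures.
SDK_DEFAULTS = {"top_p": 1.0}
BENCHMARK_OVERRIDES = {"temperature": 0.1}


def merge_sampling_params(defaults: dict, overrides: dict) -> dict:
    """Overlay overrides on defaults and drop keys whose value is None."""
    merged = {**defaults, **overrides}
    return {key: value for key, value in merged.items() if value is not None}


def has_sampling_conflict(params: dict) -> bool:
    """Return True when both temperature and top_p would be sent."""
    return "temperature" in params and "top_p" in params


params = merge_sampling_params(SDK_DEFAULTS, BENCHMARK_OVERRIDES)
print(params)                         # both keys survive the merge
print(has_sampling_conflict(params))  # this is the combination that gets rejected
```

Because the benchmark config only sets `temperature`, nothing ever removes the default `top_p`, so the conflict arises even though no caller asked for both.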
Current Code
```python
# openhands-sdk/openhands/sdk/llm/llm.py
from pydantic import BaseModel, Field


class LLM(BaseModel):
    top_p: float | None = Field(
        default=1.0,  # ← Always sends this
        ge=0,
        le=1,
        description="Top-p (nucleus) sampling parameter...",
    )
```
Proposed Minimal Fix
Option A: Don't send top_p for Claude models when temperature is set
Add logic to skip top_p for Anthropic models when temperature is explicitly provided:
```python
# In model_features.py or similar
def should_send_top_p(model: str, temperature: float | None) -> bool:
    """Return False if model doesn't support both temperature and top_p together."""
    # Anthropic models (Claude) don't accept both parameters
    if 'claude' in model.lower() or 'anthropic' in model.lower():
        # Only send top_p when temperature was not explicitly provided
        return temperature is None
    return True
```
Option B: Set top_p default to None for Claude models
```python
def get_default_top_p(model: str) -> float | None:
    """Return None for Claude models to avoid conflicts with temperature."""
    if 'claude' in model.lower() or 'anthropic' in model.lower():
        return None
    return 1.0  # Keep existing default for other models
```
Scope
This issue is ONLY about fixing Claude Sonnet 4.6 compatibility.
- ✅ Fix Claude models specifically
- ✅ Minimal code change
- ✅ No breaking changes for other models
- ❌ NOT addressing broader default value changes (see PR #1989, "Set default temperature to None instead of 0.0", for that)
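
As a rough illustration of the scope above, the sketch below applies the Option A check at the point where request arguments are assembled and shows that only Claude requests change. `build_sampling_kwargs` is a hypothetical helper, and the model strings are illustrative; neither is taken from the SDK, and where the check actually belongs in the codebase is an assumption.

```python
from typing import Optional


def should_send_top_p(model: str, temperature: Optional[float]) -> bool:
    """Option A check: Claude/Anthropic models get top_p only without temperature."""
    name = model.lower()
    if "claude" in name or "anthropic" in name:
        return temperature is None
    return True


def build_sampling_kwargs(
    model: str, temperature: Optional[float], top_p: Optional[float]
) -> dict:
    """Hypothetical helper: collect only the sampling params that are safe to send."""
    kwargs = {}
    if temperature is not None:
        kwargs["temperature"] = temperature
    if top_p is not None and should_send_top_p(model, temperature):
        kwargs["top_p"] = top_p
    return kwargs


# Claude with an explicit temperature: top_p is dropped.
print(build_sampling_kwargs("claude-sonnet-4-6", 0.1, 1.0))
# Claude without a temperature: the default top_p is still sent.
print(build_sampling_kwargs("claude-sonnet-4-6", None, 1.0))
# Non-Claude model: behavior is unchanged, both parameters are sent.
print(build_sampling_kwargs("gpt-4o", 0.1, 1.0))
```

Filtering at request-construction time leaves the `LLM` field default untouched, which is what keeps the change from affecting other models.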
Related
- Broader fix: PR #1989, "Set default temperature to None instead of 0.0" (changes defaults for all models, has breaking change concerns)
- Related issue: #1913, "RFC: Always set temperature to default?"
Impact
Currently blocking production evaluations:
- 100% failure rate for Claude Sonnet 4.6 evaluations