Draft
18 commits
81c933e
Update integration tests to use claude-sonnet-4-6
openhands-agent Feb 18, 2026
8d79409
fix: install litellm before resolving model configs in integration wo…
openhands-agent Feb 18, 2026
45ec698
fix: make litellm import lazy in resolve_model_config.py
openhands-agent Feb 18, 2026
f595d48
ci: add temporary push trigger for testing workflow changes
openhands-agent Feb 18, 2026
bcc0ab4
Fix: Run only integration tests on push trigger
openhands-agent Feb 18, 2026
bca02e8
Fix claude-sonnet-4-6 config: set top_p=None to avoid conflict
openhands-agent Feb 18, 2026
878ae4c
Add supports_top_p feature for claude-sonnet-4-6
openhands-agent Feb 18, 2026
2a3c92d
Merge branch 'main' into update-integration-test-model-to-sonnet-4-6
xingyaoww Feb 19, 2026
1162f3b
Revert "Add supports_top_p feature for claude-sonnet-4-6"
xingyaoww Feb 19, 2026
91d72a3
Revert "Fix claude-sonnet-4-6 config: set top_p=None to avoid conflict"
xingyaoww Feb 19, 2026
c52d469
Revert "Fix: Run only integration tests on push trigger"
xingyaoww Feb 19, 2026
eb6a3c1
Revert "ci: add temporary push trigger for testing workflow changes"
xingyaoww Feb 19, 2026
9adeca1
Revert "fix: make litellm import lazy in resolve_model_config.py"
xingyaoww Feb 19, 2026
ccd9d13
Revert "fix: install litellm before resolving model configs in integr…
xingyaoww Feb 19, 2026
f7a176e
add sonnet to extended thinking and prompt caching models
xingyaoww Feb 19, 2026
2f8bba5
fix: make litellm import lazy in resolve_model_config.py
openhands-agent Feb 19, 2026
7013308
ci: restore Blacksmith runners for integration tests
openhands-agent Feb 19, 2026
8c636ca
Merge branch 'main' into update-integration-test-model-to-sonnet-4-6
xingyaoww Feb 19, 2026
8 changes: 4 additions & 4 deletions .github/workflows/integration-runner.yml
@@ -22,7 +22,7 @@ on:
model_ids:
description: >-
Comma-separated model IDs to test (from resolve_model_config.py).
- Example: claude-sonnet-4-5-20250929,glm-4.7. Defaults to a standard set.
+ Example: claude-sonnet-4-6,glm-4.7. Defaults to a standard set.
required: false
default: ''
type: string
@@ -50,7 +50,7 @@ on:
env:
N_PROCESSES: 4 # Global configuration for number of parallel processes for evaluation
# Default models for scheduled/label-triggered runs (subset of models from resolve_model_config.py)
- DEFAULT_MODEL_IDS: claude-sonnet-4-5-20250929,deepseek-v3.2-reasoner,kimi-k2-thinking,gemini-3-pro
+ DEFAULT_MODEL_IDS: claude-sonnet-4-6,deepseek-v3.2-reasoner,kimi-k2-thinking,gemini-3-pro

jobs:
setup-matrix:
@@ -215,7 +215,7 @@ jobs:
(github.event_name == 'schedule' && github.repository == 'OpenHands/software-agent-sdk')
) && needs.setup-matrix.result == 'success'
needs: [setup-matrix, post-label-comment, post-dispatch-comment]
- runs-on: ubuntu-22.04
+ runs-on: blacksmith-4vcpu-ubuntu-2204
permissions:
contents: read
id-token: write
@@ -367,7 +367,7 @@ jobs:
github.event_name == 'workflow_dispatch' ||
(github.event_name == 'schedule' && github.repository == 'OpenHands/software-agent-sdk')
)
- runs-on: ubuntu-24.04
+ runs-on: blacksmith-2vcpu-ubuntu-2404
permissions:
contents: read
pull-requests: write
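
The `model_ids` input and `DEFAULT_MODEL_IDS` above are both comma-separated strings, with the default used when the input is empty. A minimal Python sketch of that selection follows. The function name `select_model_ids` is hypothetical, and this is not the workflow's actual implementation, which this diff does not show.

```python
# Default taken from the updated DEFAULT_MODEL_IDS value in the workflow.
DEFAULT_MODEL_IDS = (
    "claude-sonnet-4-6,deepseek-v3.2-reasoner,kimi-k2-thinking,gemini-3-pro"
)


def select_model_ids(model_ids_input: str, default: str = DEFAULT_MODEL_IDS) -> list[str]:
    """Hypothetical helper: split a comma-separated input, falling back to the default.

    An empty or whitespace-only input selects the default set, mirroring the
    `default: ''` declared for the `model_ids` workflow input.
    """
    raw = model_ids_input.strip() or default
    # Trim whitespace around each ID and drop empty entries such as a trailing comma.
    return [part.strip() for part in raw.split(",") if part.strip()]


if __name__ == "__main__":
    print(select_model_ids(""))
    print(select_model_ids("claude-sonnet-4-6, glm-4.7"))
```

Each resulting ID would still need to exist in `resolve_model_config.py` for the run to resolve a configuration.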
2 changes: 2 additions & 0 deletions openhands-sdk/openhands/sdk/llm/utils/model_features.py
@@ -71,6 +71,7 @@ class ModelFeatures:
# Anthropic Opus 4.5 and 4.6
"claude-opus-4-5",
"claude-opus-4-6",
+ "claude-sonnet-4-6",
# Nova 2 Lite
"nova-2-lite",
]
@@ -96,6 +97,7 @@ class ModelFeatures:
"claude-haiku-4-5",
"claude-opus-4-5",
"claude-opus-4-6",
+ "claude-sonnet-4-6",
]

# Models that support a top-level prompt_cache_retention parameter
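
The two hunks above add `"claude-sonnet-4-6"` to per-feature model lists. The sketch below illustrates why the older `claude-sonnet-4-5-20250929` ID would not pick up those features, assuming the lists are matched as substrings of the model name. That matching rule, the `model_matches` helper, and the `VISIBLE_PATTERNS` name are assumptions for illustration. The diff shows neither the list names nor the SDK's lookup code.

```python
# Entries visible in the first hunk; the real list may contain more items.
VISIBLE_PATTERNS = [
    "claude-opus-4-5",
    "claude-opus-4-6",
    "claude-sonnet-4-6",
    "nova-2-lite",
]


def model_matches(model: str, patterns: list[str]) -> bool:
    """Hypothetical lookup: True if any pattern is a substring of the model name.

    A provider prefix such as "anthropic/" is dropped before matching.
    """
    name = model.lower().split("/")[-1]
    return any(pattern in name for pattern in patterns)


if __name__ == "__main__":
    print(model_matches("anthropic/claude-sonnet-4-6", VISIBLE_PATTERNS))
    print(model_matches("claude-sonnet-4-5-20250929", VISIBLE_PATTERNS))
```

Under this assumption, a new model ID has to be listed explicitly before the feature is enabled for it, which matches the intent of commit f7a176e.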