What happened?
I'm not sure whether the problem is in Qwen Code or in my model, but when I limit the context window, compression appears to be ineffective.
I use llama.cpp locally with unsloth Qwen3-Coder-Next-UD-Q4_K_XL.gguf.
The first compression went from 82k to 25k tokens, which looks normal. The second was effectively useless: 81651 -> 81273 tokens.
Log from qwen-code:
ℹ IMPORTANT: This conversation approached the input token limit for unsloth/Qwen3-Coder-Next. A compressed context will be sent for future messages (compressed from: 81651 to 81273 tokens).
Qwen Code then continued executing and ended with an error:
✕ [API Error: 400 request (100582 tokens) exceeds the available context size (100096 tokens), try increasing it]
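To illustrate why the second compression can't help, here is a rough back-of-the-envelope check (the 95000 and 0.85 come from my settings.json below, the token counts from the logs above; the trigger-point formula is my assumption about how Qwen Code applies `contextPercentageThreshold`):

```python
# Assumed trigger point: contextWindowSize * contextPercentageThreshold
context_window = 95000
threshold = 0.85
compression_trigger = int(context_window * threshold)
print(compression_trigger)  # 80750

# First compression: result is far below the trigger, plenty of headroom
after_first = 25000
print(after_first < compression_trigger)  # True

# Second compression: result is STILL above the trigger,
# so the very next tool call pushes past the server limit
after_second = 81273
print(after_second < compression_trigger)  # False

# Server-side limit from the 400 error
server_ctx = 100096
failed_request = 100582
print(failed_request > server_ctx)  # True -> llama.cpp rejects the request
```

In other words, the "compressed" history after the second pass never dropped back under the 80750-token trigger, so the session was doomed to overflow the server's 100096-token context.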
settings.json:
{
  "$version": 3,
  "general": {
    "language": "ru"
  },
  "env": {
    "LOCAL_LLM_API_KEY": "local-llm"
  },
  "modelProviders": {
    "openai": [
      {
        "id": "unsloth/Qwen3-Coder-Next",
        "name": "unsloth/Qwen3-Coder-Next",
        "description": "Local Qwen model via OpenAI-compatible API",
        "baseUrl": "http://192.168.0.33:8001/v1",
        "envKey": "LOCAL_LLM_API_KEY",
        "generationConfig": {
          "contextWindowSize": 95000
        }
      }
    ]
  },
  "security": {
    "auth": {
      "selectedType": "openai"
    }
  },
  "model": {
    "name": "unsloth/Qwen3-Coder-Next",
    "chatCompression": {
      "contextPercentageThreshold": 0.85
    },
    "generationConfig": {
      "timeout": 1200000,
      "maxRetries": 3
    }
  },
  "tools": {
    "approvalMode": "default"
  }
}
│ Status │
│ │
│ Qwen Code 0.10.5 (135b47db) │
│ Runtime Node.js v24.11.0 / npm 11.6.1 │
│ OS darwin arm64 (24.5.0) │
│ │
│ Auth openai (http://192.168.0.33:8001/v1) │
│ Model unsloth/Qwen3-Coder-Next │
│ Session ID 178d7455-45b3-4be0-9d2c-716be523109f │
│ Sandbox no sandbox │
│ Proxy no proxy │
│ Memory Usage 361.9 MB │