Useless compression and buggy contextWindowSize #1924

@AliakseiLasevich

Description

What happened?

I'm not sure whether the problem is in Qwen Code or in my model, but when I limit the context window, compression appears to be useless.
I run llama.cpp locally with unsloth Qwen3-Coder-Next-UD-Q4_K_XL.gguf.
The first compression went from 82k to 25k tokens, which looks normal. The second was effectively a no-op: 81651 to 81273 tokens.
Log from Qwen Code:

ℹ IMPORTANT: This conversation approached the input token limit for unsloth/Qwen3-Coder-Next. A compressed context will be sent for future messages (compressed from: 81651 to 81273 tokens).

Qwen Code then continued executing and finally failed with:

  ✕ [API Error: 400 request (100582 tokens) exceeds the available context size (100096 tokens), try increasing it]

settings.json:

{
  "$version": 3,
  "general": {
    "language": "ru"
  },
  "env": {
    "LOCAL_LLM_API_KEY": "local-llm"
  },
  "modelProviders": {
    "openai": [
      {
        "id": "unsloth/Qwen3-Coder-Next",
        "name": "unsloth/Qwen3-Coder-Next",
        "description": "Local Qwen model via OpenAI-compatible API",
        "baseUrl": "http://192.168.0.33:8001/v1",
        "envKey": "LOCAL_LLM_API_KEY",
        "generationConfig": {
          "contextWindowSize": 95000
        }
      }
    ]
  },
  "security": {
    "auth": {
      "selectedType": "openai"
    }
  },
  "model": {
    "name": "unsloth/Qwen3-Coder-Next",
    "chatCompression": {
      "contextPercentageThreshold": 0.85
    },
    "generationConfig": {
      "timeout": 1200000,
      "maxRetries": 3
    }
  },
  "tools": {
    "approvalMode": "default"
  }
}
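For reference, with these settings compression should trigger at 0.85 × 95000 ≈ 80750 tokens, which matches the ~81k trigger point in the log. The sketch below illustrates the presumed threshold arithmetic (names are mine, not Qwen Code internals) and shows how small the second compression's reduction actually was:

```python
# Illustration of the presumed compression-trigger arithmetic.
# Constant names are hypothetical, not Qwen Code internals.

CONTEXT_WINDOW_SIZE = 95000   # generationConfig.contextWindowSize
THRESHOLD = 0.85              # chatCompression.contextPercentageThreshold

trigger_point = int(CONTEXT_WINDOW_SIZE * THRESHOLD)
print(f"compression triggers above {trigger_point} tokens")  # 80750

# Second compression from the log: 81651 -> 81273 tokens
before, after = 81651, 81273
reduction = (before - after) / before
print(f"reduction: {reduction:.2%}")  # under 0.5%, effectively a no-op

# The request that finally failed exceeded the server-side context,
# which llama.cpp reported as 100096 tokens, not the configured 95000:
failed_request = 100582
print(failed_request > 100096)  # True
```

Note the discrepancy: the configured contextWindowSize is 95000, but the 400 error reports an available context of 100096 tokens, so the client-side limit and the server-side limit disagree.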
  │ Status                                                                                           │
  │                                                                                                  │
  │ Qwen Code                         0.10.5 (135b47db)                                              │
  │ Runtime                           Node.js v24.11.0 / npm 11.6.1                                  │
  │ OS                                darwin arm64 (24.5.0)                                          │
  │                                                                                                  │
  │ Auth                              openai (http://192.168.0.33:8001/v1)                           │
  │ Model                             unsloth/Qwen3-Coder-Next                                       │
  │ Session ID                        178d7455-45b3-4be0-9d2c-716be523109f                           │
  │ Sandbox                           no sandbox                                                     │
  │ Proxy                             no proxy                                                       │
  │ Memory Usage                      361.9 MB                                                       │

Labels: status/needs-triage, type/bug, type/feature-request
