Description
I pulled the code so that opencode could help me get images working with the "OpenAI Compatible" models endpoint that my IT department requires me to use. Great and terrible. I got it working, and I will submit a doc PR for the missing information outlined here.
Issue Type
Documentation
Description
The modalities property for enabling vision/multimodal support in custom OpenAI-compatible providers is not documented, making it difficult for users to configure image input support for their models.
Problem
When configuring a custom provider using @ai-sdk/openai-compatible, users cannot find documentation on how to enable vision/image input support. The configuration property exists and works, but is not mentioned in:
- The OpenCode docs at https://opencode.ai/docs/providers
- The built-in config documentation
- Custom provider examples
This leads users to believe image support is not available for custom providers, or to try incorrect properties like:
- supportsImageInput: true (doesn't work)
- capabilities: { vision: true } (doesn't work)
Error Message
Without the modalities config, users see:
ERROR: Cannot read "clipboard" (this model does not support image input). Inform the user.
Solution
The correct configuration is the modalities property:
{
  "provider": {
    "my-provider": {
      "npm": "@ai-sdk/openai-compatible",
      "options": {
        "baseURL": "https://api.example.com/v1"
      },
      "models": {
        "vision-model": {
          "name": "Vision Model",
          "modalities": {
            "input": ["text", "image"],
            "output": ["text"]
          }
        }
      }
    }
  }
}
The modalities property accepts:
- input: Array of supported input types: ["text", "audio", "image", "video", "pdf"]
- output: Array of supported output types: ["text", "audio", "image", "video", "pdf"]
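To make the shape of the modalities property concrete, here is a minimal TypeScript sketch of how a tool could read it to decide whether a model accepts image input. The `Modality` and `ModelConfig` types and the `acceptsInput` helper are hypothetical names made up for illustration; they are not OpenCode's actual source. The text-only fallback for models without a modalities entry is also an assumption, chosen because it matches the error users see when the property is missing.

```typescript
// Hypothetical types mirroring the config shape above; not OpenCode's real schema.
type Modality = "text" | "audio" | "image" | "video" | "pdf";

interface ModelConfig {
  name?: string;
  modalities?: {
    input?: Modality[];
    output?: Modality[];
  };
}

// Assumption: a model with no `modalities.input` entry is treated as text-only.
function acceptsInput(model: ModelConfig, kind: Modality): boolean {
  const input: Modality[] = model.modalities?.input ?? ["text"];
  return input.includes(kind);
}

// Same values as the "vision-model" entry in the config above.
const visionModel: ModelConfig = {
  name: "Vision Model",
  modalities: { input: ["text", "image"], output: ["text"] },
};

// A model configured without `modalities` at all.
const plainModel: ModelConfig = { name: "Plain Model" };

console.log(acceptsInput(visionModel, "image")); // true
console.log(acceptsInput(plainModel, "image")); // false
```

Under these assumptions, a model only passes the image check when "image" is listed explicitly in modalities.input, which is why ad hoc flags such as supportsImageInput have no effect.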
Evidence
This property is:
- ✅ Defined in the schema: packages/opencode/src/provider/models.ts:53-58
- ✅ Used in capability detection: packages/opencode/src/provider/provider.ts:769-773
- ✅ Tested in the test suite: packages/opencode/test/provider/provider.test.ts:1254-1257
- ✅ Working in production (verified)
But is:
- ❌ Not documented in the user-facing documentation
- ❌ Not mentioned in custom provider examples
- ❌ Not discoverable through normal config exploration
Related
- PR Feature/OpenAI compatible reasoning #5531 mentions multimodal support for @ai-sdk/openai-compatible
- I'm preparing a documentation PR to add this information
Impact
Without this documentation, users:
- Cannot discover how to enable vision features with custom providers
- Waste time trying incorrect configuration properties
- May abandon OpenCode for other tools that have clearer documentation
- Create duplicate issues asking how to enable vision support