feat(vision-analysis): mmx-cli primary, MCP fallback, clipboard handling (v1.4)#60

Open
divitkashyap wants to merge 8 commits into MiniMax-AI:main from divitkashyap:feat/vision-analysis-clipboard-v3


@divitkashyap divitkashyap commented Apr 4, 2026

Summary

vision-analysis skill v1.4 — mmx-cli as primary tool, MCP as fallback, plus clipboard handling and OpenCode workarounds.

Changes

mmx-cli as primary tool (v1.4)

  • mmx vision describe is now the recommended primary tool — direct REST call to MiniMax VLM, no MCP transport issues, handles URLs and local files automatically
  • Calls the /v1/coding_plan/vlm endpoint over direct HTTP, so requests never pass through OpenCode's MCP stdio transport and its bugs
  • MCP (auto-skill-loader_minimax_understand_image) is now the fallback when mmx is not installed
  • npm install -g mmx-cli — works in any host (Claude Code, OpenCode, terminal)

Clipboard fix

  • Added scripts/clipboard_image.py — extracts clipboard images to temp files for analysis
  • Supports macOS (osascript), Linux (xclip/wl-paste), Windows (PowerShell)
  • Handles clipboard screenshot paths that don't exist on disk
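
As a rough illustration of the per-platform dispatch described above, here is a minimal sketch. It is not the contents of scripts/clipboard_image.py. The function names are hypothetical, and the Windows executable name (`powershell`) and the Wayland-before-X11 ordering are assumptions. Only the Linux commands are spelled out, using the standard `wl-paste --type` and `xclip -selection clipboard -t ... -o` options.

```python
import shutil
import sys
import tempfile


def clipboard_backend(platform=sys.platform, which=shutil.which):
    """Return the first available helper program for this platform, or None."""
    if platform == "darwin":
        candidates = ["osascript"]
    elif platform.startswith("linux"):
        # Assumption: prefer the Wayland tool, then fall back to the X11 one.
        candidates = ["wl-paste", "xclip"]
    elif platform in ("win32", "cygwin"):
        candidates = ["powershell"]  # assumed executable name
    else:
        candidates = []
    for name in candidates:
        if which(name):
            return name
    return None


def linux_png_command(backend):
    """Build the argv that prints the clipboard image as PNG bytes on Linux."""
    if backend == "wl-paste":
        return ["wl-paste", "--type", "image/png"]
    if backend == "xclip":
        return ["xclip", "-selection", "clipboard", "-t", "image/png", "-o"]
    raise ValueError(f"unsupported Linux backend: {backend}")


def save_png(data):
    """Write PNG bytes to a temp file and return its path for later analysis."""
    with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as fh:
        fh.write(data)
        return fh.name
```

On Linux, the bytes would come from something like `subprocess.run(linux_png_command(backend), capture_output=True).stdout`, and the path returned by `save_png` is what gets handed to the vision tool.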

OpenCode stdio bug workaround

Metadata bug fixes

  • Removed requires_mcp field — caused plugin to filter skill from get_available_skills
  • Removed sources list — failed z.record(z.string(), z.string()) schema validation
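
To show why a list-valued field fails that schema, here is a small Python analogue of `z.record(z.string(), z.string())`. It is only an illustration of the rule (every key and every value must be a string), not the plugin's actual validation code. The `version` key and the example URL are hypothetical.

```python
def is_string_record(value):
    """True if value is a dict whose keys and values are all strings."""
    return isinstance(value, dict) and all(
        isinstance(k, str) and isinstance(v, str) for k, v in value.items()
    )


# Hypothetical metadata; only the "sources" field name comes from the PR.
with_sources = {"version": "1.4", "sources": ["https://example.com/doc"]}
without_sources = {"version": "1.4"}

print(is_string_record(with_sources))     # list value, so the check fails
print(is_string_record(without_sources))  # all values are strings
```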

Trigger tightening

  • Tightened description to only fire when actual image is explicitly shared/referenced
  • Avoids false triggers on generic text like project advice, code reviews

Tool Priority

  1. mmx vision describe (recommended): npm install -g mmx-cli, set MINIMAX_API_KEY
  2. auto-skill-loader MCP fallback: set MINIMAX_TOKEN_PLAN_KEY, enable in opencode.json
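
The priority order above could be checked programmatically along these lines. This is a sketch under assumptions, not code from the skill: `pick_vision_tool` is a hypothetical helper, the executable is assumed to be named `mmx`, and treating a missing MINIMAX_API_KEY as a reason to fall back is my reading of the list rather than documented behaviour.

```python
import os
import shutil


def pick_vision_tool(env=None, which=shutil.which):
    """Return "mmx", "mcp", or None, following the tool priority list."""
    env = os.environ if env is None else env
    # 1. Primary: mmx is on PATH and MINIMAX_API_KEY is set.
    if which("mmx") and env.get("MINIMAX_API_KEY"):
        return "mmx"
    # 2. Fallback: the auto-skill-loader MCP, which needs MINIMAX_TOKEN_PLAN_KEY.
    if env.get("MINIMAX_TOKEN_PLAN_KEY"):
        return "mcp"
    # Neither tool is usable in this environment.
    return None
```

Passing `env` and `which` as parameters keeps the check easy to exercise without touching the real environment or PATH.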

Testing

  • mmx vision describe: works in Claude Code and terminal, URLs handled natively
  • Clipboard extraction: works on macOS
  • Image analysis via auto-skill-loader proxy: works
  • OpenCode direct minimax-coding-plan-mcp: broken stdio transport (use proxy or mmx instead)
  • Skill auto-triggers on image path/URL text: confirmed

Closes

Closes #35


Submitted by: divitkashyap

…ill-loader proxy

- Document OpenCode's broken stdio transport causing 'login fail' errors
- Recommend auto-skill-loader minimax_understand_image proxy instead
- Update tool references throughout SKILL.md
- Version bump to 1.2
- Add cross-promotion for auto-skill-loader
@divitkashyap divitkashyap changed the title from "feat(vision-analysis): add clipboard image handling for paste screens…" to "feat(vision-analysis): mmx-cli primary, MCP fallback, clipboard handling (v1.4)" on Apr 10, 2026
@divitkashyap
Contributor Author

Hey @zest01998 @liyuan97 — this PR is ready for review. Key changes in v1.4: mmx-cli is now the primary vision tool (direct REST, no MCP transport issues), with the auto-skill-loader MCP as fallback. Also includes clipboard handling fixes and metadata bug fixes. Let me know if you have any questions!

@divitkashyap
Contributor Author

Hey @zest0198 @liyuan97 @AkairoDev — this PR is ready for review. Major changes since last update (v1.4):

  • mmx-cli is now the primary vision tool (direct REST, no MCP transport issues)
  • MCP is now the fallback when mmx is not installed
  • Clipboard handling, metadata fixes, and trigger tightening also included

The skill is installable via npx skills add MiniMax-AI/cli once merged. Happy to address any feedback!


Development

Successfully merging this pull request may close these issues.

The script uses a relative path, so the agent may sometimes fail to find it