feat: support calling same-device LLM CLIs as manageable sub-agents / workers #215

@huangrichao2020

Scenario

I already have multiple LLM CLIs / coding agents installed on the same machine, such as Codex, Claude Code, Qwen Code, and OpenCode. In many cases, GenericAgent does not need to reimplement every capability itself. It would be more useful if GenericAgent could delegate work to these local CLIs as callable “sub-agents” or backend workers.

The difficulty is that a plain subprocess.run() call is not enough, because it blocks until the process exits and exposes nothing in between. Long-running tasks need duplex communication, status queries, cancellation, log tracking, and final-result retrieval. Otherwise, once a long task is launched from Feishu/WeChat, the user can only wait for a black-box process to finish.

Current Pain Points

  • External CLIs can only be treated as one-shot commands, not manageable tasks.
  • There is no way to continuously read JSONL/stdout/stderr progress and forward it to the frontend.
  • There is no message-level way to add instructions, stop the task, or query task status.
  • If the process crashes, there is nothing to recover from: metadata such as the task directory, pid, cwd, logs, and final answer is never persisted.
  • For ordinary developers, useful coding CLIs may already exist locally, but there is no natural way to plug them into GenericAgent.

Suggested Direction

Consider adding a generic LocalCliAgentAdapter / SubAgentRelay abstraction:

  • start(prompt, cwd, mode) -> task_id
  • status(task_id) -> running/success/error/stopped + stdout_tail + final
  • stop(task_id)
  • send(task_id, message), using duplex communication when the underlying CLI supports interactive/PTY mode, and falling back to appended prompts or new tasks otherwise
  • Each task writes a directory containing session.json, prompt.txt, stdout.jsonl, and final.md
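
To make the proposed surface concrete, here is a minimal sketch of the interface in Python. Only the method names, the four status values, and the four file names come from the list above. `TaskState`, `TaskStatus`, `task_files`, the parameter types, and the boolean return of `send` are assumptions for illustration, not an existing GenericAgent API.

```python
from __future__ import annotations

import enum
from abc import ABC, abstractmethod
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


class TaskState(str, enum.Enum):
    """The four states named in the proposal."""
    RUNNING = "running"
    SUCCESS = "success"
    ERROR = "error"
    STOPPED = "stopped"


@dataclass
class TaskStatus:
    task_id: str
    state: TaskState
    stdout_tail: str = ""        # last few lines of stdout.jsonl
    final: Optional[str] = None  # contents of final.md once it exists


class LocalCliAgentAdapter(ABC):
    """Hypothetical base class; one subclass per wrapped CLI."""

    @abstractmethod
    def start(self, prompt: str, cwd: Path, mode: str) -> str:
        """Launch the CLI and return a task_id without waiting for it."""

    @abstractmethod
    def status(self, task_id: str) -> TaskStatus:
        """Report the current state, a stdout tail, and the final answer if any."""

    @abstractmethod
    def stop(self, task_id: str) -> None:
        """Terminate the task and everything it spawned."""

    @abstractmethod
    def send(self, task_id: str, message: str) -> bool:
        """Deliver a follow-up message.

        Assumed convention: return True if the message reached a live
        interactive/PTY session, False if the adapter fell back to an
        appended prompt or a new task.
        """


def task_files(task_dir: Path) -> dict[str, Path]:
    """Map each per-task file name from the proposal to its path."""
    names = ("session.json", "prompt.txt", "stdout.jsonl", "final.md")
    return {name: task_dir / name for name in names}
```

Keeping the fallback decision inside `send` means the frontend never needs to know whether a given CLI supports interactive mode.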

A lightweight implementation approach is to use the filesystem as a task bus: when launching an external CLI, record pid/cwd/started_at/command/stdout_file/final_file, then let a watcher thread update status. The frontend only needs to poll status. This is more suitable for Feishu/WeChat-style message entry points than synchronously waiting on a subprocess.
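
A rough sketch of that filesystem task bus, using only the Python standard library. The function names, the `root` directory argument, and the `status`/`returncode` keys in session.json are assumptions. The command line is supplied by the caller because each CLI takes its prompt differently, and writing final.md is left to a per-CLI adapter.

```python
import json
import subprocess
import sys
import threading
import time
import uuid
from pathlib import Path


def _write_session(task_dir: Path, session: dict) -> None:
    # Write to a temp file and rename it, so pollers never read a partial file.
    tmp = task_dir / "session.json.tmp"
    tmp.write_text(json.dumps(session, indent=2), encoding="utf-8")
    tmp.replace(task_dir / "session.json")


def start(command: list, prompt: str, cwd: Path, root: Path) -> str:
    """Launch `command` in the background and return a task_id immediately."""
    task_id = uuid.uuid4().hex[:12]
    task_dir = root / task_id
    task_dir.mkdir(parents=True)
    (task_dir / "prompt.txt").write_text(prompt, encoding="utf-8")
    stdout_file = task_dir / "stdout.jsonl"

    out = stdout_file.open("wb")
    # stderr is merged into the same log here; a real adapter might split them.
    proc = subprocess.Popen(
        command, cwd=str(cwd), stdin=subprocess.DEVNULL,
        stdout=out, stderr=subprocess.STDOUT,
    )
    session = {
        "task_id": task_id,
        "pid": proc.pid,
        "cwd": str(cwd),
        "started_at": time.time(),
        "command": command,
        "stdout_file": str(stdout_file),
        "final_file": str(task_dir / "final.md"),
        "status": "running",
    }
    _write_session(task_dir, session)

    def watch() -> None:
        # Block in a background thread until the CLI exits, then record the outcome.
        returncode = proc.wait()
        out.close()
        session["status"] = "success" if returncode == 0 else "error"
        session["returncode"] = returncode
        _write_session(task_dir, session)

    threading.Thread(target=watch, daemon=True).start()
    return task_id


def status(task_id: str, root: Path, tail_lines: int = 5) -> dict:
    """Read everything from disk, so any process can poll a task."""
    task_dir = root / task_id
    session = json.loads((task_dir / "session.json").read_text(encoding="utf-8"))
    log = Path(session["stdout_file"]).read_text(encoding="utf-8", errors="replace")
    final_file = Path(session["final_file"])
    return {
        "status": session["status"],
        "stdout_tail": log.splitlines()[-tail_lines:],
        "final": final_file.read_text(encoding="utf-8") if final_file.exists() else None,
    }
```

Because `status` reads only files, the Feishu/WeChat handler that polls does not have to be the process that launched the task. One gap in this sketch: if the launcher itself dies, the watcher thread dies with it and session.json stays at "running", so a fuller version would also check whether the recorded pid is still alive.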

Acceptance Criteria

  • GenericAgent can start a local CLI coding task and immediately return a task_id.
  • Progress and stdout tail can be queried while the task is running.
  • The task can be stopped without leaving orphan processes.
  • The final result file can be read after completion.
  • The design can later support Codex / Claude Code / Qwen Code / OpenCode and similar CLIs.
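
For the "no orphan processes" criterion, one possible approach on POSIX systems is to launch each CLI in its own process group and signal the whole group on stop. The sketch below assumes Linux/macOS (Windows would need a different mechanism), and the function names and the 5-second grace period are illustrative choices.

```python
import os
import signal
import subprocess


def start_in_own_group(command: list) -> subprocess.Popen:
    # start_new_session=True calls setsid() in the child, so the child becomes
    # the leader of a new process group whose id equals its pid (POSIX only).
    return subprocess.Popen(command, start_new_session=True)


def stop(proc: subprocess.Popen, grace_seconds: float = 5.0) -> int:
    """Signal the task's whole process group, escalating to SIGKILL if needed."""
    if proc.poll() is not None:
        return proc.returncode  # already exited
    try:
        os.killpg(proc.pid, signal.SIGTERM)  # polite request to the whole group
    except ProcessLookupError:
        pass  # the group vanished between poll() and killpg()
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        try:
            os.killpg(proc.pid, signal.SIGKILL)  # force-kill anything left
        except ProcessLookupError:
            pass
        return proc.wait()
```

Signalling the group also reaches helper processes the CLI spawned, which a plain `proc.terminate()` would miss. Two limits remain. A grandchild that moves itself into a new session escapes the group. And if the direct child exits promptly on SIGTERM, this sketch never sends SIGKILL to grandchildren that ignore SIGTERM.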
