@clkao clkao commented Nov 4, 2025

Add a basic plugin system that allows hooking into pre/post setup and agent runs. Closes #39.

  • Turn dbt-mcp into a plugin so it's not a special case
  • Add an example superpowers plugin and enable claude skills

You can try with ade run ... --plugins superpowers
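
For readers new to the idea, here is a minimal sketch of what a hook-based plugin could look like. It is not the code in this PR: the `Plugin` base class, the hook names (`pre_setup`, `post_setup`, `pre_agent`, `post_agent`), the `run_hook` helper, and the `context` dict are all hypothetical names chosen for illustration.

```python
class Plugin:
    """Hypothetical base class: every hook is a no-op unless overridden."""

    name = "base"

    def pre_setup(self, context: dict) -> None:
        pass

    def post_setup(self, context: dict) -> None:
        pass

    def pre_agent(self, context: dict) -> None:
        pass

    def post_agent(self, context: dict) -> None:
        pass


class SuperpowersPlugin(Plugin):
    """Illustrative plugin that only acts right before the agent runs."""

    name = "superpowers"

    def pre_agent(self, context: dict) -> None:
        # Placeholder for real work (e.g. installing skills); here we only
        # record that the hook fired.
        context.setdefault("log", []).append(f"{self.name}: pre_agent")


def run_hook(plugins: list[Plugin], hook: str, context: dict) -> None:
    """Call the named hook on each enabled plugin, in order."""
    for plugin in plugins:
        getattr(plugin, hook)(context)


context: dict = {}
run_hook([SuperpowersPlugin()], "pre_setup", context)  # no-op for this plugin
run_hook([SuperpowersPlugin()], "pre_agent", context)
print(context)
```

With this shape, dbt-mcp is just another `Plugin` subclass rather than a special case in the harness.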

Follow-ups once this is merged:

  • don't install claude again in the superpowers setup once Dockerfile optimizations #15 is resolved.
  • make superpowers hook into the prompt, since -p mode doesn't trigger skills properly.

For reference: claude conversation for this PR

clkao commented Nov 4, 2025

@bstancil can you help check --plugins dbt-mcp works as expected if you have the env handy?

clkao commented Nov 4, 2025

Quick notes about the superpowers claude skills plugin:

  1. we need a hook to decorate the prompt until claude -p loads and uses skills reliably.
  2. even when skills are loaded, claude might decide it's a trivial task and go ahead fixing things.
  3. haiku-4.5 36% -> 45% with superpowers
  4. sonnet-4.5 55% -> 53% (no significant change with superpowers)
  5. these come from a single trial and a single run, but they suggest smaller models benefit from the superpowers skills (mostly the systematic-debug skill)
  6. adding dbt-specific & data analytics skills would be the next step.
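
To put the single-run caveat in numbers, here is a rough sketch of a two-proportion z statistic for the pass rates above. The number of tasks is not stated in this thread, so `N_TASKS = 50` is a purely hypothetical placeholder; the test also treats the two runs as independent samples, which is only an approximation when both runs use the same tasks.

```python
import math


def two_proportion_z(p_before: float, p_after: float, n: int) -> float:
    """z statistic for the difference of two pass rates, each measured on n tasks."""
    pooled = (p_before + p_after) / 2  # equal sample sizes, so a plain average
    standard_error = math.sqrt(2 * pooled * (1 - pooled) / n)
    return (p_after - p_before) / standard_error


N_TASKS = 50  # ASSUMPTION: hypothetical task count, not taken from the benchmark

for model, before, after in [("haiku-4.5", 0.36, 0.45), ("sonnet-4.5", 0.55, 0.53)]:
    z = two_proportion_z(before, after, N_TASKS)
    print(f"{model}: z = {z:+.2f}")
```

An absolute z below roughly 1.96 means the gap is within noise at the usual 5% level for the assumed task count, which is why repeated trials would be needed before reading much into either number.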

here's the current patch to get claude to consider skills:

diff --git a/ade_bench/agents/installed_agents/claude_code/claude_code_agent.py b/ade_bench/agents/installed_agents/claude_code/claude_code_agent.py
index bc5d092..4c2c4b4 100644
--- a/ade_bench/agents/installed_agents/claude_code/claude_code_agent.py
+++ b/ade_bench/agents/installed_agents/claude_code/claude_code_agent.py
@@ -32,7 +32,19 @@ class ClaudeCodeAgent(AbstractInstalledAgent):

     def _run_agent_commands(self, task_prompt: str) -> list[TerminalCommand]:
         header = "echo 'AGENT RESPONSE: ' && "
-        escaped_prompt = shlex.quote(task_prompt)
+        critical_instructions = """<CRITICAL_INSTRUCTION>
+BEFORE doing ANYTHING else, you MUST:
+1. Look at your available skills list
+2. Identify if any skill matches this task
+3. If a relevant skill exists, invoke it with the Skill tool
+4. If no relevant skill exists, proceed normally
+
+This is MANDATORY. Do NOT start working until you've checked for skills.
+</CRITICAL_INSTRUCTION>
+
+Now here's the actual task:"""
+
+        escaped_prompt = shlex.quote("\n".join([critical_instructions, task_prompt]))
         command = f"{header} claude --output-format stream-json --verbose -p {escaped_prompt}"

         if self._model_name:
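
The same effect could come from a prompt-decoration hook instead of hard-coding the text in the agent. The sketch below is one possible shape, not the eventual implementation: `PromptDecorator`, `skills_decorator`, `build_command`, and the shortened `SKILLS_PREAMBLE` are hypothetical names; only the `claude` flags are copied from the patch above.

```python
import shlex
from typing import Callable

# A decorator takes the task prompt and returns a modified prompt.
PromptDecorator = Callable[[str], str]

# Shortened stand-in for the <CRITICAL_INSTRUCTION> text in the patch.
SKILLS_PREAMBLE = (
    "Before doing anything else, check your available skills.\n\n"
    "Now here's the actual task:"
)


def skills_decorator(prompt: str) -> str:
    """Prepend the skills reminder to the task prompt."""
    return "\n".join([SKILLS_PREAMBLE, prompt])


def build_command(task_prompt: str, decorators: list[PromptDecorator]) -> str:
    """Apply each plugin-supplied decorator, then shell-quote the final prompt."""
    for decorate in decorators:
        task_prompt = decorate(task_prompt)
    escaped_prompt = shlex.quote(task_prompt)
    return f"claude --output-format stream-json --verbose -p {escaped_prompt}"


print(build_command("Fix the failing dbt model", [skills_decorator]))
```

Quoting after decoration matters: the combined text is quoted once, so newlines and apostrophes in either part survive the shell.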

@clkao clkao force-pushed the feature/plugin-system branch 2 times, most recently from b4f7775 to 7036672 Compare November 6, 2025 18:03
@clkao clkao marked this pull request as ready for review November 6, 2025 19:47
clkao and others added 18 commits November 7, 2025 17:44
…owers before we have proper hooks for post-agent-install

Also have git, otherwise Claude gets pretty confused.
Remove the --use-mcp CLI flag and use_mcp parameter throughout the codebase.
The used_mcp field in trial results is now derived from whether the dbt-mcp
plugin actually ran during the trial, determined by PluginRegistry.did_plugin_run().

Changes:
- Remove --use-mcp from both CLI commands (ade run and run_harness.py)
- Remove use_mcp parameter from Harness and AbstractInstalledAgent
- Add plugins_run tracking set to PluginRegistry
- Add did_plugin_run() method to check if specific plugin executed
- Update trial result creation to derive used_mcp from plugin execution
- Add validation to error out when non-existing plugins are specified

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
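
As a rough sketch of the tracking described in that commit message: `plugins_run` and `did_plugin_run()` are the names used above, while the constructor, `validate()`, and `mark_run()` are hypothetical and may not match the real `PluginRegistry`.

```python
class PluginRegistry:
    """Illustrative registry that validates plugin names and tracks which ran."""

    def __init__(self, available: dict[str, object]) -> None:
        self._available = dict(available)
        self.plugins_run: set[str] = set()

    def validate(self, requested: list[str]) -> None:
        """Error out when a requested plugin does not exist."""
        unknown = sorted(set(requested) - set(self._available))
        if unknown:
            raise ValueError(f"Unknown plugin(s): {', '.join(unknown)}")

    def mark_run(self, name: str) -> None:
        """Record that a plugin actually executed during the trial."""
        self.plugins_run.add(name)

    def did_plugin_run(self, name: str) -> bool:
        return name in self.plugins_run


registry = PluginRegistry({"dbt-mcp": object(), "superpowers": object()})
registry.validate(["dbt-mcp"])
registry.mark_run("dbt-mcp")

# used_mcp is derived from execution, not from a CLI flag.
used_mcp = registry.did_plugin_run("dbt-mcp")
print(used_mcp)
```
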
The SetupOrchestrator created during trial execution was missing the
enabled_plugins parameter, causing plugins to never run. Store
enabled_plugins as instance variable and pass to both orchestrator
instantiations.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@clkao clkao force-pushed the feature/plugin-system branch from 7c0c2d6 to 411dc9a Compare November 8, 2025 01:45
@bstancil bstancil left a comment

Aiight, finally catching up to this. Three comments:

  1. Nitpick - for the CLI, I'd prefer the convention of spaces between params. Can we do that, so it'd be --plugins dbt-mcp superpowers?

  2. For the logging, since this is still early, I'm ok breaking the current logging function so that there's not a "used_mcp" value and instead there's a "plugins" value, and that's a list of the plugins that got used. So you could derive the dbt MCP one from that.

  3. Passing the plugins as a param in the setup orchestrator feels like the wrong pattern here. All the other details, like db, trial, etc., are handed over via the run variant and the variant config. It seems like we should set the plugins that way too, and then reference them the same way we reference the other run params.
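
A minimal sketch of points 1 and 2, assuming an argparse-style parser (the actual ade CLI may be built on a different library): `parse_args`, `build_trial_result`, and `used_mcp` are hypothetical helper names.

```python
import argparse


def parse_args(argv: list[str]) -> argparse.Namespace:
    """Accept space-separated plugin names: --plugins dbt-mcp superpowers."""
    parser = argparse.ArgumentParser(prog="ade run")
    parser.add_argument("--plugins", nargs="*", default=[])
    return parser.parse_args(argv)


def build_trial_result(plugins_run: list[str]) -> dict:
    """Log the list of plugins that ran instead of a dedicated used_mcp flag."""
    return {"plugins": sorted(plugins_run)}


def used_mcp(result: dict) -> bool:
    """Derive the old used_mcp value from the plugins list."""
    return "dbt-mcp" in result["plugins"]


args = parse_args(["--plugins", "dbt-mcp", "superpowers"])
result = build_trial_result(args.plugins)
print(result, used_mcp(result))
```
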

bstancil commented Dec 4, 2025

@clkao - We're getting closer to getting more folks involved in this, so wanted to see where you are on this and the plugin approach? I've fully migrated things over to the CLI, so this should be a bit simpler now. But there might be more people poking around after this (https://www.getdbt.com/resources/webinars/analytics-data-engineer-bench), as a heads up.

Development

Successfully merging this pull request may close these issues.

plugin system for environment prep

2 participants