Skip to content

fraolBatole/aligned

Repository files navigation

Aligned

Goal-alignment verification for LLM agents.

Prevents agents from drifting away from the user's original intent by formalizing goals, verifying each action against them, and tracking progress.

Why I Built This

I was migrating a large research infrastructure to a new server. The codebase was big enough that the agents I worked with would wander off constantly, exploring dozens of files that had nothing to do with what I asked, and sometimes editing files when all I wanted was an analysis of the repo.

The problem got worse when I hit my token limits and the client downgraded me to a smaller model. Smaller agents tend to over-reach. They try to be helpful by doing extra things you never asked for, touching code that should not be touched, and quietly drifting from the original task until you notice the damage.

I wanted a tool that keeps the agent aligned throughout the entire session. Something that says: here is the goal, here is what is in scope, here is what you must not do. And then checks every action against that before the agent executes it. That is what Aligned does.

How It Works

Formalize → Verify → Track

  1. The agent calls formalize_goal with the user's instruction. The server produces a structured goal spec (intent, success criteria, scope, invariants, anti-goals).
  2. Before each tool call, the agent calls verify_action with the tool name and params. The server returns a verdict: aligned, drifting, misaligned, or orthogonal.
  3. Periodically, the agent calls check_progress to see which success criteria are met and what gaps remain.

Installation

npm install aligned-mcp

Or run directly:

npx aligned-mcp

Configuration

Set these environment variables:

Variable Required Description
ALIGNED_LLM_PROVIDER Yes "anthropic" or "openai"
ANTHROPIC_API_KEY If using Anthropic Anthropic API key
OPENAI_API_KEY If using OpenAI OpenAI API key
ALIGNED_MODEL No Model name (default: claude-sonnet-4-20250514 / gpt-4o)
ALIGNED_LOG_LEVEL No "debug", "info", or "warn" (default: "info")
ALIGNED_HISTORY_LIMIT No Max actions kept in history (default: 50)

MCP Client Setup

Claude Desktop

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "aligned": {
      "command": "npx",
      "args": ["-y", "aligned-mcp"],
      "env": {
        "ALIGNED_LLM_PROVIDER": "anthropic",
        "ANTHROPIC_API_KEY": "sk-ant-..."
      }
    }
  }
}

VS Code (Copilot / Continue / etc.)

Add to your MCP config (.vscode/mcp.json or equivalent):

{
  "servers": {
    "aligned": {
      "command": "npx",
      "args": ["-y", "aligned-mcp"],
      "env": {
        "ALIGNED_LLM_PROVIDER": "openai",
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}

Cursor

Add to .cursor/mcp.json:

{
  "mcpServers": {
    "aligned": {
      "command": "npx",
      "args": ["-y", "aligned-mcp"],
      "env": {
        "ALIGNED_LLM_PROVIDER": "anthropic",
        "ANTHROPIC_API_KEY": "sk-ant-..."
      }
    }
  }
}

Tools

formalize_goal

Call this first. Translates the user's instruction into a structured goal spec.

Input:  { instruction: string, context?: string }
Output: { goal_id, intent, success_criteria, scope, invariants, anti_goals }

verify_action

Checks if a proposed tool call aligns with the active goal. Uses lightweight pattern matching for read operations and full LLM verification for writes.

Input:  { goal_id, tool_name, tool_params, rationale? }
Output: { verdict, confidence, reasoning, drift_vector, suggestion, violated }

Verdicts:

  • aligned — action serves the goal
  • drifting — action is starting to diverge; course-correct
  • misaligned — action contradicts the goal or violates invariants
  • orthogonal — action is unrelated; justify or skip

update_goal

Refines the active goal. Requires an explicit reason to prevent silent drift.

Input:  { goal_id, refinement, reason }
Output: Updated goal spec

get_goal

Returns the active goal spec (no LLM call).

Input:  { goal_id }
Output: Current goal spec

check_progress

Evaluates which success criteria are met and estimates completion.

Input:  { goal_id, completed_actions: [{ tool_name, tool_params, result_summary? }] }
Output: { criteria_status, completion_estimate, remaining_gaps }

Development

git clone https://github.com/your-org/aligned.git
cd aligned
npm install
npm run build
npm run dev    # run with tsx (hot reload)
npm test       # run tests

Architecture

src/
├── index.ts              # Entry point (stdio transport)
├── server.ts             # MCP tool registration
├── goal/
│   ├── types.ts          # Zod schemas & TypeScript types
│   ├── formalize.ts      # Goal formalization (LLM-backed)
│   ├── store.ts          # In-memory session state
│   └── update.ts         # Goal refinement
├── verify/
│   ├── verifier.ts       # Two-tier verification engine
│   ├── risk-tier.ts      # Tool classification (light vs full)
│   └── prompts.ts        # LLM prompt templates
├── progress/
│   └── tracker.ts        # Progress evaluation
├── llm/
│   ├── provider.ts       # LLM provider interface & factory
│   ├── anthropic.ts      # Anthropic Claude adapter
│   ├── openai.ts         # OpenAI adapter
│   └── schema-util.ts    # Zod → JSON Schema conversion
└── util/
    └── logger.ts         # Structured stderr logging

License

MIT

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published