
feat(agent): add agent mode planner and execution tooling#73

Closed
oshorefueled wants to merge 1 commit into main from feat/agent-mode

Conversation


@oshorefueled oshorefueled commented Mar 19, 2026

Why

VectorLint could only evaluate one document at a time, which misses cross-document issues like terminology drift and corpus-level structural gaps. This change adds an agent mode that can route rules appropriately and perform cross-document analysis while preserving the existing lint workflow.

What

  • Add a new src/agent/ module with:
    • Rule routing planner (runPlanner) with explicit mode overrides (lint/agent) and LLM-based fallback classification
    • Agent executor (runAgentExecutor) using Vercel AI SDK tool-use loop
    • Result merger (mergeFindings) and shared agent finding/task schemas
    • Read-only tool suite: read_file, search_content, search_files, list_directory, and lint
  • Add CLI agent-mode wiring:
    • New --mode <mode> option
    • mode support in CLI option schema and evaluation options
    • Agent-mode branch in orchestrator (evaluateFiles) that runs planner + lint tasks + agent tasks
  • Ensure agent-mode lint execution reuses the existing lint pipeline (evaluateFile) so filtering, scoring, and formatter behavior remain consistent
  • Extend rule frontmatter metadata with optional mode: lint | agent
  • Extend output support:
    • Line output can render agent findings
    • JSON issue schema supports source: 'lint' | 'agent'
  • Add getModel() on VercelAIProvider for agent-mode model access
  • Add full agent-focused test coverage under tests/agent/* plus prompt schema mode tests
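The routing behavior described above — an explicit frontmatter mode skips LLM classification, everything else falls back to the classifier — can be sketched roughly like this. This is a simplified, hypothetical shape for illustration, not the actual `runPlanner` signature:

```typescript
// Hypothetical, simplified sketch of planner routing. Rules with an explicit
// frontmatter `mode` bypass LLM classification; the rest are classified.
type Mode = 'lint' | 'agent';

interface Rule {
  id: string;
  mode?: Mode; // optional frontmatter override
}

interface Plan {
  lintTasks: Rule[];
  agentTasks: Rule[];
}

async function planRoutes(
  rules: Rule[],
  classify: (rule: Rule) => Promise<Mode>, // stands in for the LLM fallback
): Promise<Plan> {
  const plan: Plan = { lintTasks: [], agentTasks: [] };
  for (const rule of rules) {
    // An explicit mode skips classification entirely.
    const mode = rule.mode ?? (await classify(rule));
    (mode === 'agent' ? plan.agentTasks : plan.lintTasks).push(rule);
  }
  return plan;
}
```

Batching unlabeled rules into a single classification call (as the PR's LLM-based fallback presumably does) would reduce provider round-trips, but the per-rule form keeps the sketch minimal.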

Scope

In scope

  • Agent mode scaffolding and orchestration in CLI
  • Planner, executor, merger, and tooling modules
  • Schema and output updates needed for agent findings
  • Automated test coverage for new modules

Out of scope

  • Artifact publishing
  • Bugsy review trigger integration changes
  • Service/platform webhook wiring beyond CLI-local behavior

Behavior impact

  • Default behavior remains unchanged (lint mode path).
  • Passing --mode agent activates planner-driven hybrid evaluation:
    • Rules may be routed to lint tasks or agent tasks.
    • Lint-path reporting/filters/scoring remain aligned with existing behavior.
    • Agent tasks can use read-only repo tools for cross-document evidence gathering.
  • Rules can now force routing via frontmatter mode.
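The merged output described above tags each finding with its origin via `source: 'lint' | 'agent'`. A minimal sketch of that merge step might look like this; the `Finding` shape here is an assumption for illustration, not the project's actual types:

```typescript
// Hedged sketch of combining lint and agent results into one tagged list,
// matching the `source: 'lint' | 'agent'` field in the JSON issue schema.
type Source = 'lint' | 'agent';

interface Finding {
  ruleId: string;
  message: string;
}

interface MergedFinding extends Finding {
  source: Source;
}

function mergeFindings(lint: Finding[], agent: Finding[]): MergedFinding[] {
  return [
    // Tag each finding with where it came from so formatters can dispatch.
    ...lint.map((f) => ({ ...f, source: 'lint' as const })),
    ...agent.map((f) => ({ ...f, source: 'agent' as const })),
  ];
}
```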

Risk

  • Agent mode remains a new execution path and may need further iteration on result UX for mixed findings.
  • Agent-mode smoke command requires standard VectorLint config files present in repo root.

How to test / verify

Checks run

  • npm run test:run (pass)
  • npm run lint (pass)
  • npm run build (pass)

Manual verification

  1. Run lint mode as before and confirm existing output behavior is unchanged.
  2. Run vectorlint --mode agent <target-files> with a configured environment.
  3. Confirm:
  • Rules with mode: agent skip LLM planning and run as agent tasks.
  • Rules with mode: lint skip LLM planning and run as lint tasks.
  • Rules without mode are classified by the planner.
  4. Verify agent findings appear in line output and include source: "agent" in JSON output.

Notes:

  • A local smoke attempt using npm run dev -- --mode agent ./presets/VectorLint/ai-pattern.md failed in this workspace due to missing .vectorlint.ini/VECTORLINT.md, which is expected without repo-local config bootstrap.

Rollback

  • Revert the commit on feat/agent-mode to restore the lint-only execution path.
  • No data migrations or persistent state changes are introduced.

- add agent module foundations: finding/task types, planner routing, executor loop, merger, and tool suite

- wire --mode through CLI and orchestrator, including agent-mode branch and reporter/json output support

- add comprehensive tests for agent modules and prompt mode schema updates

coderabbitai bot commented Mar 19, 2026

📝 Walkthrough

Walkthrough

This PR introduces a complete agent evaluation system that routes rules to lint or agent execution modes, provides file system tools for agent code analysis, and integrates agent findings into the orchestrator's pipeline with merged output combining both evaluation sources.

Changes

  • Agent Core System — src/agent/agent-executor.ts, src/agent/planner.ts, src/agent/types.ts, src/agent/merger.ts, src/agent/index.ts
    Introduces the agent executor with LLM-based evaluation, a task planner for routing rules, unified type schemas for findings/task plans, and a finding merger combining lint and agent results.
  • Agent Tools — src/agent/tools/read-file.ts, src/agent/tools/search-content.ts, src/agent/tools/search-files.ts, src/agent/tools/list-directory.ts, src/agent/tools/lint-tool.ts, src/agent/tools/path-utils.ts, src/agent/tools/index.ts
    Implements six file system and lint tools with path traversal guards, pagination support, and dual-mode search (ripgrep vs JS fallback).
  • CLI Integration — src/cli/commands.ts, src/cli/orchestrator.ts, src/cli/types.ts
    Adds the --mode CLI option, routes execution to runAgentMode() when agent mode is selected, constructs agent tools, and outputs merged findings.
  • Schema & Provider Updates — src/schemas/cli-schemas.ts, src/schemas/prompt-schemas.ts, src/providers/vercel-ai-provider.ts
    Extends CLI and prompt meta schemas with an optional mode field; adds a getModel() accessor to VercelAIProvider.
  • Output Formatting — src/output/json-formatter.ts, src/output/reporter.ts
    Adds an optional source field to the Issue interface and a new printAgentFinding() function for agent finding output.
  • Test Suite — tests/agent/agent-executor.test.ts, tests/agent/planner.test.ts, tests/agent/tools/*.test.ts, tests/agent/types.test.ts, tests/prompt-schema.test.ts
    Comprehensive test coverage for the agent executor, planner, all tools, type schemas, and mode field validation.

Sequence Diagram(s)

sequenceDiagram
    participant CLI as CLI / Commands
    participant Orch as Orchestrator
    participant Planner as Task Planner
    participant LLM as Language Model
    participant Lint as Lint Evaluator
    participant Agent as Agent Executor
    participant Tools as File System Tools
    participant Merger as Finding Merger
    participant Output as Output Formatter

    CLI->>Orch: evaluateFiles(targets, {mode: 'agent'})
    Orch->>Planner: runPlanner(rules, targetFiles, provider)
    Planner->>LLM: runPromptStructured(classify rules)
    LLM-->>Planner: {ruleId, classification}
    Planner-->>Orch: {lintTasks, agentTasks}
    
    rect rgba(100, 150, 200, 0.5)
    Note over Orch,Lint: Lint Path
    par
        loop for each lintTask
            Orch->>Lint: evaluate(file, rule)
            Lint-->>Orch: violations, score
        end
    and
        loop for each agentTask
            Orch->>Agent: runAgentExecutor(rule, tools)
            Agent->>LLM: generateText(prompt, tools)
            LLM->>Tools: call read_file / search_content / etc
            Tools-->>LLM: file content / search results
            LLM-->>Agent: findings
            Agent-->>Orch: AgentRunResult
        end
    end
    end
    
    Orch->>Merger: mergeFindings(lintResults, agentResults)
    Merger-->>Orch: MergedFinding[]
    Orch->>Output: format & print merged findings
    Output-->>CLI: output (console + JSON)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~65 minutes

Suggested reviewers

  • ayo6706

Poem

🐰 A rabbit hops through agent lands,
With tools and plans held in paws' hands,
Where files speak truth and patterns shine,
Agent and lint modes now align,
Cross-document wisdom takes its stand!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage — ⚠️ Warning: docstring coverage is 11.54%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
  • Title check — ✅ Passed: the PR title 'feat(agent): add agent mode planner and execution tooling' is a concise, clear summary that accurately reflects the main changes: introduction of agent mode with planner and executor tools.
  • Description check — ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.




@coderabbitai coderabbitai bot left a comment
Actionable comments posted: 11

🧹 Nitpick comments (13)
tests/agent/list-directory.test.ts (1)

39-46: Remove redundant mkdirSync call.

Line 40 creates subdir again, but it's already created in beforeEach (line 9). While harmless due to recursive: true, it's unnecessary.

♻️ Suggested fix
   it('lists a specific subdirectory', async () => {
-    mkdirSync(path.join(TMP, 'subdir'), { recursive: true });
     writeFileSync(path.join(TMP, 'subdir', 'nested.md'), '');
     const tool = createListDirectoryTool(TMP);
     const result = await tool.execute({ path: 'subdir' });
     expect(result).toContain('nested.md');
     expect(result).not.toContain('file.md');
   });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/agent/list-directory.test.ts` around lines 39 - 46, Remove the
redundant mkdirSync call in the test: the subdirectory is already created in
beforeEach so delete the mkdirSync(path.join(TMP, 'subdir'), { recursive: true
}) line in the 'lists a specific subdirectory' test where
createListDirectoryTool(TMP) and tool.execute({ path: 'subdir' }) are used,
leaving only the writeFileSync for nested.md and the assertions.
src/agent/tools/path-utils.ts (1)

16-23: Consider case-insensitive comparison on Windows.

On case-insensitive filesystems (Windows, some macOS configurations), paths like /Repo/file and /repo/file refer to the same location but would fail the startsWith check. This could allow bypasses or cause false rejections depending on how paths are constructed.

For a read-only agent tool suite, this is low risk but worth noting for future hardening.

♻️ Optional: Normalize case on Windows
 export function isWithinRoot(absolutePath: string, root: string): boolean {
   const normalizedPath = path.resolve(absolutePath);
   const normalizedRoot = path.resolve(root);
+  // On case-insensitive systems, normalize case for comparison
+  const comparePath = process.platform === 'win32' ? normalizedPath.toLowerCase() : normalizedPath;
+  const compareRoot = process.platform === 'win32' ? normalizedRoot.toLowerCase() : normalizedRoot;
   return (
-    normalizedPath.startsWith(normalizedRoot + path.sep) ||
-    normalizedPath === normalizedRoot
+    comparePath.startsWith(compareRoot + path.sep) ||
+    comparePath === compareRoot
   );
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/tools/path-utils.ts` around lines 16 - 23, The isWithinRoot
function uses startsWith which is case-sensitive and can misclassify equivalent
paths on case-insensitive systems; update isWithinRoot to perform a
case-insensitive comparison on Windows (process.platform === 'win32') by
normalizing the resolved paths (normalizedPath and normalizedRoot) to a common
case (e.g., .toLowerCase()) before doing the startsWith or equality check while
preserving the existing use of path.resolve and path.sep so behavior on POSIX
systems remains unchanged.
tests/agent/read-file.test.ts (1)

41-47: Consider adding edge case tests for offset validation.

To complement the offset/limit pagination test, consider adding tests for:

  • offset: 0 (invalid per 1-indexed semantics)
  • Offset beyond file length (should throw per implementation)

This would ensure the validation behavior is locked in.

♻️ Suggested additional tests
   it('returns actionable notice when file is truncated', async () => {
     // ...
   });
+
+  it('throws for offset beyond end of file', async () => {
+    writeFileSync(path.join(TMP, 'short.md'), 'Line 1\nLine 2');
+    const tool = createReadFileTool(TMP);
+    await expect(tool.execute({ path: 'short.md', offset: 100 })).rejects.toThrow('beyond end of file');
+  });
 });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/agent/read-file.test.ts` around lines 41 - 47, Add two edge-case tests
for createReadFileTool's execute method: one calling tool.execute({ path:
'big.md', offset: 0 }) asserting it returns/contains an error or actionable
message for invalid 1-indexed offset (offset:0 disallowed), and another calling
tool.execute with offset larger than the file's total lines (e.g., offset > 250)
asserting it throws or returns the expected error per the implementation;
reference createReadFileTool and the tool.execute invocation to locate where to
add these tests and mirror the existing 'big.md' setup in the current test.
tests/agent/path-utils.test.ts (1)

15-19: Strengthen the ~ expansion assertion.

The test only verifies that ~ is absent and file.md is present, but doesn't confirm the path actually resolves to the home directory. This could miss bugs where ~ is stripped incorrectly.

♻️ Suggested stronger assertion
+import * as os from 'node:os';
+
 describe('resolveToCwd', () => {
   // ...
   it('expands ~ to home directory', () => {
     const result = resolveToCwd('~/file.md', '/repo');
-    expect(result).toContain('file.md');
-    expect(result).not.toContain('~');
+    expect(result).toBe(`${os.homedir()}/file.md`);
   });
 });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/agent/path-utils.test.ts` around lines 15 - 19, The test for
resolveToCwd should assert that '~' is expanded to the actual user home
directory rather than merely removed; update the test using
resolveToCwd('~/file.md', '/repo') to compare against the expected home path
(e.g., path.join(os.homedir(), 'file.md')) or assert the result
startsWith(os.homedir()), referencing the resolveToCwd function and using
os.homedir() (and path.join) to build the precise expected value so the
assertion guarantees correct expansion.
tests/agent/search-files.test.ts (1)

20-46: Add test for path traversal rejection.

Other tool tests (list-directory, read-file) include tests verifying that paths outside the root throw errors. For consistency and to ensure the security boundary is tested, consider adding a similar test here.

♻️ Suggested additional test
   it('returns no files found message when no matches', async () => {
     const tool = createSearchFilesTool(TMP);
     const result = await tool.execute({ pattern: '**/*.xyz' });
     expect(result).toContain('No files found');
   });
+
+  it('throws for paths outside root', async () => {
+    const tool = createSearchFilesTool(TMP);
+    await expect(tool.execute({ pattern: '*.md', path: '../outside' })).rejects.toThrow();
+  });
 });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/agent/search-files.test.ts` around lines 20 - 46, Add a test to verify
createSearchFilesTool rejects path traversal by calling
createSearchFilesTool(TMP).execute with a path that tries to escape the root
(e.g., '../outside' or '../../') and assert that it throws or returns an error
indicating invalid/out-of-bounds path; target the same test suite and use the
existing pattern of other tests (e.g., list-directory/read-file) to check for
thrown exceptions or error messages when path traversal is attempted,
referencing createSearchFilesTool and tool.execute to locate the code under
test.
src/agent/tools/list-directory.ts (1)

18-19: Remove unnecessary await Promise.resolve().

This appears to exist solely to satisfy the async signature. Since all I/O is synchronous, either remove async from the signature and change the return type, or keep it for consistency with other tools (which may need async operations in the future).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/tools/list-directory.ts` around lines 18 - 19, The unnecessary
await Promise.resolve() inside the execute method should be removed: if you want
this tool to remain asynchronous for future I/O, keep the async execute({ path:
dirPath, limit }) signature and simply delete the await Promise.resolve() line;
alternatively, if you prefer a synchronous implementation, remove the async
modifier from execute and adjust its return type so it returns the directory
listing value directly. Ensure you update the execute function (the execute
method in this module) and any callers/typings that expect a Promise if you
change the signature.
src/agent/tools/read-file.ts (1)

9-14: Remove unused parameters property from interface.

The parameters property with hardcoded defaults (line 21) isn't used anywhere. The execute method already defines its parameter types. This appears to be dead code or a leftover from a different design.

♻️ Suggested interface cleanup
 export interface ReadFileTool {
   name: 'read_file';
   description: string;
-  parameters: {
-    path: string;
-    offset?: number;
-    limit?: number;
-  };
   execute(params: { path: string; offset?: number; limit?: number }): Promise<string>;
 }

And remove line 21:

-    parameters: { path: '', offset: undefined, limit: undefined },
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/tools/read-file.ts` around lines 9 - 14, The interface declares an
unused "parameters" property alongside the "execute" method; remove the dead
"parameters" field from the interface so only the execute(params: { path:
string; offset?: number; limit?: number }): Promise<string> signature remains,
and also delete any related hardcoded defaults that referenced "parameters"
(search for the "parameters" identifier and the object with path/offset/limit
defaults) to keep the interface and implementation consistent.
src/agent/tools/search-files.ts (1)

17-30: Consider streaming or early termination for large result sets.

The current implementation collects all matches before applying the limit. For patterns like **/* in large repositories, this could consume significant memory before truncation occurs.

fast-glob doesn't support early termination natively, so this is acceptable for the initial implementation, but worth noting for future optimization if agent mode is used on large codebases.
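The early-termination idea above can be sketched with a generic helper over an async iterable; in search-files.ts the source would be fast-glob's streaming API, but a plain async generator stands in here so the sketch is self-contained. `take` and `fakeGlobStream` are hypothetical names, not existing code:

```typescript
// Pull at most `limit` items from an async iterable, then stop iterating —
// unconsumed entries are never materialized.
async function take<T>(source: AsyncIterable<T>, limit: number): Promise<T[]> {
  const out: T[] = [];
  for await (const item of source) {
    out.push(item);
    if (out.length >= limit) break; // stop pulling once the limit is hit
  }
  return out;
}

// Stand-in for a glob stream over a very large repository. In search-files.ts
// this would be fast-glob's stream consumed the same way.
async function* fakeGlobStream(): AsyncGenerator<string> {
  for (let i = 0; i < 1_000_000; i++) yield `file-${i}.md`;
}
```

Breaking out of a `for await` loop closes the underlying stream, so truncation happens without buffering the full match set.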

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/tools/search-files.ts` around lines 17 - 30, The current execute
method collects all matches via fg(...) before applying effectiveLimit which can
OOM on large repos; change the fg call in execute to use fast-glob's streaming
API (fg.stream or similar) with the same options (cwd: searchRoot, ignore,
onlyFiles, followSymbolicLinks) and consume the async iterator, pushing results
into the matches array until you reach effectiveLimit then break to stop further
iteration; keep resolveToCwd, isWithinRoot and DEFAULT_LIMIT logic unchanged and
ensure the final return uses the truncated matches array.
src/agent/tools/search-content.ts (1)

22-29: Consider caching hasRipgrep() result.

hasRipgrep() spawns a subprocess on every execute() call. Since ripgrep availability doesn't change during runtime, memoize the result.

♻️ Proposed caching
+let ripgrepAvailable: boolean | null = null;
+
 function hasRipgrep(): boolean {
+  if (ripgrepAvailable !== null) return ripgrepAvailable;
   try {
     const result = spawnSync('rg', ['--version'], { encoding: 'utf-8' });
-    return result.status === 0;
+    ripgrepAvailable = result.status === 0;
+    return ripgrepAvailable;
   } catch {
+    ripgrepAvailable = false;
     return false;
   }
 }

Also applies to: 157-161

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/tools/search-content.ts` around lines 22 - 29, hasRipgrep()
currently spawns a subprocess on every call; memoize its result by adding a
module-level cache variable (e.g., let cachedHasRipgrep: boolean | undefined)
and change hasRipgrep() to return the cached value if defined, otherwise compute
once via spawnSync, store the result in cachedHasRipgrep and return it; update
any callers (notably the execute-related code paths that call hasRipgrep()) to
continue calling hasRipgrep() so they get the memoized result.
src/cli/commands.ts (1)

59-59: Consider specifying valid mode values in help text.

The help text describes what "agent" does but doesn't list valid values. Users might not know "lint" is also an option.

📝 Improved help text
-    .option('--mode <mode>', 'Evaluation mode: "agent" enables cross-document evaluation')
+    .option('--mode <mode>', 'Evaluation mode: "lint" (default) or "agent" (cross-document)')
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cli/commands.ts` at line 59, Update the CLI option declaration for
.option('--mode <mode>') to include the valid mode values in the help text
(e.g., "agent" and "lint") so users see allowed choices; locate the
.option('--mode <mode>', 'Evaluation mode: "agent" enables cross-document
evaluation') call in commands.ts and change the description string to enumerate
the valid modes and briefly what each does (for example: 'Evaluation mode
(choices: "agent" | "lint"): "agent" enables cross-document evaluation, "lint"
runs single-document lint checks').
src/agent/agent-executor.ts (1)

183-190: Silent error handling loses observability.

Both parse failures (line 183-184) and execution errors (line 188-189) silently return empty findings without logging. This makes debugging difficult when the agent fails to produce expected results.

Consider logging errors at warning level for observability:

♻️ Suggested improvement
     if (!parsed.success) {
+      console.warn(`[vectorlint] Agent output parse failed for rule ${rule.meta.id}: ${parsed.error.message}`);
       return { findings: [], ruleId: rule.meta.id };
     }

     return { findings: parsed.data.findings, ruleId: rule.meta.id };
-  } catch {
+  } catch (error: unknown) {
+    const message = error instanceof Error ? error.message : String(error);
+    console.warn(`[vectorlint] Agent execution failed for rule ${rule.meta.id}: ${message}`);
     return { findings: [], ruleId: rule.meta.id };
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/agent-executor.ts` around lines 183 - 190, The code silently
returns empty findings on parse failures and exceptions (the parsed check and
the catch block), losing observability; update the parsed.failure branch to log
a warning including parsed error/details and rule.meta.id (e.g.,
processLogger.warn or logger.warn) and update the catch to catch (err) and log a
warning with the thrown error and rule.meta.id before returning { findings: [],
ruleId: rule.meta.id } so both parse failures and runtime exceptions are visible
for debugging.
src/cli/orchestrator.ts (1)

1145-1185: Consider parallelizing lint task execution.

Lint tasks execute sequentially (nested for-loops at lines 1145-1146), while agent tasks run in parallel (line 1215). For consistency and performance with many files, consider using the existing runWithConcurrency helper:

♻️ Optional refactor sketch
const lintWorkItems = plan.lintTasks.flatMap(task => 
  task.targetFiles.map(file => ({ task, file }))
);

await runWithConcurrency(lintWorkItems, options.concurrency, async ({ task, file }) => {
  // existing evaluation logic
});
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cli/orchestrator.ts` around lines 1145 - 1185, The current nested loops
over plan.lintTasks and task.targetFiles run evaluations sequentially; change
this to parallelized processing using the existing runWithConcurrency helper by
transforming tasks into work items (e.g., { task, file }) and calling
runWithConcurrency(workItems, options.concurrency, async ({ task, file }) => {
... }). Inside the worker use the same logic that creates an evaluator via
createEvaluator(Type.BASE, provider, task.rule, options.searchProvider), calls
evaluator.evaluate(relativeFile, content), updates lintFindings, token counters
(totalInputTokens/totalOutputTokens), counts violations with countViolations and
updates totalErrors/totalWarnings/hadSeverityErrors, and preserves the existing
error handling behavior (increment requestFailures and set hadOperationalErrors)
for each item.
src/agent/tools/lint-tool.ts (1)

61-72: Consider clarifying score semantics for check results.

The function returns score: 0 for all check (non-judge) results, while judge results return the actual final_score. This asymmetry may confuse tool consumers expecting consistent scoring semantics.

Consider either:

  1. Using calculateCheckScore() to compute an equivalent score for check results
  2. Returning null or -1 to indicate "not applicable" for check mode
  3. Adding a comment explaining the intentional difference
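Option 1 might look roughly like the following sketch. The penalty weights and 0–100 scale are assumptions for illustration only, not the project's actual calculateCheckScore formula:

```typescript
// Hypothetical check-score derivation so check and judge results share
// comparable scoring semantics. Weights here are illustrative assumptions.
interface Violation {
  severity: 'error' | 'warning';
}

function calculateCheckScore(violations: Violation[]): number {
  // Start from a perfect 100 and subtract a weighted penalty per violation.
  const penalty = violations.reduce(
    (sum, v) => sum + (v.severity === 'error' ? 10 : 3),
    0,
  );
  return Math.max(0, 100 - penalty);
}
```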
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/tools/lint-tool.ts` around lines 61 - 72, The current return always
sets score: 0 for check-mode lint results (the block building { score: 0,
violationCount, violations }), which is inconsistent with judge results that use
final_score; update this to compute a meaningful check score by calling
calculateCheckScore(violations) and return that value in place of 0 (or if you
prefer explicit non-applicability return null or -1), and ensure the returned
object fields (score, violationCount, violations) reflect that change and any
callers expecting number handle the new sentinel if you choose null/-1.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/agent/agent-executor.ts`:
- Around line 160-171: The request currently uses the deprecated v5 field
experimental_output and a masking type assertion; replace the
experimental_output block with the v6 stable output option by setting output:
Output.object({ schema: AGENT_OUTPUT_SCHEMA }) on the request (and ensure you
remove the trailing "as unknown as Parameters<typeof generateText>[0]" type
assertion on the request creation), avoid mixing output with tools
incompatibilities, and update the response handling to read the structured
result from result.object instead of the previous fallback pattern; reference
generateText, AGENT_OUTPUT_SCHEMA, and result.object when making these changes.

In `@src/agent/tools/list-directory.ts`:
- Around line 27-29: The error message thrown in the existence check uses the
possibly-undefined input dirPath which can produce "undefined" in the message;
update the throw in the block that checks existsSync(absolutePath) to reference
the resolved absolutePath (or a fallback like `${dirPath ?? absolutePath}`) so
the error shows a meaningful path; locate the check around existsSync and modify
the throw in the list-directory tool (e.g., the function handling dirPath /
absolutePath) to include the resolved path instead of raw dirPath.
- Around line 52-54: The limit-reached check is using entries.length which
counts directory items read, not the number of successfully processed results;
change the condition to use results.length (the array of successfully
read/parsed entries returned by the try/catch) when deciding whether to append
the "[limit reached]" message so the message only appears if the actual returned
results count equals the effectiveLimit; update the check that currently
references entries.length to reference results.length while keeping the same
message and use of effectiveLimit and output variables.

In `@src/agent/tools/read-file.ts`:
- Around line 41-45: Validate that the provided offset (when not undefined/null)
is a positive integer and reject invalid values before computing startIndex:
check that offset is a number, Number.isInteger(offset) and offset >= 1; if the
check fails throw a descriptive Error mentioning the invalid offset. After
validation compute startIndex using zero-based logic (startIndex = offset ?
offset - 1 : 0) and keep the existing bounds check against totalLines.
Reference: offset, startIndex, totalLines.

In `@src/agent/tools/search-content.ts`:
- Around line 101-102: The code constructs a RegExp directly from user-supplied
pattern (pattern -> new RegExp) which can cause ReDoS; update the module (the
code that creates const regex) to first validate the pattern with a safety check
(e.g., use the safe-regex library or equivalent validateSafeRegex(pattern)) and
only call new RegExp(pattern, opts.ignoreCase ? 'i' : '') if it passes; if it
fails validation (or if you deliberately want literal matching), escape the
pattern with an escapeRegExp utility and then build the regex from the escaped
string, and ensure you reference the same opts.ignoreCase when constructing the
fallback regex; optionally wrap expensive regex execution in a timeout/abort
mechanism where regex is applied.

In `@src/cli/orchestrator.ts`:
- Around line 1215-1235: The catch inside the concurrent Promise.all mutates
shared variables requestFailures and hadOperationalErrors causing a race;
instead, stop mutating those globals inside the task, have each mapped async
task return an object that includes a failure flag (e.g. return { findings: [],
ruleId: task.rule.meta.id, failed: true }) when runAgentExecutor throws, and
after Promise.all(agentResults) aggregate failures and set hadOperationalErrors
based on agentResults.some(r => r.failed) and compute requestFailures =
agentResults.filter(r => r.failed).length; update any downstream code to ignore
the temporary failed flag if not needed.
- Around line 1247-1268: The output loops over merged findings currently filter
for source === 'agent' which drops lint findings; update the loops that iterate
over merged (the variable from mergeFindings()) to handle both sources instead
of filtering: remove the `if (finding.source === 'agent')` guard and dispatch on
finding.source—call printAgentFinding/addAgentFindingToJsonFormatter for 'agent'
findings and call the corresponding lint handlers (e.g.,
printLintFinding/addLintFindingToJsonFormatter) for 'lint' findings, with a safe
default branch for unknown sources; ensure the JSON branch adds both types to
the same formatter before calling formatter.toJson().

In `@src/output/reporter.ts`:
- Around line 231-234: The current loop over finding.references uses a truthy
check for ref.startLine which incorrectly hides valid numeric values like 0 and
ignores endLine; update the logic inside the for (const ref of
finding.references) loop (in reporter.ts) to use explicit undefined checks
(e.g., ref.startLine !== undefined) and when startLine is present include
endLine if it is !== undefined to format "file:start-end" (or "file:start" when
no endLine) into the location variable before logging.
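
The explicit-undefined formatting can be isolated into a small helper (a sketch only; the real reporter inlines this logic):

```typescript
// Reference shape assumed from the review; fields beyond these are omitted.
interface Ref {
  file: string;
  startLine?: number;
  endLine?: number;
}

// Explicit undefined checks keep numeric 0 visible and preserve endLine.
function formatLocation(ref: Ref): string {
  if (ref.startLine === undefined) return ref.file;
  return ref.endLine === undefined
    ? `${ref.file}:${ref.startLine}`
    : `${ref.file}:${ref.startLine}-${ref.endLine}`;
}
```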

In `@src/providers/vercel-ai-provider.ts`:
- Around line 29-31: The code adds getModel() only on VercelAIProvider but
callers use an unsafe cast in getModelForAgentMode(), violating the LLMProvider
contract; fix by updating types instead of casting—either add an optional
getModel?: () => LanguageModel to the LLMProvider interface, or define a new
AgentCapableProvider extends LLMProvider with getModel(): LanguageModel and
narrow the provider type in getModelForAgentMode() to AgentCapableProvider (or
explicitly to VercelAIProvider) before calling getModel; update references to
LLMProvider, VercelAIProvider, getModel, and getModelForAgentMode accordingly so
the compiler enforces presence of getModel for agent mode.

In `@tests/agent/agent-executor.test.ts`:
- Around line 61-62: The test uses an unnecessary cast "{} as never" which
bypasses TypeScript checks; replace the cast with a properly typed empty object
by declaring the variable with the Partial type (e.g., change the tools usage to
const tools: Partial<AgentTools> = {};), and likewise remove other occurrences
like provider: {} as never and replace with const provider:
Partial<ProviderType> = {} (use the actual provider type used in that test) so
the tests rely on correct typing instead of never casts.

In `@tests/agent/search-content.test.ts`:
- Around line 43-48: The test's glob pattern only matches root-level files, so
update the test that calls createSearchContentTool(...) and tool.execute({
pattern: 'API key', glob: '*.md' }) to use a recursive glob like '**/*.md' (or
alternatively create the test files at repo root) so the .md files under docs/
are actually matched; ensure you change the glob argument passed to tool.execute
to '**/*.md' to verify that skip.ts is excluded by the glob filter rather than
by no matches.

---

Nitpick comments:
In `@src/agent/agent-executor.ts`:
- Around line 183-190: The code silently returns empty findings on parse
failures and exceptions (the parsed check and the catch block), losing
observability; update the parsed.failure branch to log a warning including
parsed error/details and rule.meta.id (e.g., processLogger.warn or logger.warn)
and update the catch to catch (err) and log a warning with the thrown error and
rule.meta.id before returning { findings: [], ruleId: rule.meta.id } so both
parse failures and runtime exceptions are visible for debugging.

In `@src/agent/tools/lint-tool.ts`:
- Around line 61-72: The current return always sets score: 0 for check-mode lint
results (the block building { score: 0, violationCount, violations }), which is
inconsistent with judge results that use final_score; update this to compute a
meaningful check score by calling calculateCheckScore(violations) and return
that value in place of 0 (or if you prefer explicit non-applicability return
null or -1), and ensure the returned object fields (score, violationCount,
violations) reflect that change and any callers expecting number handle the new
sentinel if you choose null/-1.

In `@src/agent/tools/list-directory.ts`:
- Around line 18-19: The unnecessary await Promise.resolve() inside the execute
method should be removed: if you want this tool to remain asynchronous for
future I/O, keep the async execute({ path: dirPath, limit }) signature and
simply delete the await Promise.resolve() line; alternatively, if you prefer a
synchronous implementation, remove the async modifier from execute and adjust
its return type so it returns the directory listing value directly. Ensure you
update the execute function (the execute method in this module) and any
callers/typings that expect a Promise if you change the signature.

In `@src/agent/tools/path-utils.ts`:
- Around line 16-23: The isWithinRoot function uses startsWith which is
case-sensitive and can misclassify equivalent paths on case-insensitive systems;
update isWithinRoot to perform a case-insensitive comparison on Windows
(process.platform === 'win32') by normalizing the resolved paths (normalizedPath
and normalizedRoot) to a common case (e.g., .toLowerCase()) before doing the
startsWith or equality check while preserving the existing use of path.resolve
and path.sep so behavior on POSIX systems remains unchanged.
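
A sketch of the platform-aware containment check, assuming the function takes a candidate path and a root (the real implementation lives in `src/agent/tools/path-utils.ts`):

```typescript
import * as path from 'node:path';

// Case-insensitive comparison only on Windows; POSIX behavior unchanged.
function isWithinRoot(candidate: string, root: string): boolean {
  let resolved = path.resolve(candidate);
  let resolvedRoot = path.resolve(root);
  if (process.platform === 'win32') {
    // NTFS is case-insensitive by default; normalize before comparing.
    resolved = resolved.toLowerCase();
    resolvedRoot = resolvedRoot.toLowerCase();
  }
  // Append path.sep so "/repo" does not match a sibling like "/repository".
  return resolved === resolvedRoot || resolved.startsWith(resolvedRoot + path.sep);
}
```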

In `@src/agent/tools/read-file.ts`:
- Around line 9-14: The interface declares an unused "parameters" property
alongside the "execute" method; remove the dead "parameters" field from the
interface so only the execute(params: { path: string; offset?: number; limit?:
number }): Promise<string> signature remains, and also delete any related
hardcoded defaults that referenced "parameters" (search for the "parameters"
identifier and the object with path/offset/limit defaults) to keep the interface
and implementation consistent.

In `@src/agent/tools/search-content.ts`:
- Around line 22-29: hasRipgrep() currently spawns a subprocess on every call;
memoize its result by adding a module-level cache variable (e.g., let
cachedHasRipgrep: boolean | undefined) and change hasRipgrep() to return the
cached value if defined, otherwise compute once via spawnSync, store the result
in cachedHasRipgrep and return it; update any callers (notably the
execute-related code paths that call hasRipgrep()) to continue calling
hasRipgrep() so they get the memoized result.
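
The memoization is only a few lines; this sketch mirrors the prompt, and the `rg --version` probe is an assumption about how availability is detected:

```typescript
import { spawnSync } from 'node:child_process';

// Module-level cache so the subprocess is spawned at most once.
let cachedHasRipgrep: boolean | undefined;

function hasRipgrep(): boolean {
  if (cachedHasRipgrep === undefined) {
    // Probe once; all later calls reuse the cached result.
    const result = spawnSync('rg', ['--version'], { stdio: 'ignore' });
    cachedHasRipgrep = result.status === 0;
  }
  return cachedHasRipgrep;
}
```

Callers keep calling `hasRipgrep()` unchanged; only the cost profile changes.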

In `@src/agent/tools/search-files.ts`:
- Around line 17-30: The current execute method collects all matches via fg(...)
before applying effectiveLimit which can OOM on large repos; change the fg call
in execute to use fast-glob's streaming API (fg.stream or similar) with the same
options (cwd: searchRoot, ignore, onlyFiles, followSymbolicLinks) and consume
the async iterator, pushing results into the matches array until you reach
effectiveLimit then break to stop further iteration; keep resolveToCwd,
isWithinRoot and DEFAULT_LIMIT logic unchanged and ensure the final return uses
the truncated matches array.
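
The bounded-consumption pattern, shown here with a generic async iterable in place of fast-glob's `fg.stream()` so the sketch stays self-contained:

```typescript
// Consume a stream of matches and stop as soon as the limit is reached,
// instead of materializing every match before truncating.
async function collectUpTo<T>(source: AsyncIterable<T>, limit: number): Promise<T[]> {
  const matches: T[] = [];
  for await (const item of source) {
    matches.push(item);
    if (matches.length >= limit) break; // stop pulling further entries
  }
  return matches;
}
```

With fast-glob this would look like `await collectUpTo(fg.stream(glob, options) as AsyncIterable<string>, effectiveLimit)`, since `fg.stream()` returns an async-iterable Readable stream.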

In `@src/cli/commands.ts`:
- Line 59: Update the CLI option declaration for .option('--mode <mode>') to
include the valid mode values in the help text (e.g., "agent" and "lint") so
users see allowed choices; locate the .option('--mode <mode>', 'Evaluation mode:
"agent" enables cross-document evaluation') call in commands.ts and change the
description string to enumerate the valid modes and briefly what each does (for
example: 'Evaluation mode (choices: "agent" | "lint"): "agent" enables
cross-document evaluation, "lint" runs single-document lint checks').

In `@src/cli/orchestrator.ts`:
- Around line 1145-1185: The current nested loops over plan.lintTasks and
task.targetFiles run evaluations sequentially; change this to parallelized
processing using the existing runWithConcurrency helper by transforming tasks
into work items (e.g., { task, file }) and calling runWithConcurrency(workItems,
options.concurrency, async ({ task, file }) => { ... }). Inside the worker use
the same logic that creates an evaluator via createEvaluator(Type.BASE,
provider, task.rule, options.searchProvider), calls
evaluator.evaluate(relativeFile, content), updates lintFindings, token counters
(totalInputTokens/totalOutputTokens), counts violations with countViolations and
updates totalErrors/totalWarnings/hadSeverityErrors, and preserves the existing
error handling behavior (increment requestFailures and set hadOperationalErrors)
for each item.

In `@tests/agent/list-directory.test.ts`:
- Around line 39-46: Remove the redundant mkdirSync call in the test: the
subdirectory is already created in beforeEach so delete the
mkdirSync(path.join(TMP, 'subdir'), { recursive: true }) line in the 'lists a
specific subdirectory' test where createListDirectoryTool(TMP) and
tool.execute({ path: 'subdir' }) are used, leaving only the writeFileSync for
nested.md and the assertions.

In `@tests/agent/path-utils.test.ts`:
- Around line 15-19: The test for resolveToCwd should assert that '~' is
expanded to the actual user home directory rather than merely removed; update
the test using resolveToCwd('~/file.md', '/repo') to compare against the
expected home path (e.g., path.join(os.homedir(), 'file.md')) or assert the
result startsWith(os.homedir()), referencing the resolveToCwd function and using
os.homedir() (and path.join) to build the precise expected value so the
assertion guarantees correct expansion.

In `@tests/agent/read-file.test.ts`:
- Around line 41-47: Add two edge-case tests for createReadFileTool's execute
method: one calling tool.execute({ path: 'big.md', offset: 0 }) asserting it
returns/contains an error or actionable message for invalid 1-indexed offset
(offset:0 disallowed), and another calling tool.execute with offset larger than
the file's total lines (e.g., offset > 250) asserting it throws or returns the
expected error per the implementation; reference createReadFileTool and the
tool.execute invocation to locate where to add these tests and mirror the
existing 'big.md' setup in the current test.

In `@tests/agent/search-files.test.ts`:
- Around line 20-46: Add a test to verify createSearchFilesTool rejects path
traversal by calling createSearchFilesTool(TMP).execute with a path that tries
to escape the root (e.g., '../outside' or '../../') and assert that it throws or
returns an error indicating invalid/out-of-bounds path; target the same test
suite and use the existing pattern of other tests (e.g.,
list-directory/read-file) to check for thrown exceptions or error messages when
path traversal is attempted, referencing createSearchFilesTool and tool.execute
to locate the code under test.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e277c068-ead5-44b2-89ae-3ef6e9881b58

📥 Commits

Reviewing files that changed from the base of the PR and between 27097c3 and 6ab2178.

📒 Files selected for processing (30)
  • src/agent/agent-executor.ts
  • src/agent/index.ts
  • src/agent/merger.ts
  • src/agent/planner.ts
  • src/agent/tools/index.ts
  • src/agent/tools/lint-tool.ts
  • src/agent/tools/list-directory.ts
  • src/agent/tools/path-utils.ts
  • src/agent/tools/read-file.ts
  • src/agent/tools/search-content.ts
  • src/agent/tools/search-files.ts
  • src/agent/types.ts
  • src/cli/commands.ts
  • src/cli/orchestrator.ts
  • src/cli/types.ts
  • src/output/json-formatter.ts
  • src/output/reporter.ts
  • src/providers/vercel-ai-provider.ts
  • src/schemas/cli-schemas.ts
  • src/schemas/prompt-schemas.ts
  • tests/agent/agent-executor.test.ts
  • tests/agent/list-directory.test.ts
  • tests/agent/merger.test.ts
  • tests/agent/path-utils.test.ts
  • tests/agent/planner.test.ts
  • tests/agent/read-file.test.ts
  • tests/agent/search-content.test.ts
  • tests/agent/search-files.test.ts
  • tests/agent/types.test.ts
  • tests/prompt-schema.test.ts

Comment on lines +160 to +171
try {
const request = {
model,
system: systemPrompt,
prompt: `Evaluate the documentation according to the rule "${rule.meta.name}".`,
tools: sdkTools,
maxSteps: 25,
abortSignal: signal,
experimental_output: {
schema: AGENT_OUTPUT_SCHEMA,
},
} as unknown as Parameters<typeof generateText>[0];

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Vercel AI SDK generateText experimental_output option TypeScript

💡 Result:

The experimental_output option in Vercel AI SDK's generateText (and streamText) is an experimental feature available in v5 for generating structured outputs. It was replaced by the stable output option in v6+. In v5, pass a structured schema (e.g., Zod) to experimental_output:

import { generateText } from 'ai';
import { z } from 'zod';
import { openai } from '@ai-sdk/openai';

const { experimental_output } = await generateText({
  model: openai('gpt-4o'),
  experimental_output: { schema: z.object({ name: z.string(), age: z.number() }) },
  prompt: 'Generate a person object.',
});

The result includes result.experimental_output with the parsed, validated object. Note: combining experimental_output with tools can fail for some providers (e.g., Groq, Google), as they don't support JSON mode + tools simultaneously. In current versions (v6+), use output: Output.object({ schema }) instead for stable structured generation, which supports Zod/JSON schemas with tool calling. Official v5 docs confirm result.experimental_output in the generateText result object.


🏁 Script executed:

# First, let's check the actual file and context around lines 160-171
sed -n '150,180p' src/agent/agent-executor.ts

Repository: TRocket-Labs/vectorlint

Length of output: 956


🏁 Script executed:

# Check the SDK version being used
cat package.json | grep -A 5 '"ai"'

Repository: TRocket-Labs/vectorlint

Length of output: 319


🏁 Script executed:

# Look for imports and understand what SDK is being used
head -50 src/agent/agent-executor.ts

Repository: TRocket-Labs/vectorlint

Length of output: 1738


🏁 Script executed:

# Check for any v6 migration patterns or comments in the codebase
rg -i "output.*schema|structured.*output" src/ --type ts -A 2 -B 2

Repository: TRocket-Labs/vectorlint

Length of output: 3026


🏁 Script executed:

# Check if there are other similar type assertions or experimental patterns
rg "as unknown as" src/ --type ts -B 2 -A 2

Repository: TRocket-Labs/vectorlint

Length of output: 2626


🏁 Script executed:

# Look at generateText return type handling to understand full context
sed -n '160,200p' src/agent/agent-executor.ts

Repository: TRocket-Labs/vectorlint

Length of output: 926


Migrate from deprecated v5 structured output API to v6 stable output option.

The code uses the v5 API pattern (experimental_output) in v6 SDK (v6.0.99). This feature was replaced in v6 by the stable output option. Additionally, combining experimental_output with tools is a known incompatibility that fails for some providers (e.g., Groq, Google).

Update the request object to use v6 API:

experimental_output: {
  schema: AGENT_OUTPUT_SCHEMA,
}

to:

output: Output.object({
  schema: AGENT_OUTPUT_SCHEMA,
})

Then remove the as unknown as Parameters<typeof generateText>[0] type assertion, which masks the API mismatch. Update the response handling to access result.object instead of the fallback pattern.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/agent-executor.ts` around lines 160 - 171, The request currently
uses the deprecated v5 field experimental_output and a masking type assertion;
replace the experimental_output block with the v6 stable output option by
setting output: Output.object({ schema: AGENT_OUTPUT_SCHEMA }) on the request
(and ensure you remove the trailing "as unknown as Parameters<typeof
generateText>[0]" type assertion on the request creation), avoid mixing output
with tools incompatibilities, and update the response handling to read the
structured result from result.object instead of the previous fallback pattern;
reference generateText, AGENT_OUTPUT_SCHEMA, and result.object when making these
changes.

Comment on lines +27 to +29
if (!existsSync(absolutePath)) {
throw new Error(`Directory not found: ${dirPath}`);
}

⚠️ Potential issue | 🟡 Minor

Handle undefined dirPath in error message.

When dirPath is omitted (defaults to .), the error message would display "Directory not found: undefined". Consider using the resolved path or a fallback.

🐛 Proposed fix
       if (!existsSync(absolutePath)) {
-        throw new Error(`Directory not found: ${dirPath}`);
+        throw new Error(`Directory not found: ${dirPath ?? '.'}`);
       }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
       if (!existsSync(absolutePath)) {
-        throw new Error(`Directory not found: ${dirPath}`);
+        throw new Error(`Directory not found: ${dirPath ?? '.'}`);
       }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/tools/list-directory.ts` around lines 27 - 29, The error message
thrown in the existence check uses the possibly-undefined input dirPath which
can produce "undefined" in the message; update the throw in the block that
checks existsSync(absolutePath) to reference the resolved absolutePath (or a
fallback like `${dirPath ?? absolutePath}`) so the error shows a meaningful
path; locate the check around existsSync and modify the throw in the
list-directory tool (e.g., the function handling dirPath / absolutePath) to
include the resolved path instead of raw dirPath.

Comment on lines +52 to +54
if (entries.length > effectiveLimit) {
return `${output}\n\n[${effectiveLimit} entries limit reached. Use limit=${effectiveLimit * 2} for more.]`;
}

⚠️ Potential issue | 🟡 Minor

Limit comparison uses wrong count.

The condition compares entries.length > effectiveLimit, but results.length may be less than effectiveLimit if some entries were unreadable (caught in the try/catch at line 43). This could show a misleading "limit reached" message when fewer results were actually returned.

🐛 Proposed fix
-      if (entries.length > effectiveLimit) {
+      if (results.length >= effectiveLimit && entries.length > results.length) {
         return `${output}\n\n[${effectiveLimit} entries limit reached. Use limit=${effectiveLimit * 2} for more.]`;
       }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/tools/list-directory.ts` around lines 52 - 54, The limit-reached
check is using entries.length which counts directory items read, not the number
of successfully processed results; change the condition to use results.length
(the array of successfully read/parsed entries returned by the try/catch) when
deciding whether to append the "[limit reached]" message so the message only
appears if the actual returned results count equals the effectiveLimit; update
the check that currently references entries.length to reference results.length
while keeping the same message and use of effectiveLimit and output variables.

Comment on lines +41 to +45
const startIndex = offset ? Math.max(0, offset - 1) : 0;

if (startIndex >= totalLines) {
throw new Error(`Offset ${offset} is beyond end of file (${totalLines} lines total)`);
}

⚠️ Potential issue | 🟡 Minor

Validate offset is a positive integer when provided.

The description states offset is "1-indexed", but invalid values are silently coerced:

  • offset: 0 → treated as falsy, starts from line 1 (unexpected for user who thought it's 0-indexed)
  • offset: -5 → Math.max(0, -6) = 0, silently ignored
  • offset: 1.5 → would work but semantics are unclear

Since the Zod schema in agent-executor.ts doesn't constrain the value, defensive validation here would prevent silent misbehavior.

🛡️ Proposed defensive validation
+      if (offset !== undefined && (offset < 1 || !Number.isInteger(offset))) {
+        throw new Error(`Invalid offset: ${offset}. Must be a positive integer (1-indexed).`);
+      }
+
       const startIndex = offset ? Math.max(0, offset - 1) : 0;
📝 Committable suggestion


Suggested change
+      if (offset !== undefined && (offset < 1 || !Number.isInteger(offset))) {
+        throw new Error(`Invalid offset: ${offset}. Must be a positive integer (1-indexed).`);
+      }
+
       const startIndex = offset ? Math.max(0, offset - 1) : 0;

       if (startIndex >= totalLines) {
         throw new Error(`Offset ${offset} is beyond end of file (${totalLines} lines total)`);
       }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/tools/read-file.ts` around lines 41 - 45, Validate that the
provided offset (when not undefined/null) is a positive integer and reject
invalid values before computing startIndex: check that offset is a number,
Number.isInteger(offset) and offset >= 1; if the check fails throw a descriptive
Error mentioning the invalid offset. After validation compute startIndex using
zero-based logic (startIndex = offset ? offset - 1 : 0) and keep the existing
bounds check against totalLines. Reference: offset, startIndex, totalLines.

Comment on lines +101 to +102

const regex = new RegExp(pattern, opts.ignoreCase ? 'i' : '');

⚠️ Potential issue | 🟠 Major

ReDoS vulnerability: user-supplied pattern used directly in RegExp constructor.

The pattern parameter comes from external input (agent tool call) and is passed directly to new RegExp(). A maliciously crafted regex pattern (e.g., (a+)+$ with specific input) can cause catastrophic backtracking, hanging the process.

Consider validating regex complexity or using a timeout wrapper. Alternatively, treat pattern as a literal string and escape special characters when ripgrep is unavailable.

🛡️ Proposed fix: escape pattern for literal matching
+function escapeRegex(str: string): string {
+  return str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
+}
+
 function searchWithJs(
   pattern: string,
   searchRoot: string,
   opts: { glob?: string; ignoreCase?: boolean; limit?: number }
 ): string {
   const glob = opts.glob ?? '**/*.md';
   const files = fg.sync(glob, {
     cwd: searchRoot,
     ignore: ['**/node_modules/**', '**/.git/**'],
     absolute: true,
   });

-  const regex = new RegExp(pattern, opts.ignoreCase ? 'i' : '');
+  // Escape pattern to prevent ReDoS; treat as literal search
+  const regex = new RegExp(escapeRegex(pattern), opts.ignoreCase ? 'i' : '');

Alternatively, if regex support is intentional, consider using a library like safe-regex to validate pattern safety before construction.

📝 Committable suggestion


Suggested change
+function escapeRegex(str: string): string {
+  return str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
+}
+
 function searchWithJs(
   pattern: string,
   searchRoot: string,
   opts: { glob?: string; ignoreCase?: boolean; limit?: number }
 ): string {
   const glob = opts.glob ?? '**/*.md';
   const files = fg.sync(glob, {
     cwd: searchRoot,
     ignore: ['**/node_modules/**', '**/.git/**'],
     absolute: true,
   });

-  const regex = new RegExp(pattern, opts.ignoreCase ? 'i' : '');
+  // Escape pattern to prevent ReDoS; treat as literal search
+  const regex = new RegExp(escapeRegex(pattern), opts.ignoreCase ? 'i' : '');
🧰 Tools
🪛 ast-grep (0.41.1)

[warning] 101-101: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(pattern, opts.ignoreCase ? 'i' : '')
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html

(regexp-from-variable)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/tools/search-content.ts` around lines 101 - 102, The code
constructs a RegExp directly from user-supplied pattern (pattern -> new RegExp)
which can cause ReDoS; update the module (the code that creates const regex) to
first validate the pattern with a safety check (e.g., use the safe-regex library
or equivalent validateSafeRegex(pattern)) and only call new RegExp(pattern,
opts.ignoreCase ? 'i' : '') if it passes; if it fails validation (or if you
deliberately want literal matching), escape the pattern with an escapeRegExp
utility and then build the regex from the escaped string, and ensure you
reference the same opts.ignoreCase when constructing the fallback regex;
optionally wrap expensive regex execution in a timeout/abort mechanism where
regex is applied.

Comment on lines +1247 to +1268
if (outputFormat === OutputFormat.Line) {
for (const finding of merged) {
if (finding.source === 'agent') {
printAgentFinding(finding.finding);
}
}
} else if (outputFormat === OutputFormat.Json || outputFormat === OutputFormat.RdJson) {
const formatter =
outputFormat === OutputFormat.RdJson
? new RdJsonFormatter()
: new JsonFormatter();

for (const finding of merged) {
if (finding.source === 'agent') {
addAgentFindingToJsonFormatter(formatter, finding.finding);
}
}

console.log(formatter.toJson());
} else if (outputFormat === OutputFormat.ValeJson) {
console.log(JSON.stringify({}));
}

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if merger.ts preserves lint findings and how they should be output
ast-grep --pattern 'export function mergeFindings($$$) {
  $$$
}'

Repository: TRocket-Labs/vectorlint

Length of output: 49


🏁 Script executed:

rg "mergeFindings" --type ts

Repository: TRocket-Labs/vectorlint

Length of output: 836


🏁 Script executed:

rg "lintFindings" -A 3 -B 3 --type ts src/cli/orchestrator.ts

Repository: TRocket-Labs/vectorlint

Length of output: 707


🏁 Script executed:

rg "source === 'agent'" --type ts

Repository: TRocket-Labs/vectorlint

Length of output: 194


🏁 Script executed:

cat -n src/agent/merger.ts

Repository: TRocket-Labs/vectorlint

Length of output: 918


🏁 Script executed:

cat -n tests/agent/merger.test.ts

Repository: TRocket-Labs/vectorlint

Length of output: 2349


Lint findings are discarded from output despite being intentionally merged.

The mergeFindings() function explicitly includes lint findings with source: 'lint' in the merged array (see src/agent/merger.ts), as confirmed by tests that verify both lint and agent findings are combined. However, the output loops at lines 1248–1252 (line format) and 1259–1263 (JSON format) filter to only source === 'agent', discarding all lint findings from output.

Remove the source === 'agent' filter to include both lint and agent findings in the output.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cli/orchestrator.ts` around lines 1247 - 1268, The output loops over
merged findings currently filter for source === 'agent' which drops lint
findings; update the loops that iterate over merged (the variable from
mergeFindings()) to handle both sources instead of filtering: remove the `if
(finding.source === 'agent')` guard and dispatch on finding.source—call
printAgentFinding/addAgentFindingToJsonFormatter for 'agent' findings and call
the corresponding lint handlers (e.g.,
printLintFinding/addLintFindingToJsonFormatter) for 'lint' findings, with a safe
default branch for unknown sources; ensure the JSON branch adds both types to
the same formatter before calling formatter.toJson().

Comment on lines +231 to +234
if (finding.references && finding.references.length > 0) {
for (const ref of finding.references) {
const location = ref.startLine ? `${ref.file}:${ref.startLine}` : ref.file;
console.log(` -> ${location}`);

⚠️ Potential issue | 🟡 Minor

Use explicit undefined checks when formatting reference line numbers.

On Line 233, truthiness can hide valid numeric values (e.g., 0). Also, endLine can be preserved when present.

Suggested fix
-    for (const ref of finding.references) {
-      const location = ref.startLine ? `${ref.file}:${ref.startLine}` : ref.file;
-      console.log(`    -> ${location}`);
-    }
+    for (const ref of finding.references) {
+      const location =
+        ref.startLine !== undefined
+          ? ref.endLine !== undefined
+            ? `${ref.file}:${ref.startLine}-${ref.endLine}`
+            : `${ref.file}:${ref.startLine}`
+          : ref.file;
+      console.log(`    -> ${location}`);
+    }
📝 Committable suggestion


Suggested change
     if (finding.references && finding.references.length > 0) {
       for (const ref of finding.references) {
-        const location = ref.startLine ? `${ref.file}:${ref.startLine}` : ref.file;
+        const location =
+          ref.startLine !== undefined
+            ? ref.endLine !== undefined
+              ? `${ref.file}:${ref.startLine}-${ref.endLine}`
+              : `${ref.file}:${ref.startLine}`
+            : ref.file;
         console.log(`    -> ${location}`);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/output/reporter.ts` around lines 231 - 234, The current loop over
finding.references uses a truthy check for ref.startLine which incorrectly hides
valid numeric values like 0 and ignores endLine; update the logic inside the for
(const ref of finding.references) loop (in reporter.ts) to use explicit
undefined checks (e.g., ref.startLine !== undefined) and when startLine is
present include endLine if it is !== undefined to format "file:start-end" (or
"file:start" when no endLine) into the location variable before logging.

Comment on lines +29 to +31
getModel(): LanguageModel {
return this.config.model;
}

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if other providers exist that might be used with agent mode
rg -l 'implements LLMProvider' src/providers/

Repository: TRocket-Labs/vectorlint

Length of output: 102


🏁 Script executed:

cat src/providers/llm-provider.ts

Repository: TRocket-Labs/vectorlint

Length of output: 449


🏁 Script executed:

rg -A 3 "getModel" src/ -t ts

Repository: TRocket-Labs/vectorlint

Length of output: 1960


🏁 Script executed:

fd "orchestrator" src/ -t ts

Repository: TRocket-Labs/vectorlint

Length of output: 236


🏁 Script executed:

cat -n src/providers/vercel-ai-provider.ts

Repository: TRocket-Labs/vectorlint

Length of output: 9380


🏁 Script executed:

rg -B 10 -A 15 "getModelForAgentMode" src/cli/orchestrator.ts

Repository: TRocket-Labs/vectorlint

Length of output: 1496


getModel() violates the LLMProvider interface contract.

The LLMProvider interface does not include getModel(). The method is only on VercelAIProvider, but the orchestrator accesses it via an unsafe cast: provider as unknown as { getModel?: () => LanguageModel } in getModelForAgentMode().

While a runtime type guard prevents calling undefined methods and throws an explicit error if the method is missing, this pattern violates the interface contract principle stated in the coding guidelines: "Add providers by implementing LLMProvider or SearchProvider interfaces."

Consider one of these approaches:

  1. Extend the interface: Add getModel?(): LanguageModel to LLMProvider for providers that support agent mode
  2. Separate interface: Create an AgentCapableProvider interface extending LLMProvider
  3. Explicit typing: Narrow the provider type to VercelAIProvider when mode === 'agent'
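Option 2 can be sketched as follows; apart from getModel, the member names and the `LanguageModel` stand-in below are assumptions for illustration, not the actual vectorlint types:

```typescript
// Stand-in for the Vercel AI SDK's LanguageModel type.
type LanguageModel = unknown;

// Assumed minimal shape of the existing provider interface.
interface LLMProvider {
  evaluate(prompt: string): Promise<string>;
}

// Separate capability interface: providers that support agent mode.
interface AgentCapableProvider extends LLMProvider {
  getModel(): LanguageModel;
}

// Type guard so narrowing is compiler-checked instead of cast-based.
function isAgentCapable(p: LLMProvider): p is AgentCapableProvider {
  return typeof (p as Partial<AgentCapableProvider>).getModel === "function";
}

function getModelForAgentMode(provider: LLMProvider): LanguageModel {
  if (!isAgentCapable(provider)) {
    throw new Error("Provider does not support agent mode");
  }
  return provider.getModel(); // no unsafe `as unknown as` cast needed
}
```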
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/providers/vercel-ai-provider.ts` around lines 29-31, the code adds
getModel() only on VercelAIProvider, but callers use an unsafe cast in
getModelForAgentMode(), violating the LLMProvider contract. Fix this by updating
types instead of casting: either add an optional getModel?: () => LanguageModel
to the LLMProvider interface, or define a new AgentCapableProvider extending
LLMProvider with getModel(): LanguageModel and narrow the provider type in
getModelForAgentMode() to AgentCapableProvider (or explicitly to
VercelAIProvider) before calling getModel. Update references to LLMProvider,
VercelAIProvider, getModel, and getModelForAgentMode accordingly so the compiler
enforces the presence of getModel for agent mode.

Comment on lines +61 to +62
tools: {} as never,
diffContext: 'Changed: docs/quickstart.md',

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify whether this unsafe-cast pattern appears elsewhere.
rg -n --type=ts '\bas\s+never\b' tests src

Repository: TRocket-Labs/vectorlint

Length of output: 261


🏁 Script executed:

cat -n tests/agent/agent-executor.test.ts | head -100

Repository: TRocket-Labs/vectorlint

Length of output: 3010


🏁 Script executed:

fd -e ts . src tests | xargs rg -l "AgentTools|executeAgent" | head -10

Repository: TRocket-Labs/vectorlint

Length of output: 342


🏁 Script executed:

rg -A 5 -B 5 "tools: \{\} as never" tests/agent/agent-executor.test.ts

Repository: TRocket-Labs/vectorlint

Length of output: 698


🏁 Script executed:

cat -n src/agent/agent-executor.ts | head -50

Repository: TRocket-Labs/vectorlint

Length of output: 2088


🏁 Script executed:

rg -B 10 -A 10 "export.*runAgentExecutor|function runAgentExecutor" src/agent/agent-executor.ts

Repository: TRocket-Labs/vectorlint

Length of output: 731


🏁 Script executed:

rg -n "AgentTools|type.*tools" src/agent/agent-executor.ts

Repository: TRocket-Labs/vectorlint

Length of output: 443


Remove unnecessary as never cast; Partial<AgentTools> already accepts empty objects.

Lines 61 and 84 cast {} as never, which bypasses type checking. The function signature already expects tools: Partial<AgentTools>, and an empty object satisfies this without a cast. Replace it with a properly typed variable:

const tools: Partial<AgentTools> = {};

Also appears at: tests/orchestrator-filtering.test.ts:41 with provider: {} as never.
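A small illustration of why the cast is unnecessary; the `AgentTools` shape below is a stand-in for the real type in src/agent/agent-executor.ts, with member names borrowed from the PR's tool suite:

```typescript
// Stand-in for the real AgentTools type; the exact signatures are
// illustrative assumptions.
interface AgentTools {
  read_file: (path: string) => Promise<string>;
  lint: (path: string) => Promise<string>;
}

// An empty object is a valid Partial<AgentTools>, so no `as never` cast
// is needed to construct one.
const tools: Partial<AgentTools> = {};

function countTools(t: Partial<AgentTools>): number {
  return Object.keys(t).length;
}
```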

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/agent/agent-executor.test.ts` around lines 61-62, the test uses an
unnecessary "{} as never" cast, which bypasses TypeScript checks. Replace the
cast with a properly typed empty object by declaring the variable with the
Partial type (e.g., change the tools usage to const tools: Partial<AgentTools> =
{};). Likewise, remove other occurrences such as provider: {} as never and
replace them with const provider: Partial<ProviderType> = {} (using the actual
provider type from that test) so the tests rely on correct typing instead of
never casts.

Comment on lines +43 to +48
it('filters by glob pattern', async () => {
writeFileSync(path.join(tmpDir, 'docs', 'skip.ts'), 'API key = process.env.KEY');
const tool = createSearchContentTool(tmpDir);
const result = await tool.execute({ pattern: 'API key', glob: '*.md' });
expect(result).not.toContain('skip.ts');
});

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify how glob patterns behave in fast-glob for nested directories
# Check if *.md matches files in subdirectories or only root

# Look at how the glob is used in searchWithJs
rg -n -A5 'const glob = opts.glob' src/agent/tools/search-content.ts

# Check if fast-glob with *.md from cwd matches files in subdirectories
echo "Testing if fast-glob *.md matches nested files..."

Repository: TRocket-Labs/vectorlint

Length of output: 303


🏁 Script executed:

cat -n tests/agent/search-content.test.ts

Repository: TRocket-Labs/vectorlint

Length of output: 2350


Glob pattern *.md won't match nested files in docs/ subdirectory.

The test creates all files in docs/ (a.md, b.md, c.md, skip.ts) but searches with glob: '*.md', which only matches .md files in the root directory. Since no .md files exist at root level, the glob matches nothing and the search returns empty results. The assertion expect(result).not.toContain('skip.ts') passes, but for the wrong reason: not because glob filtering excludes .ts files, but because no files matched the pattern at all.

To test glob filtering properly, either use **/*.md to match nested files or place test files in the root directory.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/agent/search-content.test.ts` around lines 43-48, the test's glob
pattern only matches root-level files. Update the test that calls
createSearchContentTool(...) and tool.execute({ pattern: 'API key', glob: '*.md' })
to use a recursive glob like '**/*.md' (or alternatively create the test files
at the repo root) so the .md files under docs/ are actually matched. Changing
the glob argument passed to tool.execute to '**/*.md' verifies that skip.ts is
excluded by the glob filter rather than by there being no matches at all.
