feat(agent): add agent mode for cross-document evaluation by oshorefueled · Pull Request #74 · TRocket-Labs/vectorlint

oshorefueled · 2026-03-19T23:24:51Z

Why

VectorLint could only evaluate files in isolation, which missed cross-document issues such as corpus-level consistency and structural gaps. This change adds an explicit agent mode so users can run multi-file evidence gathering while preserving the existing lint workflow as the default path.

What

GitHub-scoped PR contents: 28 files across 13 commits.
Add --mode CLI option with lint (default) and agent paths.
Introduce agent executor and finding model:
- src/agent/agent-executor.ts
- src/agent/types.ts
- src/agent/merger.ts
Add read-only agent tool suite under src/agent/tools/:
- lint, read_file, search_content, search_files, list_directory
Wire orchestrator agent branch to run one agent execution per rule and merge findings.
Add line/json output support for agent findings.
Expose provider language model access for agent tool-loop execution.
Add comprehensive agent tests in tests/agent/.
Harden failure behavior after local review:
- surface agent execution failures (no silent success)
- propagate operational errors to exit behavior
- improve fallback parity/validation in content search
- improve path safety and check-mode scoring behavior in lint tool
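
For reference, the default-to-lint behaviour of the new option can be sketched as below. This is only an illustration: `Mode`, `MODES`, and `parseMode` are hypothetical names, and the actual parsing in src/cli/mode.ts and src/schemas/cli-schemas.ts may be structured differently.

```typescript
// Hypothetical sketch of --mode parsing; not the code from src/cli/mode.ts.
type Mode = "lint" | "agent";

const MODES: readonly Mode[] = ["lint", "agent"];

// Falls back to "lint" when the flag is omitted so existing invocations keep working.
function parseMode(raw: string | undefined): Mode {
  if (raw === undefined) return "lint";
  const match = MODES.find((m) => m === raw);
  if (match === undefined) {
    throw new Error(`Invalid --mode "${raw}". Expected one of: ${MODES.join(", ")}`);
  }
  return match;
}
```

Rejecting unknown values early keeps a typo such as `--mode agnet` from silently running the default lint path.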

Scope

In scope

Agent-mode architecture and tooling in CLI runtime.
Output path updates for agent findings.
Tests for new agent modules.

Out of scope

Any write/edit/exec tools for agent mode.
Bugsy-triggered review workflow changes.
Posting implementation artifacts as PR comments.

Behavior impact

Existing users remain on lint mode by default (--mode lint).
--mode agent now enables cross-document evaluation through read-only tools.
Agent-mode runtime failures now surface as operational failures instead of being silently treated as zero findings.
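
A minimal sketch of the last point, assuming each agent run yields a result with `ruleId`, `findings`, and an optional `error`. The interface names and the concrete exit codes (0, 1, 2) are assumptions made for illustration, not values taken from this PR.

```typescript
// Assumed shapes; the real types live in src/agent/types.ts and may differ.
interface AgentFinding {
  file: string;
  line?: number;
  message: string;
}

interface AgentRunResult {
  ruleId: string;
  findings: AgentFinding[];
  error?: string;
}

// Hypothetical mapping: operational failure wins over "findings present".
function exitCodeFor(results: AgentRunResult[]): number {
  const failed = results.filter((r) => r.error !== undefined);
  if (failed.length > 0) return 2; // a run failed: never report this as a clean pass
  const total = results.reduce((count, r) => count + r.findings.length, 0);
  return total > 0 ? 1 : 0;
}
```

The point of checking `error` first is that a failed run with an empty `findings` array is indistinguishable from a clean run unless the error is inspected.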

Risk

New mode introduces additional execution path complexity (tool loop + multi-rule concurrency).
Misconfiguration risk in model/provider setup is mitigated by explicit operational failure reporting.
Path traversal risk is reduced with stricter root checks.

How to test / verify

Checks run

npm run test:run
npm run lint

Manual verification

Ensure local config exists (.vectorlint.ini and provider environment variables), then run:
- npm run dev -- --mode agent <path>
Confirm agent output appears (line or json mode).
Validate non-agent behavior remains unchanged with default mode.

Rollback

Revert agent module additions and CLI --mode agent orchestration branch.
Revert provider interface extension (getLanguageModel) and output helper additions.
Re-run npm run lint && npm run test:run.

Summary by CodeRabbit

New Features
- Added --mode CLI option (lint | agent) and a full “agent” evaluation mode with integrated file reading, searching, directory listing, and rule-scoped linting.
- Agent findings include file/line context, messages, optional suggestions, and are rendered to the console or JSON.
Documentation
- Published an agentic capabilities execution log documenting agent-run output.
Tests
- Added comprehensive tests covering agent executor, tools, path utilities, types, and result merging.

coderabbitai · 2026-03-19T23:25:04Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d77e26c6-889c-4e70-8857-d07f49074afc

📥 Commits

Reviewing files that changed from the base of the PR and between 2a43d21 and fdb0d5a.

📒 Files selected for processing (5)

src/cli/commands.ts
src/cli/mode.ts
src/cli/orchestrator.ts
src/cli/types.ts
src/schemas/cli-schemas.ts

🚧 Files skipped from review as they are similar to previous changes (4)

src/cli/commands.ts
src/cli/types.ts
src/schemas/cli-schemas.ts
src/cli/orchestrator.ts

📝 Walkthrough

Walkthrough

Adds an agent execution path: an LLM-driven agent executor with workspace‑scoped tools (read/search/list/lint), type-validated agent finding schemas, CLI "agent" mode wiring, orchestration changes for concurrent rule runs, reporting/export adjustments, and comprehensive tests and docs.

Changes

Cohort / File(s)	Summary
Agent executor & orchestration `src/agent/agent-executor.ts`, `src/cli/orchestrator.ts`	New agent executor that builds prompts, exposes tools to the model, enforces output schema, and returns run results; orchestrator branches on `mode==='agent'`, runs agents concurrently, aggregates results, and alters printing/JSON output.
Agent subsystem exports & aggregation `src/agent/index.ts`, `src/agent/merger.ts`, `src/agent/types.ts`, `src/agent/tools/index.ts`	Barrel exports, type/Zod schemas for findings, and a simple collector that flattens agent run results.
Tools: read/search/list `src/agent/tools/read-file.ts`, `src/agent/tools/search-content.ts`, `src/agent/tools/search-files.ts`, `src/agent/tools/list-directory.ts`	Workspace-root constrained file read (line pagination/truncation), content search (ripgrep with JS fallback, context/limit), glob-based file discovery with limits, and sorted directory listing with dotfile support and truncation notices.
Tool: lint sub-tool `src/agent/tools/lint-tool.ts`	Rule-scoped lint tool exposing `execute({file, ruleId})`, blocks traversal, resolves rule, runs evaluator, normalizes judge-style vs violations-style outputs, and computes scores/violation lists.
Path utilities & safety `src/agent/tools/path-utils.ts`	Home expansion, cwd-relative resolution, and in-root verification using realpath normalization to prevent traversal/symlink escapes.
CLI mode, options & schemas `src/cli/mode.ts`, `src/cli/commands.ts`, `src/cli/types.ts`, `src/schemas/cli-schemas.ts`	Introduces `lint
Provider surface `src/providers/llm-provider.ts`, `src/providers/vercel-ai-provider.ts`	Adds `getLanguageModel()` to LLMProvider and implements it on VercelAIProvider so orchestrator/agent can obtain a model instance.
Output/reporting `src/output/reporter.ts`, `src/output/json-formatter.ts`	Adds `printAgentFinding` for inline/top-level rendering; JSON Issue shape extended with optional `source?: 'lint'
Tests `tests/agent/*` (many files)	Extensive Vitest coverage for executor, tools, path utils, types, merger, and listing/search behaviors.
Docs / logs `docs/logs/2026-03-17-agentic-capabilities.log.md`	Execution log documenting the agentic capabilities rollout and artifacts.
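
To make the "line pagination/truncation" behaviour of the file-read tool concrete, here is a rough sketch. `paginateLines`, its defaults, and the wording of the continuation notice are all hypothetical; the implementation in src/agent/tools/read-file.ts is not shown in this review and may work differently.

```typescript
// Hypothetical helper illustrating line-based pagination with a truncation notice.
interface Page {
  text: string;
  truncated: boolean;
}

function paginateLines(content: string, offset = 1, limit = 200): Page {
  const lines = content.split("\n");
  const start = Math.max(offset, 1) - 1; // convert 1-based offset to 0-based index
  const slice = lines.slice(start, start + limit);
  const remaining = lines.length - (start + slice.length);

  if (remaining <= 0) {
    return { text: slice.join("\n"), truncated: false };
  }

  // Tell the model where to resume instead of silently dropping the tail.
  const next = start + slice.length + 1;
  return {
    text: `${slice.join("\n")}\n\n[${remaining} more lines. Continue with offset=${next}.]`,
    truncated: true,
  };
}
```

Returning the next offset in the notice lets the model request the following page without guessing how much it has already seen.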

Sequence Diagram(s)

sequenceDiagram
    participant CLI as CLI
    participant Orch as Orchestrator
    participant Agent as Agent Executor
    participant LLM as Language Model
    participant Tools as Tool Set
    participant FS as File System

    CLI->>Orch: evaluateFiles(targets, { mode: "agent" })
    Orch->>Orch: build tools and diffContext
    Orch->>Orch: model = provider.getLanguageModel()
    Orch->>Agent: runAgentExecutor({ rule, cwd, model, tools, diffContext })
    Agent->>LLM: generateText(systemPrompt + toolSchemas, stepLimit)
    loop Agent-driven tool calls
        LLM->>Tools: invoke tool (read_file / search_content / list_directory / lint)
        Tools->>Tools: resolve path & isWithinRoot check
        Tools->>FS: read/list/search files
        FS-->>Tools: content/results
        Tools-->>LLM: tool response (paginated/truncated/no-match)
    end
    LLM-->>Agent: structured JSON { findings: [...] }
    Agent->>Agent: validate with AGENT_OUTPUT_SCHEMA
    Agent-->>Orch: { findings, ruleId, error? }
    Orch->>Orch: collectAgentFindings(allResults)
    Orch-->>CLI: print findings / emit JSON summary
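
The orchestrator portion of the diagram (one agent run per rule, then a flatten step) can be sketched as follows. The result shape follows the `{ findings, ruleId, error? }` return shown above; `RunAgent`, `runAllRules`, and the field types are assumptions, and the real `runAgentExecutor` takes more arguments (model, tools, diffContext) than this stub.

```typescript
// Assumed shapes for illustration only.
interface AgentFinding {
  ruleId: string;
  file: string;
  message: string;
}

interface AgentRunResult {
  ruleId: string;
  findings: AgentFinding[];
  error?: string;
}

// Stand-in for runAgentExecutor, reduced to the rule id.
type RunAgent = (ruleId: string) => Promise<AgentRunResult>;

// Runs one agent per rule concurrently. A thrown error is captured on the
// result so one failing rule does not discard the others.
async function runAllRules(ruleIds: string[], runAgent: RunAgent): Promise<AgentRunResult[]> {
  return Promise.all(
    ruleIds.map(async (ruleId) => {
      try {
        return await runAgent(ruleId);
      } catch (err) {
        const message = err instanceof Error ? err.message : String(err);
        return { ruleId, findings: [], error: message };
      }
    }),
  );
}

// Flattens per-rule results into a single list of findings.
function collectAgentFindings(results: AgentRunResult[]): AgentFinding[] {
  return results.reduce<AgentFinding[]>((all, r) => all.concat(r.findings), []);
}
```

Catching inside each mapped promise matters because `Promise.all` rejects as soon as any input rejects, which would otherwise hide the results of rules that completed successfully.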

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

feat: add zero-config style guide support with VECTORLINT.md #51: Related to composing system prompts and including VECTORLINT.md content used by the agent executor.
refactor(providers): migrate to Vercel AI SDK for unified LLM provider interface #62: Related to LLM provider API changes and VercelAIProvider model exposure (getLanguageModel()).
feat(token-usage): Add token usage tracking and cost calculation #40: Related to CLI/orchestrator changes that affect evaluation flow and mode handling.

Suggested reviewers

ayo6706

Poem

🐰 I hopped through roots and files today,

Tools in paw, I showed the way.
LLM asked, I fetched a line,
Findings stitched in tidy sign.
A rabbit cheers—safe paths, hooray!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 13.04% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the primary change: adding an agent mode for cross-document evaluation to the codebase.


coderabbitai

Actionable comments posted: 6

🧹 Nitpick comments (3)

tests/agent/list-directory.test.ts (1)

39-46: Minor redundancy: subdir is already created in beforeEach.

Line 40 re-creates subdir which is already set up by beforeEach at line 9. The recursive: true makes this harmless, but you could simplify by only writing the nested file.

♻️ Suggested simplification

   it('lists a specific subdirectory', async () => {
-    mkdirSync(path.join(TMP, 'subdir'), { recursive: true });
     writeFileSync(path.join(TMP, 'subdir', 'nested.md'), '');
     const tool = createListDirectoryTool(TMP);

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/agent/list-directory.test.ts` around lines 39 - 46, The test "lists a
specific subdirectory" redundantly recreates the subdir; remove the
mkdirSync(path.join(TMP, 'subdir'), { recursive: true }) call and keep only
writeFileSync(path.join(TMP, 'subdir', 'nested.md'), '') so the test uses the
setup from beforeEach; locate the test using the it block name and the
createListDirectoryTool(TMP) / tool.execute({ path: 'subdir' }) calls to update
the snippet.

src/agent/tools/list-directory.ts (1)

18-59: Consider using async/await for consistency.

The function returns Promise<string> but wraps synchronous operations in Promise.resolve/Promise.reject. Other tools in this module (e.g., search-files.ts) use async/await. For consistency and readability, consider making this an async function.

♻️ Proposed refactor using async/await

-    execute({ path: dirPath, limit }) {
-      try {
-        const absolutePath = resolveToCwd(dirPath || '.', cwd);
-
-        if (!isWithinRoot(absolutePath, cwd)) {
-          return Promise.reject(new Error(`Path traversal blocked: ${dirPath} is outside the allowed root`));
-        }
-
-        if (!existsSync(absolutePath)) {
-          return Promise.reject(new Error(`Directory not found: ${dirPath}`));
-        }
+    async execute({ path: dirPath, limit }) {
+      const absolutePath = resolveToCwd(dirPath || '.', cwd);
+
+      if (!isWithinRoot(absolutePath, cwd)) {
+        throw new Error(`Path traversal blocked: ${dirPath} is outside the allowed root`);
+      }
+
+      if (!existsSync(absolutePath)) {
+        throw new Error(`Directory not found: ${dirPath}`);
+      }
         // ... rest of implementation using return instead of Promise.resolve
-        return Promise.resolve(output);
-      } catch (error) {
-        return Promise.reject(error instanceof Error ? error : new Error(String(error)));
-      }
+      return output;
     },

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/agent/tools/list-directory.ts` around lines 18 - 59, Convert the execute
method to an async function and stop wrapping sync results in
Promise.resolve/Promise.reject: change the execute signature to async execute({
path: dirPath, limit }) and inside use the existing try/catch but return plain
strings or throw Errors instead of Promise.resolve/Promise.reject; keep using
resolveToCwd, isWithinRoot, existsSync, readdirSync, statSync and path.join as
before, compute effectiveLimit from DEFAULT_LIMIT, build results, and when
errors occur throw the Error (or rethrow in catch using throw error instanceof
Error ? error : new Error(String(error))). Also remove any
Promise.resolve/Promise.reject uses and adjust the truncation message logic to
return the combined string directly.

src/agent/agent-executor.ts (1)

140-154: Consider adding fallback parsing for structured output robustness.

The Vercel AI SDK v6.x throws NoOutputGeneratedError when structured output parsing fails (commonly when finishReason is not "stop", especially with AI Gateway or certain provider configurations). While the existing try/catch handles the exception, the recommended pattern in the SDK's issue tracker is to catch NoOutputGeneratedError and fall back to manually parsing result.text as JSON when available. This improves robustness without changing the happy path.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/agent/agent-executor.ts` around lines 140 - 154, Catch the specific
NoOutputGeneratedError thrown by generateText and implement a fallback that
attempts to parse the raw text output (e.g., result.text or response.outputText)
as JSON when structured parsing fails; update the try/catch around generateText
in agent-executor (where generateText, AGENT_OUTPUT_SCHEMA, and
stepCountIs(MAX_AGENT_STEPS) are used) to detect NoOutputGeneratedError, attempt
JSON.parse on the raw result text to extract findings and ruleId, and only
rethrow if parsing is impossible so the existing happy path using
response.output.findings remains unchanged.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/agent/tools/path-utils.ts`:
- Around line 17-30: The containment check in isWithinRoot is unsafe because
normalizePath can use realpathSync for one input and path.resolve for the other;
fix by applying the same normalization strategy to both paths: attempt
realpathSync on both and if either call throws, fall back to using path.resolve
for both so both normalizedRoot and normalizedPath are produced by the same
method; keep the final startsWith/equals check and ensure you still append
path.sep when checking prefix so path boundary logic (normalizedRoot + path.sep)
remains correct.

In `@src/agent/tools/search-content.ts`:
- Line 159: Update the tool description string (the description property in the
search-content tool definition) to accurately state the default glob filter as
"**/*.md" instead of "*.md"; verify consistency with the other occurrences that
use "**/*.md" (seen near the uses at lines referencing the default glob in this
file) so the description matches the actual default behavior.
- Around line 104-109: The code constructs a RegExp directly from the untrusted
pattern (pattern / new RegExp(...)) which allows ReDoS; fix by validating or
replacing the engine before compiling: either validate the pattern with a safety
check (e.g., integrate safe-regex to reject dangerous patterns) or switch to a
backtracking-free engine (e.g., use re2 to instantiate the regex instead of
RegExp) and/or wrap regex execution in a short timeout guard to abort
long-running matches; update the logic around the RegExp creation in
search-content.ts where regex = new RegExp(pattern, opts.ignoreCase ? 'i' : ''),
rejecting unsafe patterns (returning the same error string) or using re2 and
ensure opts.ignoreCase behavior is preserved.

In `@src/agent/tools/search-files.ts`:
- Around line 25-30: The fast-glob results in searchFiles (function in
src/agent/tools/search-files.ts) return paths relative to searchRoot when a
subdirectory `path` is provided, but caller tools (read_file, lint) expect
repo-relative paths, so prepend the provided `path` prefix to each match before
returning; import Node's `path` module, update the function description to state
it returns repository-relative paths, and ensure both the code branches that map
`matches` (the arrays created around lines with fg(...) and the later mapping at
36-42) add `path.join(pathPrefix, match)` (or similar) only when a non-empty
`path` argument was passed.

In `@src/cli/orchestrator.ts`:
- Around line 190-196: The RdJson/ValeJson branches instantiate RdJsonFormatter
and ValeJsonFormatter but never add agent findings, causing formatter.toJson()
to emit empty output; update the branches handling OutputFormat.RdJson and
OutputFormat.ValeJson in orchestrator.ts (the sections creating RdJsonFormatter
and ValeJsonFormatter) to either (A) log a warning that RdJson/ValeJson are not
supported in agent mode and fall back to the existing JSON formatter path (e.g.,
reuse the code that populates findings for JSON output), or (B) map the
collected agent findings into the RdJsonFormatter/ValeJsonFormatter APIs before
calling formatter.toJson(); implement one of these fixes and ensure the
warning/fallback is clearly emitted when in agent mode so users don’t get empty
output.

In `@src/output/reporter.ts`:
- Line 229: The ternary that computes loc uses a truthy check that treats 0 as
absent; update the expression that sets loc to check explicitly for undefined
(reference.startLine !== undefined) so a valid startLine of 0 is preserved—loc
should remain `${reference.file}:${reference.startLine}` when startLine is any
number, and fall back to reference.file only when startLine is strictly
undefined; modify the assignment that references reference.startLine in
src/output/reporter.ts accordingly.


ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 00616295-063b-477b-b4e0-26cada6dd396

📥 Commits

Reviewing files that changed from the base of the PR and between 27097c3 and 2a43d21.

📒 Files selected for processing (28)

docs/logs/2026-03-17-agentic-capabilities.log.md
src/agent/agent-executor.ts
src/agent/index.ts
src/agent/merger.ts
src/agent/tools/index.ts
src/agent/tools/lint-tool.ts
src/agent/tools/list-directory.ts
src/agent/tools/path-utils.ts
src/agent/tools/read-file.ts
src/agent/tools/search-content.ts
src/agent/tools/search-files.ts
src/agent/types.ts
src/cli/commands.ts
src/cli/orchestrator.ts
src/cli/types.ts
src/output/json-formatter.ts
src/output/reporter.ts
src/providers/llm-provider.ts
src/providers/vercel-ai-provider.ts
src/schemas/cli-schemas.ts
tests/agent/agent-executor.test.ts
tests/agent/list-directory.test.ts
tests/agent/merger.test.ts
tests/agent/path-utils.test.ts
tests/agent/read-file.test.ts
tests/agent/search-content.test.ts
tests/agent/search-files.test.ts
tests/agent/types.test.ts

coderabbitai · 2026-03-19T23:32:12Z

src/agent/tools/path-utils.ts

+export function isWithinRoot(absolutePath: string, root: string): boolean {
+  const normalizePath = (input: string): string => {
+    try {
+      return realpathSync(input);
+    } catch {
+      return path.resolve(input);
+    }
+  };
+
+  const normalizedPath = normalizePath(absolutePath);
+  const normalizedRoot = normalizePath(root);
+  return normalizedPath.startsWith(normalizedRoot + path.sep) ||
+    normalizedPath === normalizedRoot;
+}


⚠️ Potential issue | 🟠 Major

Inconsistent path normalization creates a security gap.

normalizePath applies realpathSync or path.resolve independently to each path. If one succeeds with realpathSync while the other falls back to path.resolve, the resulting paths use incompatible normalization strategies, making the startsWith check unreliable.

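For example, suppose /symlink is a symlink to /real/path. If root resolves via realpathSync to /real/path but absolutePath (/symlink/file) doesn't exist yet, it falls back to path.resolve, which collapses . and .. segments lexically but leaves symlinks unresolved. The containment check then compares /symlink/file against /real/path and may incorrectly fail; a symlinked parent that points outside the root may likewise incorrectly pass.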

Consider normalizing both paths with the same strategy:

Proposed fix

 export function isWithinRoot(absolutePath: string, root: string): boolean {
-  const normalizePath = (input: string): string => {
-    try {
-      return realpathSync(input);
-    } catch {
-      return path.resolve(input);
-    }
-  };
-
-  const normalizedPath = normalizePath(absolutePath);
-  const normalizedRoot = normalizePath(root);
+  let normalizedPath: string;
+  let normalizedRoot: string;
+
+  try {
+    // Try realpath for both - ensures symlinks are resolved consistently
+    normalizedRoot = realpathSync(root);
+    normalizedPath = realpathSync(absolutePath);
+  } catch {
+    // Fallback: use path.resolve for both to maintain consistency
+    normalizedRoot = path.resolve(root);
+    normalizedPath = path.resolve(absolutePath);
+  }
+
   return normalizedPath.startsWith(normalizedRoot + path.sep) ||
     normalizedPath === normalizedRoot;
 }
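
One limitation of falling back to path.resolve for both inputs is that symlinks stay unresolved whenever the target file does not exist yet. A different technique, shown here only as a sketch and not as the fix proposed above, is to resolve symlinks on the deepest ancestor that does exist and re-append the remaining segments. `realpathOfNearestExisting` is a hypothetical helper name.

```typescript
import { realpathSync } from "node:fs";
import * as path from "node:path";

// Resolves symlinks for the deepest existing ancestor of `p`, then re-appends
// the segments that do not exist yet.
function realpathOfNearestExisting(p: string): string {
  let current = path.resolve(p);
  const tail: string[] = [];
  for (;;) {
    try {
      return path.join(realpathSync(current), ...tail);
    } catch {
      const parent = path.dirname(current);
      if (parent === current) return path.join(current, ...tail); // reached the filesystem root
      tail.unshift(path.basename(current));
      current = parent;
    }
  }
}

// Both inputs go through the same normalization, so the prefix check compares like with like.
function isWithinRoot(candidate: string, root: string): boolean {
  const normalizedRoot = realpathOfNearestExisting(root);
  const normalizedPath = realpathOfNearestExisting(candidate);
  const prefix = normalizedRoot.endsWith(path.sep) ? normalizedRoot : normalizedRoot + path.sep;
  return normalizedPath === normalizedRoot || normalizedPath.startsWith(prefix);
}
```

Appending the separator before the prefix check is what stops a sibling such as `/repo-other` from being treated as inside `/repo`.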


🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/agent/tools/path-utils.ts` around lines 17 - 30, The containment check in isWithinRoot is unsafe because normalizePath can use realpathSync for one input and path.resolve for the other; fix by applying the same normalization strategy to both paths: attempt realpathSync on both and if either call throws, fall back to using path.resolve for both so both normalizedRoot and normalizedPath are produced by the same method; keep the final startsWith/equals check and ensure you still append path.sep when checking prefix so path boundary logic (normalizedRoot + path.sep) remains correct.

coderabbitai · 2026-03-19T23:32:12Z

src/agent/tools/search-content.ts

+  let regex: RegExp;
+  try {
+    regex = new RegExp(pattern, opts.ignoreCase ? 'i' : '');
+  } catch {
+    return 'Invalid regex pattern';
+  }


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

How to prevent ReDoS attacks in Node.js regex?

💡 Result:

To prevent ReDoS (Regular Expression Denial of Service) attacks in Node.js, follow these key strategies:

1. Write safe regex patterns: Avoid nested quantifiers like * inside *, + inside +, or overlapping alternatives that cause catastrophic backtracking. For example, replace /^([a-zA-Z]+)*$/ with /^[a-zA-Z]+$/ or /^(?:[a-zA-Z]{1,10})+$/. Use bounded repetitions like {1,1000} instead of unbounded * or +. Simplify patterns and make optional parts mandatory where possible, e.g., /^\w+(?:[-\s]\w+)*$/.
2. Use static analysis tools: Check regex safety with packages like safe-regex or safe-regex2. Example: const safeRegex = require('safe-regex'); if (!safeRegex(pattern)) { throw new Error('Unsafe regex'); }
3. Switch to RE2 engine: Use the 're2' npm package, which implements a backtracking-free regex engine guaranteeing linear time matching. Drop-in replacement for RegExp: const RE2 = require('re2'); const re = new RE2('your-pattern'); const match = re.exec(input); Note: RE2 lacks some JS features like lookaheads and backreferences, which are often ReDoS-prone anyway.
4. Input validation and limits: Limit input length before regex processing (e.g., if (input.length > 1000) reject). Use Express body-parser limits: app.use(express.json({ limit: '100kb' }));. Implement rate limiting.
5. Alternatives to regex: Prefer native string methods like trim, split, or every for simple validation when possible.
6. Additional defenses: Use ESLint plugin-security to detect unsafe regex. Monitor regex performance with console.time. Consider WAFs like Cloudflare for filtering.

No official Node.js documentation mandates specific ReDoS prevention, but these practices align with security best practices from OWASP, Node.js security resources, and community tools.

Citations:

1: https://medium.com/@arunangshudas/5-tips-to-avoid-regex-dos-attacks-in-node-js-ce1c1d7004a6

2: https://www.herodevs.com/blog-posts/preventing-redos-regular-expression-denial-of-service-attacks-in-express

3: https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS

4: https://www.nodejs-security.com/blog/regex-gone-wrong-parse-duration-npm-package-vulnerability

5: https://sonarsource.com/blog/vulnerable-regular-expressions-javascript

6: https://registry.npmjs.org/safe-regex

7: https://www.npmjs.com/package/re2?activeTab=code

🏁 Script executed:

# First, verify the code at lines 104-109
head -110 src/agent/tools/search-content.ts | tail -20

Repository: TRocket-Labs/vectorlint

Length of output: 537

🏁 Script executed:

# Check function context around the vulnerable code
sed -n '90,115p' src/agent/tools/search-content.ts

Repository: TRocket-Labs/vectorlint

Length of output: 725

🏁 Script executed:

# Check dependencies for safe-regex, re2, or similar security tools
cat package.json | grep -A 50 "dependencies"

Repository: TRocket-Labs/vectorlint

Length of output: 1103

ReDoS vulnerability from unsanitized user-supplied regex pattern.

The pattern parameter is passed directly to RegExp without validation. A malicious or complex pattern like (a+)+$ can cause catastrophic backtracking, freezing the process. No regex safety libraries are currently in dependencies.

Add regex validation before compilation. Options in order of strength:

Lightweight: Use the safe-regex npm package to validate pattern safety before instantiation

Strong: Switch to the re2 npm package (implements backtracking-free regex engine with O(n) guarantees)

Timeout guard: Add an execution timeout for regex operations to prevent indefinite blocking

Avoid simple regex-based pattern checks alone, as they cannot cover all ReDoS cases.
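
Since neither safe-regex nor re2 is currently a dependency, a dependency-free stopgap is worth sketching. Note that this is a different technique from the three options above: it treats the pattern as a literal string by default (escaping regex metacharacters) and only compiles it as a regex when explicitly requested. The `regex` option and the `buildMatcher` name are hypothetical, and the opt-in regex path would still need one of the listed defenses.

```typescript
// Escapes every regex metacharacter so the input is matched literally.
function escapeRegExp(literal: string): string {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// `regex` is a hypothetical opt-in flag; it is not part of the tool in this PR.
interface SearchOptions {
  ignoreCase?: boolean;
  regex?: boolean;
}

function buildMatcher(pattern: string, opts: SearchOptions = {}): RegExp {
  // Literal mode cannot backtrack catastrophically: the compiled pattern
  // contains no quantifiers or alternations.
  const source = opts.regex ? pattern : escapeRegExp(pattern);
  return new RegExp(source, opts.ignoreCase ? "i" : "");
}
```

With this default, a pattern such as `(a+)+$` is searched for as the literal text `(a+)+$` and cannot trigger backtracking unless the caller sets `regex: true`.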

🧰 Tools

🪛 ast-grep (0.41.1)

[warning] 105-105: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(pattern, opts.ignoreCase ? 'i' : '')
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html

(regexp-from-variable)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/agent/tools/search-content.ts` around lines 104 - 109, The code constructs a RegExp directly from the untrusted pattern (pattern / new RegExp(...)) which allows ReDoS; fix by validating or replacing the engine before compiling: either validate the pattern with a safety check (e.g., integrate safe-regex to reject dangerous patterns) or switch to a backtracking-free engine (e.g., use re2 to instantiate the regex instead of RegExp) and/or wrap regex execution in a short timeout guard to abort long-running matches; update the logic around the RegExp creation in search-content.ts where regex = new RegExp(pattern, opts.ignoreCase ? 'i' : ''), rejecting unsafe patterns (returning the same error string) or using re2 and ensure opts.ignoreCase behavior is preserved.

coderabbitai · 2026-03-19T23:32:12Z

src/agent/tools/search-content.ts

+export function createSearchContentTool(cwd: string): SearchContentTool {
+  return {
+    name: 'search_content',
+    description: 'Search file contents for a pattern. Returns file:line: matchedtext format. Default glob filter: *.md. Supports regex patterns.',


⚠️ Potential issue | 🟡 Minor

Description mentions *.md but default is **/*.md.

The description says "Default glob filter: *.md" but lines 97 and 168 use **/*.md. Update the description for accuracy.

📝 Proposed fix

-    description: 'Search file contents for a pattern. Returns file:line: matchedtext format. Default glob filter: *.md. Supports regex patterns.',
+    description: 'Search file contents for a pattern. Returns file:line: matchedtext format. Default glob filter: **/*.md. Supports regex patterns.',


🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/agent/tools/search-content.ts` at line 159, Update the tool description string (the description property in the search-content tool definition) to accurately state the default glob filter as "**/*.md" instead of "*.md"; verify consistency with the other occurrences that use "**/*.md" (seen near the uses at lines referencing the default glob in this file) so the description matches the actual default behavior.

coderabbitai · 2026-03-19T23:32:12Z

src/agent/tools/search-files.ts

+      const matches = await fg(pattern, {
+        cwd: searchRoot,
+        ignore: ['**/node_modules/**', '**/.git/**'],
+        onlyFiles: true,
+        followSymbolicLinks: false,
+      });


⚠️ Potential issue | 🟠 Major

Path reference mismatch when searching subdirectories.

When path is provided (e.g., "docs"), fast-glob returns paths relative to searchRoot (e.g., "api.md"). However, the agent passes these paths directly to read_file and lint tools, which expect paths relative to cwd (e.g., "docs/api.md"). This causes file-not-found errors when searching in subdirectories.

The description on line 15 states "Returns paths relative to the search root," but the consuming tools in agent-executor.ts expect paths relative to the repository root.

🛠️ Proposed fix: Prepend relative path prefix to results

       const matches = await fg(pattern, {
         cwd: searchRoot,
         ignore: ['**/node_modules/**', '**/.git/**'],
         onlyFiles: true,
         followSymbolicLinks: false,
       });

       if (matches.length === 0) {
         return 'No files found matching pattern';
       }

       const limited = matches.slice(0, effectiveLimit);
-      const output = limited.join('\n');
+      // Prepend the relative search directory so paths are relative to cwd
+      const relativePrefix = searchDir ? path.relative(cwd, searchRoot) : '';
+      const output = limited
+        .map(match => relativePrefix ? path.join(relativePrefix, match) : match)
+        .join('\n');

       if (matches.length > effectiveLimit) {
         return `${output}\n\n[${effectiveLimit} results limit reached. Refine your pattern for more specific results.]`;
       }

       return output;

You'll also need to import path:

 import fg from 'fast-glob';
+import * as path from 'node:path';
 import { resolveToCwd, isWithinRoot } from './path-utils.js';

And update the description:

-    description: 'Find files by glob pattern. Returns paths relative to the search root. Examples: **/*.md, docs/*.md, src/**/*.ts',
+    description: 'Find files by glob pattern. Returns paths relative to repo root. Examples: **/*.md, docs/*.md, src/**/*.ts',

Also applies to: 36-42

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/agent/tools/search-files.ts` around lines 25 - 30, The fast-glob results in searchFiles (function in src/agent/tools/search-files.ts) return paths relative to searchRoot when a subdirectory `path` is provided, but caller tools (read_file, lint) expect repo-relative paths, so prepend the provided `path` prefix to each match before returning; import Node's `path` module, update the function description to state it returns repository-relative paths, and ensure both the code branches that map `matches` (the arrays created around lines with fg(...) and the later mapping at 36-42) add `path.join(pathPrefix, match)` (or similar) only when a non-empty `path` argument was passed.
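
To make the path mismatch concrete, the following sketch isolates the rebasing step from the proposed fix. `rebaseMatches` is a hypothetical helper, not a function in the PR, and it uses `path.posix` so the result is the same on every platform; the real tool may prefer the platform-specific `path` functions.

```typescript
import * as path from 'node:path';

// Hypothetical helper (assumed for illustration): converts glob results that are
// relative to searchRoot into paths relative to cwd.
function rebaseMatches(cwd: string, searchRoot: string, matches: string[]): string[] {
  // Empty string when searchRoot and cwd are the same directory.
  const prefix = path.posix.relative(cwd, searchRoot);
  return matches.map((match) => (prefix ? path.posix.join(prefix, match) : match));
}

// Searching inside "docs" yields "api.md"; read_file and lint need "docs/api.md".
const rebased = rebaseMatches('/repo', '/repo/docs', ['api.md', 'guides/intro.md']);
const unchanged = rebaseMatches('/repo', '/repo', ['README.md']);
```

Deriving the prefix from `cwd` and `searchRoot` rather than from the raw `path` argument means an input such as `./docs/` would normalize to the same prefix, which is one reason the proposed fix uses `path.relative`.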

coderabbitai · 2026-03-19T23:32:12Z

src/cli/orchestrator.ts

+  } else if (outputFormat === OutputFormat.RdJson) {
+    const formatter = new RdJsonFormatter();
+    console.log(formatter.toJson());
+  } else if (outputFormat === OutputFormat.ValeJson) {
+    const formatter = new ValeJsonFormatter();
+    console.log(formatter.toJson());
+  }


⚠️ Potential issue | 🟡 Minor

Agent findings are not emitted for RdJson/ValeJson formats.

When outputFormat is RdJson or ValeJson, the formatters are created but no findings are added—formatter.toJson() outputs an empty structure. Users selecting these formats in agent mode will see empty output even when findings exist.

Consider either:

Logging a warning that these formats aren't supported in agent mode

Falling back to JSON format with a notice

Mapping agent findings to these formats if feasible

Suggested warning

   } else if (outputFormat === OutputFormat.RdJson) {
+    console.warn('[vectorlint] RdJson output format is not supported in agent mode. Use --output json instead.');
     const formatter = new RdJsonFormatter();
     console.log(formatter.toJson());
   } else if (outputFormat === OutputFormat.ValeJson) {
+    console.warn('[vectorlint] ValeJson output format is not supported in agent mode. Use --output json instead.');
     const formatter = new ValeJsonFormatter();
     console.log(formatter.toJson());
   }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/cli/orchestrator.ts` around lines 190 - 196, The RdJson/ValeJson branches instantiate RdJsonFormatter and ValeJsonFormatter but never add agent findings, causing formatter.toJson() to emit empty output; update the branches handling OutputFormat.RdJson and OutputFormat.ValeJson in orchestrator.ts (the sections creating RdJsonFormatter and ValeJsonFormatter) to either (A) log a warning that RdJson/ValeJson are not supported in agent mode and fall back to the existing JSON formatter path (e.g., reuse the code that populates findings for JSON output), or (B) map the collected agent findings into the RdJsonFormatter/ValeJsonFormatter APIs before calling formatter.toJson(); implement one of these fixes and ensure the warning/fallback is clearly emitted when in agent mode so users don’t get empty output.
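
Option (A) above, warn and fall back to JSON, could look roughly like the sketch below. Everything here is assumed for illustration: the `OutputFormat` values, the `resolveAgentOutputFormat` helper, and the warning text are not taken from the PR, and the real `OutputFormat` in the codebase may be defined differently.

```typescript
// Hypothetical stand-in for the project's OutputFormat; the string values are assumptions.
const OutputFormat = {
  Line: 'line',
  Json: 'json',
  RdJson: 'rdjson',
  ValeJson: 'vale-json',
} as const;
type OutputFormat = (typeof OutputFormat)[keyof typeof OutputFormat];

interface ResolvedFormat {
  format: OutputFormat;
  warning?: string;
}

// Decide which format agent mode should actually emit, and whether to warn the user.
function resolveAgentOutputFormat(requested: OutputFormat): ResolvedFormat {
  if (requested === OutputFormat.RdJson || requested === OutputFormat.ValeJson) {
    return {
      format: OutputFormat.Json,
      warning: `[vectorlint] ${requested} output is not supported in agent mode; falling back to json.`,
    };
  }
  return { format: requested };
}

const resolved = resolveAgentOutputFormat(OutputFormat.RdJson);
if (resolved.warning) console.warn(resolved.warning);
```

Keeping the decision in a pure function makes the fallback easy to unit test separately from the code that prints findings.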

coderabbitai · 2026-03-19T23:32:12Z

src/output/reporter.ts

+  if (finding.suggestion) console.log(`    Suggestion: ${finding.suggestion}`);
+  if (finding.references && finding.references.length > 0) {
+    for (const reference of finding.references) {
+      const loc = reference.startLine ? `${reference.file}:${reference.startLine}` : reference.file;


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify whether startLine is unconstrained numeric (so 0 is currently representable)
rg -n "startLine:\s*z\.number" src/agent/types.ts -C2

Repository: TRocket-Labs/vectorlint

Length of output: 338

🏁 Script executed:

sed -n '215,240p' src/output/reporter.ts

Repository: TRocket-Labs/vectorlint

Length of output: 799

🏁 Script executed:

cat -n src/agent/types.ts

Repository: TRocket-Labs/vectorlint

Length of output: 1569

🏁 Script executed:

rg "startLine.*[:\s]0" --type ts src/ tests/

Repository: TRocket-Labs/vectorlint

Length of output: 49

Use an explicit undefined check for startLine.

reference.startLine ? ... treats 0 as absent. Prefer checking reference.startLine !== undefined to handle all valid numeric values including 0.

💡 Suggested patch

-      const loc = reference.startLine ? `${reference.file}:${reference.startLine}` : reference.file;
+      const loc = reference.startLine !== undefined
+        ? `${reference.file}:${reference.startLine}`
+        : reference.file;

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/output/reporter.ts` at line 229, The ternary that computes loc uses a truthy check that treats 0 as absent; update the expression that sets loc to check explicitly for undefined (reference.startLine !== undefined) so a valid startLine of 0 is preserved—loc should remain `${reference.file}:${reference.startLine}` when startLine is any number, and fall back to reference.file only when startLine is strictly undefined; modify the assignment that references reference.startLine in src/output/reporter.ts accordingly.
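
The difference between the two checks is easiest to see side by side. In this sketch the `Reference` interface and both function names are assumptions for illustration; only the `file` and `startLine` fields come from the snippet under review.

```typescript
// Assumed shape, mirroring the fields used in the reporter snippet.
interface Reference {
  file: string;
  startLine?: number;
}

// Truthy check: drops the line number when startLine is 0.
function formatLocTruthy(ref: Reference): string {
  return ref.startLine ? `${ref.file}:${ref.startLine}` : ref.file;
}

// Explicit check: keeps any numeric startLine, including 0.
function formatLocExplicit(ref: Reference): string {
  return ref.startLine !== undefined ? `${ref.file}:${ref.startLine}` : ref.file;
}

const zeroLine: Reference = { file: 'docs/api.md', startLine: 0 };
const noLine: Reference = { file: 'docs/api.md' };

console.log(formatLocTruthy(zeroLine)); // docs/api.md
console.log(formatLocExplicit(zeroLine)); // docs/api.md:0
console.log(formatLocExplicit(noLine)); // docs/api.md
```

Whether 0 should be accepted at all is a separate question. If line numbers are meant to be 1-based, tightening the schema would be an alternative to changing the reporter.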

oshorefueled added 13 commits March 19, 2026 23:19

feat(agent): add agent finding schemas and types

16ff98f

feat(agent): add path utils for cwd-scoped tools

506aae1

feat(agent): add read_file tool with pagination

427951f

feat(agent): add search_files tool with glob support

0a49940

feat(agent): add list_directory tool

21c3236

feat(agent): add search_content tool with rg fallback

4abd1c8

feat(agent): add lint sub-tool and tool exports

69c9ced

feat(agent): add agent executor tool loop

b546480

feat(agent): add merger for agent findings

6f42a27

feat(agent): add reporter support for agent findings

676348b

feat(agent): wire CLI mode and agent orchestration

95edcc5

refactor(agent): surface failures and harden tool behavior

8ebcfb5

chore(agent): finalize execution log

2a43d21

coderabbitai bot reviewed Mar 19, 2026

refactor(cli): centralize evaluation mode constants

fdb0d5a

oshorefueled wants to merge 14 commits into main from feat/agent
