refactor: standardize evaluation terminology to Check and Judge by ayo6706 · Pull Request #52 · TRocket-Labs/vectorlint

ayo6706 · 2026-01-09T09:34:56Z

This refactor updates the codebase to align internal terminology with our product types. It replaces the legacy terms "Semi-Objective" and "Subjective" with the canonical "Check" and "Judge", respectively.

Changes

Schemas & Types: Renamed SemiObjective* to Check* and Subjective* to Judge* in schema.ts and types.ts.
Scoring Engine: Updated scorer.ts functions (calculateCheckScore, calculateJudgeScore) to match the new naming.
Evaluators: Updated BaseEvaluator and AccuracyEvaluator to use the new types and scoring functions.
Orchestrator: Refactored orchestrator.ts to route results using isCheckResult / isJudgeResult.
Tests: Updated scoring-types.test.ts to verify the new terminology.
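
For readers unfamiliar with the guard-based routing mentioned above, here is a minimal sketch of how `isCheckResult` / `isJudgeResult` could narrow a discriminated union. The result shapes, the enum string values, and the `describeResult` helper are assumptions for illustration; the real definitions live in `src/prompts/schema.ts` and may differ.

```typescript
// Hypothetical, simplified shapes; not copied from src/prompts/schema.ts.
enum EvaluationType {
  CHECK = "check", // string values are assumed
  JUDGE = "judge",
}

type CheckResult = {
  type: EvaluationType.CHECK;
  violations: Array<{ description: string; analysis: string }>;
};

type JudgeResult = {
  type: EvaluationType.JUDGE;
  criteria: Array<{ name: string; score: number }>;
};

type PromptEvaluationResult = CheckResult | JudgeResult;

// Type guards: the `result is ...` return type lets the compiler narrow the union.
function isCheckResult(result: PromptEvaluationResult): result is CheckResult {
  return result.type === EvaluationType.CHECK;
}

function isJudgeResult(result: PromptEvaluationResult): result is JudgeResult {
  return result.type === EvaluationType.JUDGE;
}

// Hypothetical consumer showing the narrowing in each branch.
function describeResult(result: PromptEvaluationResult): string {
  if (isCheckResult(result)) {
    return `check: ${result.violations.length} violation(s)`;
  }
  // Only JudgeResult remains in the union at this point.
  return `judge: ${result.criteria.length} criterion result(s)`;
}
```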

Reasoning

Consistent terminology reduces cognitive load for developers and prevents confusion between internal code and user-facing documentation.

Summary by CodeRabbit

Refactor
- Unified public naming across prompts, evaluators, and scoring: "Subjective"/"SemiObjective" → "Judge"/"Check".
- Renamed public types, scoring functions, and result types to the new vocabulary for a consistent API.
- No changes to behavior or user-visible outputs; improves clarity and integration consistency for consumers.


… to judge/check

- Rename `buildSubjectiveLLMSchema()` to `buildJudgeLLMSchema()`
- Rename `buildSemiObjectiveLLMSchema()` to `buildCheckLLMSchema()`
- Rename `SubjectiveLLMResult` type to `JudgeLLMResult`
- Rename `SemiObjectiveLLMResult` type to `CheckLLMResult`
- Rename `SubjectiveResult` type to `JudgeResult`
- Rename `SemiObjectiveItem` type to `CheckItem`
- Rename `SemiObjectiveResult` type to `CheckResult`
- Rename `isSubjectiveResult()` function to `isJudgeResult()`
- Rename `isSemiObjectiveResult()` function to `isCheckResult()`
- Update `EvaluationResult` union type to use new type names
- Align schema naming with recent rule type terminology changes

- Rename SemiObjectiveItem to CheckItem in chunking/merger.ts
- Rename SubjectiveResult to JudgeResult in cli/orchestrator.ts and cli/types.ts
- Rename SemiObjectiveResult to CheckResult in evaluators
- Rename buildSubjectiveLLMSchema to buildJudgeLLMSchema
- Rename buildSemiObjectiveLLMSchema to buildCheckLLMSchema
- Rename SubjectiveLLMResult to JudgeLLMResult
- Rename SemiObjectiveLLMResult to CheckLLMResult
- Rename calculateSubjectiveScore to calculateJudgeScore
- Rename calculateSemiObjectiveScore to calculateCheckScore
- Rename averageSubjectiveScores to averageJudgeScores
- Rename isSubjectiveResult to isJudgeResult
- Update all comments and documentation to use judge/check terminology
- Update test files to reflect new type names
- Ensures consistent naming convention across codebase following recent rule type refactoring

coderabbitai · 2026-01-09T09:35:07Z

Walkthrough

This PR renames evaluation-related types, functions, and predicates across the codebase, replacing "Subjective/Semi-Objective" terminology with "Judge/Check" naming (types, schema builders, scoring functions, predicates, and related imports/usages).

Changes

| Cohort / File(s) | Change Summary |
| --- | --- |
| Schema & Type Definitions `src/prompts/schema.ts` | Renamed public types and builders: `SubjectiveLLMResult`→`JudgeLLMResult`, `SemiObjectiveLLMResult`→`CheckLLMResult`, `SubjectiveResult`→`JudgeResult`, `SemiObjectiveItem`→`CheckItem`, `SemiObjectiveResult`→`CheckResult`; `buildSubjectiveLLMSchema`→`buildJudgeLLMSchema`, `buildSemiObjectiveLLMSchema`→`buildCheckLLMSchema`; type guards `isSubjectiveResult`→`isJudgeResult`, `isSemiObjectiveResult`→`isCheckResult`. |
| Scoring Implementation & Exports `src/scoring/scorer.ts`, `src/scoring/index.ts` | Renamed scoring options and functions: `SubjectiveScoringOptions`→`JudgeScoringOptions`, `SemiObjectiveScoringOptions`→`CheckScoringOptions` (added `promptSeverity?`); `calculateSubjectiveScore`→`calculateJudgeScore`, `calculateSemiObjectiveScore`→`calculateCheckScore`, `averageSubjectiveScores`→`averageJudgeScores`; updated signatures, types, and aggregation logic. |
| Evaluation Layer `src/evaluators/base-evaluator.ts`, `src/evaluators/evaluator.ts` | Replaced `EvaluationResult` with `PromptEvaluationResult`; replaced Subjective/Semi-Objective flows with Judge/Check: `runJudgeEvaluation` returns `JudgeResult`, `runCheckEvaluation` returns `CheckResult`; switched schema builders and scorer invocations to new names and updated chunk/result typings. |
| Accuracy Evaluator `src/evaluators/accuracy-evaluator.ts` | Swapped `calculateSemiObjectiveScore` → `calculateCheckScore`; updated evaluate return typing to `Promise<PromptEvaluationResult>` and related imports. |
| CLI & Orchestration `src/cli/orchestrator.ts`, `src/cli/types.ts` | Replaced predicate `isSubjectiveResult` → `isJudgeResult` in routing/control flow; updated `ProcessCriterionParams.result` and `ValidationParams.result` types from `SubjectiveResult`→`JudgeResult`; comments/docs adjusted. |
| Utilities / Chunking `src/chunking/merger.ts` | Updated `mergeViolations` signature to accept `chunkViolations: CheckItem[][]` and return `CheckItem[]`; internal flatten/dedup logic unchanged. |
| Tests `tests/scoring-types.test.ts` | Updated test typings/imports to use `JudgeLLMResult` and `CheckLLMResult` in mock LLM responses and generic test types. |

Sequence Diagram(s)

(omitted)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Rename rule types from semi-objective/subjective to check/judge #49 — Broad renaming of Subjective/SemiObjective → Judge/Check across schemas, evaluators, and scoring; strong overlap.
feat: implement content chunking and dedicated scoring logic for evaluators #39 — Directly modifies mergeViolations and related chunking types (SemiObjectiveItem → CheckItem); overlaps chunking changes.
feat: improve Issue Location Accuracy with Line Numbering and Fuzzy Matching #46 — Renames schema types and predicates to judge/check, impacting src/prompts/schema.ts and consumers.

Suggested reviewers

oshorefueled

"I hopped through code with a twitchy nose,
Names swapped like carrots in tidy rows,
Judge and Check now dance and play,
Old labels hopped away today,
Bugs stay small, the build still glows. 🐇✨"

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the primary change: a systematic refactor standardizing evaluation terminology from 'Semi-Objective/Subjective' to 'Check/Judge' across the entire codebase, affecting schemas, types, scoring functions, and evaluators. |

📜 Recent review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ad4cca6 and ea9743b.

📒 Files selected for processing (1)

tests/scoring-types.test.ts

🚧 Files skipped from review as they are similar to previous changes (1)

tests/scoring-types.test.ts

coderabbitai

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/prompts/schema.ts (1)
58-94: Schema/type drift: buildCheckLLMSchema() requires fields that CheckLLMResult marks optional.

Right now the schema requires quoted_text, context_before, context_after, description, analysis, suggestion, but CheckLLMResult allows most of these to be absent. This is a maintainability trap (types suggest “optional”, runtime schema rejects).
Proposed fix (align TS type to the schema)
 export type CheckLLMResult = {
   violations: Array<{
-    description: string;
-    analysis: string;
-    suggestion?: string;
-    quoted_text?: string;
-    context_before?: string;
-    context_after?: string;
+    line?: number;
+    description: string;
+    analysis: string;
+    suggestion: string;
+    quoted_text: string;
+    context_before: string;
+    context_after: string;
   }>;
 };
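
One way to keep a hand-written `required` list and a TypeScript type from drifting apart is a compile-time equality check. The sketch below is illustrative only: `CheckViolation`, `REQUIRED_FIELDS`, and `missingRequiredFields` are hypothetical names, and the field list is taken from the comment above rather than from the actual `buildCheckLLMSchema()` source.

```typescript
// Hypothetical violation shape, mirroring the fields named in the review comment.
type CheckViolation = {
  line?: number;
  description: string;
  analysis: string;
  suggestion: string;
  quoted_text: string;
  context_before: string;
  context_after: string;
};

// Keys of T that are not optional.
type RequiredKeys<T> = {
  [K in keyof T]-?: {} extends Pick<T, K> ? never : K;
}[keyof T];

// The list a JSON schema's `required` array would carry (assumed layout).
const REQUIRED_FIELDS = [
  "description",
  "analysis",
  "suggestion",
  "quoted_text",
  "context_before",
  "context_after",
] as const;

type ListedKeys = (typeof REQUIRED_FIELDS)[number];

// Resolves to `true` only when both key sets are identical, otherwise `never`.
type AssertEqual<A, B> = [A] extends [B] ? ([B] extends [A] ? true : never) : never;

// Compile-time drift check: this assignment stops type-checking if the list
// and the type disagree about which fields are required.
const requiredFieldsMatchType: AssertEqual<ListedKeys, RequiredKeys<CheckViolation>> = true;
void requiredFieldsMatchType;

// Runtime counterpart: report required string fields missing from parsed output.
function missingRequiredFields(value: Record<string, unknown>): string[] {
  return REQUIRED_FIELDS.filter((field) => typeof value[field] !== "string");
}
```

If a field such as `suggestion` were made optional in the type but left in the list, the `requiredFieldsMatchType` line would fail at compile time instead of surfacing as a runtime schema rejection.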

🧹 Nitpick comments (7)

src/prompts/schema.ts (2)
96-122: Optional: include line?: number in JudgeLLMResult violations for parity with the schema.

The judge schema allows line, but JudgeLLMResult drops it. That’s fine if intentionally unused, but it’s surprising given the schema.
Possible tweak
 export type JudgeLLMResult = {
   criteria: Array<{
@@
     violations: Array<{
+      line?: number;
       quoted_text: string;
       context_before: string;
       context_after: string;
       analysis: string;
       suggestion: string;
     }>;
   }>;
 };

123-173: Public API naming: export type EvaluationResult is easy to confuse with CLI’s aggregated EvaluationResult.

You’re already working around it via import-aliasing (EvaluationResult as PromptEvaluationResult) elsewhere. Consider renaming this to something like PromptEvaluationResult to reduce repeated aliasing and accidental misuse.

src/cli/orchestrator.ts (1)
538-645: Future-proof routing: prefer isCheckResult(result) over !isJudgeResult(result).

Today it’s equivalent (two-type union), but !isJudgeResult will silently treat any future third type as “Check”. Using the explicit guard makes intent and safety clearer.
Proposed change
-import { isJudgeResult } from '../prompts/schema';
+import { isCheckResult, isJudgeResult } from '../prompts/schema';
@@
-  // Handle Check Result
-  if (!isJudgeResult(result)) {
+  // Handle Check Result
+  if (isCheckResult(result)) {
     const severity = result.severity;
     const violationCount = result.violations.length;
@@
-  }
+  } else if (!isJudgeResult(result)) {
+    // If additional evaluation types are ever introduced, fail loudly.
+    throw new Error(`Unknown evaluation result type: ${(result as { type?: unknown }).type}`);
+  }
 
-  // Handle Judge Result
+  // Handle Judge Result
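
An alternative to chaining guards is an exhaustive `switch` with a `never` check. With that pattern, adding a third member to the union becomes a compile error at the routing site, not a silent fall-through. This is a sketch with hypothetical minimal shapes and string discriminators, not the orchestrator's actual code.

```typescript
// Hypothetical minimal shapes; only the `type` discriminator matters here.
type CheckResult = { type: "check"; violations: string[] };
type JudgeResult = { type: "judge"; criteria: string[] };
type PromptEvaluationResult = CheckResult | JudgeResult;

// Reached only if a value escapes the union at runtime. At compile time,
// passing anything other than `never` is a type error.
function assertNever(value: never): never {
  throw new Error(`Unknown evaluation result type: ${JSON.stringify(value)}`);
}

// Hypothetical router standing in for the orchestrator's branching.
function route(result: PromptEvaluationResult): string {
  switch (result.type) {
    case "check":
      return `check:${result.violations.length}`;
    case "judge":
      return `judge:${result.criteria.length}`;
    default:
      // `result` has type `never` here while the union is fully handled.
      return assertNever(result);
  }
}
```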

tests/scoring-types.test.ts (1)

80-135: Check mock result is aligned with CheckLLMResult; consider renaming id/name from semi → check for consistency.

src/scoring/scorer.ts (3)
48-100: Guard wordCount (and consider clamping strictness) to avoid surprising scores from invalid inputs.
Proposed diff
 export function calculateCheckScore(
   violations: CheckItem[],
   wordCount: number,
   options: CheckScoringOptions = {}
 ): CheckResult {
-  const strictness = resolveStrictness(options.strictness);
+  const strictness = resolveStrictness(options.strictness);
+  const safeWordCount = Math.max(1, wordCount);

   // Map items to violation format
   const mappedViolations = violations.map((item) => ({
     analysis: item.analysis,
@@
   }));

   // Density Calculation: Violations per 100 words
-  const density = (mappedViolations.length / wordCount) * 100;
+  const density = (mappedViolations.length / safeWordCount) * 100;
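
To see why the guard matters, the density step can be isolated in a small helper. `violationDensity` is a hypothetical name; only the formula and the `Math.max(1, wordCount)` clamp come from the diff above.

```typescript
// Hypothetical helper isolating the density step: violations per 100 words.
function violationDensity(violationCount: number, wordCount: number): number {
  // Clamp so that empty or invalid (zero or negative) word counts
  // cannot produce Infinity or NaN through division.
  const safeWordCount = Math.max(1, wordCount);
  return (violationCount / safeWordCount) * 100;
}
```

Without the clamp, `2 / 0` would yield `Infinity` and `0 / 0` would yield `NaN`, either of which would propagate into the final score.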

108-144: Use ?? instead of || for criterion weights to avoid treating weight: 0 as “unset”.
Proposed diff
-    const weight = definedCriterion?.weight || 1;
+    const weight = definedCriterion?.weight ?? 1;
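
The difference only shows up for falsy but defined weights. A small sketch, using a hypothetical `Criterion` shape and helper names:

```typescript
// Hypothetical criterion shape; only `weight` matters for this comparison.
type Criterion = { name: string; weight?: number };

// `||` falls back on any falsy value, including an explicit 0.
function weightWithOr(criterion?: Criterion): number {
  return criterion?.weight || 1;
}

// `??` falls back only on null or undefined, so an explicit 0 is preserved.
function weightWithNullish(criterion?: Criterion): number {
  return criterion?.weight ?? 1;
}
```

With `{ name: "tone", weight: 0 }`, the first helper returns 1 and the second returns 0; both return 1 when the criterion or its weight is missing.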

147-165: averageJudgeScores warning on length mismatch is good; consider failing fast in tests/dev to avoid silently underweighting missing word counts.

Also applies to: 232-281

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0fddb2c and f79bb9b.

📒 Files selected for processing (9)

src/chunking/merger.ts
src/cli/orchestrator.ts
src/cli/types.ts
src/evaluators/accuracy-evaluator.ts
src/evaluators/base-evaluator.ts
src/prompts/schema.ts
src/scoring/index.ts
src/scoring/scorer.ts
tests/scoring-types.test.ts

🧰 Additional context used

📓 Path-based instructions (2)

src/**/*.ts

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.ts: Use TypeScript ESM with explicit imports and narrow types
Use 2-space indentation; avoid trailing whitespace
Maintain strict TypeScript with no any; use unknown + schema validation for external data
Use custom error types with proper inheritance; catch blocks use unknown type

Files:

src/cli/types.ts
src/cli/orchestrator.ts
src/chunking/merger.ts
src/evaluators/accuracy-evaluator.ts
src/evaluators/base-evaluator.ts
src/prompts/schema.ts
src/scoring/scorer.ts
src/scoring/index.ts

tests/**/*.test.ts

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.test.ts: Write tests using Vitest framework with focus on config parsing, file discovery, schema/structured output, and locator
Use dependency injection in tests: mock providers; do not hit network in unit tests

Files:

tests/scoring-types.test.ts

🧠 Learnings (4)

📚 Learning: 2025-12-28T19:43:51.189Z

Learnt from: CR
Repo: TRocket-Labs/vectorlint PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-28T19:43:51.189Z
Learning: Applies to src/output/**/*.ts : IDs should be shown as `PromptId.CriterionId` in output

Applied to files:

src/cli/types.ts
src/prompts/schema.ts

📚 Learning: 2025-12-28T19:43:51.189Z

Learnt from: CR
Repo: TRocket-Labs/vectorlint PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-28T19:43:51.189Z
Learning: Applies to src/boundaries/**/*.ts : Use Zod schemas for boundary validation of all external data (files, CLI, env, APIs) at system boundaries

Applied to files:

src/evaluators/accuracy-evaluator.ts

📚 Learning: 2025-12-28T19:43:51.189Z

Learnt from: CR
Repo: TRocket-Labs/vectorlint PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-28T19:43:51.189Z
Learning: Applies to src/providers/**/*.ts : Depend on `LLMProvider` and `SearchProvider` interfaces; keep providers thin (transport only)

Applied to files:

tests/scoring-types.test.ts

📚 Learning: 2025-12-28T19:43:51.189Z

Learnt from: CR
Repo: TRocket-Labs/vectorlint PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-28T19:43:51.189Z
Learning: Applies to src/**/*.ts : Maintain strict TypeScript with no `any`; use `unknown` + schema validation for external data

Applied to files:

src/prompts/schema.ts

🧬 Code graph analysis (7)

src/cli/types.ts (1)

src/prompts/schema.ts (1)

JudgeResult (123-143)

src/cli/orchestrator.ts (1)

src/prompts/schema.ts (1)

isJudgeResult (175-179)

src/chunking/merger.ts (2)

src/chunking/index.ts (1)

mergeViolations (3-3)

src/prompts/schema.ts (1)

CheckItem (145-152)

src/evaluators/accuracy-evaluator.ts (1)

src/scoring/scorer.ts (1)

calculateCheckScore (48-100)

tests/scoring-types.test.ts (2)

src/providers/llm-provider.ts (1)

LLMResult (3-6)

src/prompts/schema.ts (2)

JudgeLLMResult (96-110)

CheckLLMResult (112-121)

src/prompts/schema.ts (1)

src/cli/types.ts (1)

EvaluationResult (33-41)

src/scoring/scorer.ts (2)

src/scoring/index.ts (5)

CheckScoringOptions (2-2)

JudgeScoringOptions (3-3)

calculateCheckScore (4-4)

calculateJudgeScore (5-5)

averageJudgeScores (6-6)

src/prompts/schema.ts (4)

CheckItem (145-152)

CheckResult (154-171)

JudgeLLMResult (96-110)

JudgeResult (123-143)

🔇 Additional comments (16)

src/prompts/schema.ts (2)

4-56: buildJudgeLLMSchema() looks consistent and strict in the right places.

The optional line field (present but not required) matches the intent of the rest of the schema and avoids hard-failing on missing line numbers.

175-185: Type guards look correct.

result.type === EvaluationType.JUDGE/CHECK is the cleanest discriminator here.

src/cli/types.ts (3)

10-12: Import aliasing is appropriate to avoid the EvaluationResult name collision.

No concerns with EvaluationResult as PromptEvaluationResult here.

103-106: Same for ValidationParams.result: JudgeResult.

Good tightening of types (validation is judge-only).

87-111: The ProcessCriterionParams.result narrowing to JudgeResult is correct and properly enforced.

The call to extractAndReportCriterion at line 662 is correctly placed inside the Judge Result branch (after the if (!isJudgeResult(result)) guard). No changes needed.

src/evaluators/accuracy-evaluator.ts (1)

13-14: Rename propagation looks correct: calculateCheckScore used in the no-claims path.

The call signature matches the provided scorer snippet (violations array + wordCount + options).

Also applies to: 66-76

src/cli/orchestrator.ts (1)

10-10: Import updated as expected for new terminology.

src/chunking/merger.ts (1)

1-5: Type rename and deduplication logic verified as correct.

The CheckItem type rename is fully applied with no remaining SemiObjectiveItem references. The merge/dedupe implementation correctly flattens input and deduplicates using the composite key (quoted_text + description + analysis). Call site in base-evaluator.ts properly passes CheckLLMResult["violations"][] which matches the CheckItem[][] signature.
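
The flatten-and-dedupe behavior described here could look roughly like the following. This is a sketch, not the actual `mergeViolations` body: the `CheckItem` shape is simplified, `mergeViolationsSketch` is a hypothetical name, and encoding the composite key with `JSON.stringify` is an assumption.

```typescript
// Hypothetical, simplified item shape; the real CheckItem may carry more fields.
type CheckItem = {
  description: string;
  analysis: string;
  quoted_text?: string;
};

// Flatten per-chunk violation lists and drop repeats that share the same
// (quoted_text, description, analysis) triple. The first occurrence wins.
function mergeViolationsSketch(chunkViolations: CheckItem[][]): CheckItem[] {
  const seen = new Set<string>();
  const merged: CheckItem[] = [];
  for (const chunk of chunkViolations) {
    for (const item of chunk) {
      // JSON-encode the triple so field boundaries stay unambiguous.
      const key = JSON.stringify([item.quoted_text ?? "", item.description, item.analysis]);
      if (!seen.has(key)) {
        seen.add(key);
        merged.push(item);
      }
    }
  }
  return merged;
}
```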

tests/scoring-types.test.ts (3)

6-9: Type import rename looks consistent with the new public schema surface.

35-77: Judge mock result is aligned with JudgeLLMResult (score 1–4) and weighted scoring expectations.

137-159: Empty-violations case correctly exercises the “perfect score” path with CheckLLMResult.

src/scoring/index.ts (1)

1-7: Public scoring exports are cleanly updated to Check/Judge terminology.

src/evaluators/base-evaluator.ts (3)

4-12: Renames to Judge/Check schema + scoring functions are consistent and improve terminology alignment.

Also applies to: 22-26, 32-42

186-231: Check evaluation path correctly scores once from merged violations using original (non-line-numbered) word count.

117-176: Remove this concern — countWords() explicitly strips line number prefixes before counting words.

The countWords() function in src/chunking/utils.ts includes logic to remove line number prefixes using the regex /^\d+\t/gm before word counting. This was intentionally designed to ensure accurate word counts for line-numbered content, preventing any skew in chunk weighting.
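
As an illustration of the prefix stripping, the sketch below applies the regex quoted above before counting. Only the `/^\d+\t/gm` pattern comes from this comment; the whitespace-splitting step and the name `countWordsSketch` are assumptions, and the real `countWords()` may tokenize differently.

```typescript
// Sketch of word counting that ignores "<line number><TAB>" prefixes.
function countWordsSketch(text: string): number {
  // With the `m` flag, `^` matches at the start of every line, so each
  // leading "<digits>\t" prefix is removed before counting.
  const withoutPrefixes = text.replace(/^\d+\t/gm, "");
  // Assumed tokenization: split on whitespace runs and drop empty pieces.
  return withoutPrefixes.split(/\s+/).filter((word) => word.length > 0).length;
}
```

For the input `"1\tHello world\n2\tfoo"`, this counts three words; counting without the replace step would also count the two line numbers.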

Likely an incorrect or invalid review comment.

src/scoring/scorer.ts (1)

9-24: New scoring option types are clear and match the Check/Judge rename intent.

coderabbitai

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/prompts/schema.ts (1)
112-121: Critical: Schema/type mismatch for CheckLLMResult.

The buildCheckLLMSchema function marks suggestion, quoted_text, context_before, and context_after as required fields (lines 80-87), but the CheckLLMResult type marks them as optional. This creates a mismatch between what the LLM is forced to return and what TypeScript enforces.
🔧 Proposed fix to align type with schema
 export type CheckLLMResult = {
   violations: Array<{
     description: string;
     analysis: string;
-    suggestion?: string;
-    quoted_text?: string;
-    context_before?: string;
-    context_after?: string;
+    suggestion: string;
+    quoted_text: string;
+    context_before: string;
+    context_after: string;
   }>;
 };

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f79bb9b and ad4cca6.

📒 Files selected for processing (5)

src/cli/types.ts
src/evaluators/accuracy-evaluator.ts
src/evaluators/base-evaluator.ts
src/evaluators/evaluator.ts
src/prompts/schema.ts

🚧 Files skipped from review as they are similar to previous changes (2)

src/evaluators/accuracy-evaluator.ts
src/cli/types.ts

🧰 Additional context used

📓 Path-based instructions (1)

src/**/*.ts

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.ts: Use TypeScript ESM with explicit imports and narrow types
Use 2-space indentation; avoid trailing whitespace
Maintain strict TypeScript with no any; use unknown + schema validation for external data
Use custom error types with proper inheritance; catch blocks use unknown type

Files:

src/evaluators/evaluator.ts
src/prompts/schema.ts
src/evaluators/base-evaluator.ts

🧠 Learnings (1)

📚 Learning: 2025-12-28T19:43:51.189Z

Learnt from: CR
Repo: TRocket-Labs/vectorlint PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-28T19:43:51.189Z
Learning: Applies to src/output/**/*.ts : IDs should be shown as `PromptId.CriterionId` in output

Applied to files:

src/prompts/schema.ts

🧬 Code graph analysis (2)

src/evaluators/evaluator.ts (1)

src/prompts/schema.ts (1)

PromptEvaluationResult (173-173)

src/evaluators/base-evaluator.ts (4)

src/prompts/schema.ts (7)

PromptEvaluationResult (173-173)

JudgeResult (123-143)

buildJudgeLLMSchema (4-56)

JudgeLLMResult (96-110)

CheckResult (154-171)

buildCheckLLMSchema (58-94)

CheckLLMResult (112-121)

src/scoring/scorer.ts (3)

calculateJudgeScore (108-144)

averageJudgeScores (147-281)

calculateCheckScore (48-100)

src/output/line-numbering.ts (1)

prependLineNumbers (13-18)

src/chunking/utils.ts (1)

countWords (5-10)

🔇 Additional comments (6)

src/prompts/schema.ts (2)

4-94: LGTM!

The schema builders have been correctly renamed to buildJudgeLLMSchema and buildCheckLLMSchema, with appropriate internal schema names (vectorlint_judge_result and vectorlint_check_result). The structure and strict mode usage are appropriate.

96-110: LGTM!

The type definitions and type guards have been successfully renamed to use the new Check/Judge terminology. The discriminated union pattern with EvaluationType is correctly implemented in the type guards.

Also applies to: 123-185

src/evaluators/evaluator.ts (1)

1-9: LGTM!

The Evaluator interface has been correctly updated to use the renamed PromptEvaluationResult type. The refactoring is clean and maintains the same contract.

src/evaluators/base-evaluator.ts (3)

4-12: LGTM!

The imports and class-level changes correctly adopt the new Check/Judge terminology. All type imports, scoring function imports, comments, and method signatures are properly updated.

Also applies to: 23-26, 33-50

117-176: LGTM!

The runJudgeEvaluation method has been correctly updated to use the new terminology: buildJudgeLLMSchema, JudgeLLMResult, JudgeResult, calculateJudgeScore, and averageJudgeScores. The implementation logic remains unchanged and correct.

186-231: LGTM!

The runCheckEvaluation method has been correctly updated to use the new terminology: buildCheckLLMSchema, CheckLLMResult, CheckResult, and calculateCheckScore. The array type annotation CheckLLMResult["violations"][] is properly used for type safety.

- Rename test prompt id from "test-semi" to "test-check"
- Update prompt metadata id to match new naming convention
- Change prompt name from "Test Semi" to "Test Check"
- Align test identifiers with refactored check/judge type naming

* refactor(prompts): rename schema types from subjective/semi-objective to judge/check
  - Rename `buildSubjectiveLLMSchema()` to `buildJudgeLLMSchema()`
  - Rename `buildSemiObjectiveLLMSchema()` to `buildCheckLLMSchema()`
  - Rename `SubjectiveLLMResult` type to `JudgeLLMResult`
  - Rename `SemiObjectiveLLMResult` type to `CheckLLMResult`
  - Rename `SubjectiveResult` type to `JudgeResult`
  - Rename `SemiObjectiveItem` type to `CheckItem`
  - Rename `SemiObjectiveResult` type to `CheckResult`
  - Rename `isSubjectiveResult()` function to `isJudgeResult()`
  - Rename `isSemiObjectiveResult()` function to `isCheckResult()`
  - Update `EvaluationResult` union type to use new type names
  - Align schema naming with recent rule type terminology changes
* refactor(schema): rename subjective/semi-objective types to judge/check
  - Rename SemiObjectiveItem to CheckItem in chunking/merger.ts
  - Rename SubjectiveResult to JudgeResult in cli/orchestrator.ts and cli/types.ts
  - Rename SemiObjectiveResult to CheckResult in evaluators
  - Rename buildSubjectiveLLMSchema to buildJudgeLLMSchema
  - Rename buildSemiObjectiveLLMSchema to buildCheckLLMSchema
  - Rename SubjectiveLLMResult to JudgeLLMResult
  - Rename SemiObjectiveLLMResult to CheckLLMResult
  - Rename calculateSubjectiveScore to calculateJudgeScore
  - Rename calculateSemiObjectiveScore to calculateCheckScore
  - Rename averageSubjectiveScores to averageJudgeScores
  - Rename isSubjectiveResult to isJudgeResult
  - Update all comments and documentation to use judge/check terminology
  - Update test files to reflect new type names
  - Ensures consistent naming convention across codebase following recent rule type refactoring
* chore: Rename EvalutionResult to PromptEvaluationResult
* test(scoring-types): Update check evaluation test identifiers
  - Rename test prompt id from "test-semi" to "test-check"
  - Update prompt metadata id to match new naming convention
  - Change prompt name from "Test Semi" to "Test Check"
  - Align test identifiers with refactored check/judge type naming

ayo6706 added 2 commits January 9, 2026 10:26

coderabbitai bot reviewed Jan 9, 2026

chore: Rename EvalutionResult to PromptEvaluationResult

ad4cca6

coderabbitai bot reviewed Jan 9, 2026

oshorefueled approved these changes Jan 13, 2026

oshorefueled merged commit 03ed298 into main Jan 13, 2026
3 checks passed

oshorefueled deleted the refactor/check-judge-terminology branch January 13, 2026 11:04

This was referenced Mar 2, 2026

Add PAT (Pay A Tax) technique for LLM evaluation with gate-check structured output #65

Merged

Only score based on surfaced violations only #69

Merged

oshorefueled merged 4 commits into main from refactor/check-judge-terminology
