feat: config file handling — cascading settings from multiple sources #10

@Jamie-BitFlight

Summary

Add a configuration system that loads settings from multiple file formats and locations with cascading priority, enabling per-project customization of detection behavior without modifying the core script.

Config File Sources (Priority Order)

Load and merge config from these locations, listed from highest to lowest priority (the highest-priority source wins):

  1. CLI/environment: HALLUCINATION_DETECTOR_CONFIG env var pointing to a file path
  2. Project root: .hallucination-detectorrc.cjs (CJS module, allows dynamic config)
  3. Project root: project.json, hallucination-detector key
  4. Project root: pyproject.toml, [tool.hallucination-detector] section
  5. Home directory: ~/.hallucination-detectorrc.cjs (user-level defaults)
  6. Built-in defaults: current hardcoded values (zero-config still works)
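
To make the lookup order concrete, here is a minimal sketch of how the candidate sources could be enumerated. The function name listConfigSources, the kind labels, and resolving a relative env path against the project root are all assumptions for illustration and are not decided by this issue.

```javascript
const path = require('node:path');

// Hypothetical helper: list candidate config sources, highest priority first.
// `kind` is an illustrative label telling a loader how to read each file.
function listConfigSources({ env, projectRoot, homeDir }) {
  const sources = [];
  if (env.HALLUCINATION_DETECTOR_CONFIG) {
    // Assumption: a relative path in the env var is resolved against the project root.
    sources.push({
      kind: 'env',
      file: path.resolve(projectRoot, env.HALLUCINATION_DETECTOR_CONFIG),
    });
  }
  sources.push(
    { kind: 'cjs', file: path.join(projectRoot, '.hallucination-detectorrc.cjs') },
    { kind: 'json-key', file: path.join(projectRoot, 'project.json') },
    { kind: 'toml-section', file: path.join(projectRoot, 'pyproject.toml') },
    { kind: 'cjs', file: path.join(homeDir, '.hallucination-detectorrc.cjs') },
    { kind: 'defaults', file: null }, // built-in values, no file on disk
  );
  return sources;
}
```

A loader would walk this list, skip files that do not exist, and merge what it finds so that earlier entries win.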

File Format Examples

.hallucination-detectorrc.cjs (most flexible):

module.exports = {
  categories: {
    speculation_language: {
      enabled: true,
      severity: 'warning',      // 'error' | 'warning' | 'info'
      customPatterns: [
        { pattern: /\bperhaps\b/i, evidence: 'perhaps' },
        { pattern: /\bmight be\b/i, evidence: 'might be' },
      ],
      responseTemplate: 'STOP: Speculation detected — "${evidence}" at position ${offset}. Restate using observed facts.',
    },
    pseudo_quantification: {
      enabled: false,  // disable entire category
    },
  },
};

pyproject.toml:

[tool.hallucination-detector]
severity = "warning"

[tool.hallucination-detector.categories.speculation_language]
enabled = true
severity = "error"
custom-patterns = [
  { pattern = "\\bperhaps\\b", evidence = "perhaps" },
]

[tool.hallucination-detector.categories.pseudo_quantification]
enabled = false

project.json:

{
  "hallucination-detector": {
    "severity": "warning",
    "categories": {
      "speculation_language": {
        "enabled": true,
        "customPatterns": [
          { "pattern": "\\bperhaps\\b", "evidence": "perhaps" }
        ]
      }
    }
  }
}

Configurable Settings

Global Settings

  • severity — default severity for all categories ('error' | 'warning' | 'info'). Default: 'error'
  • maxTriggersPerResponse — max trigger matches before stopping analysis (performance guard). Default: 20
  • maxBlocksPerSession — cumulative blocks before escalating or suppressing. Default: unlimited
  • outputFormat — response format ('text' | 'json' | 'jsonl'). Default: 'text'
  • debug — enable verbose logging to stderr. Default: false
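
As a rough illustration, the global defaults and a basic check for them might look like the sketch below. The names DEFAULT_GLOBAL_SETTINGS and validateGlobalSettings are hypothetical, and encoding "unlimited" as Infinity is an assumption (it would not survive a JSON round trip).

```javascript
const SEVERITIES = ['error', 'warning', 'info'];
const OUTPUT_FORMATS = ['text', 'json', 'jsonl'];

// Defaults copied from the "Default:" values in the list above.
const DEFAULT_GLOBAL_SETTINGS = Object.freeze({
  severity: 'error',
  maxTriggersPerResponse: 20,
  maxBlocksPerSession: Infinity, // "unlimited"; the real encoding is an assumption
  outputFormat: 'text',
  debug: false,
});

// Hypothetical validator: returns a list of human-readable problems (empty when valid).
function validateGlobalSettings(settings) {
  const errors = [];
  if (!SEVERITIES.includes(settings.severity)) {
    errors.push(`severity must be one of ${SEVERITIES.join(', ')}; got ${JSON.stringify(settings.severity)}`);
  }
  if (!OUTPUT_FORMATS.includes(settings.outputFormat)) {
    errors.push(`outputFormat must be one of ${OUTPUT_FORMATS.join(', ')}; got ${JSON.stringify(settings.outputFormat)}`);
  }
  if (!Number.isInteger(settings.maxTriggersPerResponse) || settings.maxTriggersPerResponse < 1) {
    errors.push('maxTriggersPerResponse must be a positive integer');
  }
  if (typeof settings.debug !== 'boolean') {
    errors.push('debug must be a boolean');
  }
  return errors;
}
```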

Per-Category Settings

  • enabled — enable/disable individual categories. Default: true for all
  • severity — override global severity per category
  • customPatterns — array of { pattern, evidence } objects to add to built-in patterns
  • replacePatterns — if true, customPatterns replaces built-in patterns instead of extending. Default: false
  • responseTemplate — custom response template string with ${evidence}, ${offset}, ${kind} interpolation
  • suppressionRules — custom suppression rules (e.g., additional evidence markers, context patterns)
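
The extend-versus-replace behavior and the template interpolation could be sketched as follows. Both helper names are hypothetical, and leaving unknown placeholders untouched is an assumption, since the issue does not say what should happen to them.

```javascript
// Hypothetical helper: combine built-in and custom patterns for one category.
// Extends by default; replaces the built-ins when replacePatterns is true.
function resolvePatterns(builtInPatterns, categoryConfig = {}) {
  const custom = categoryConfig.customPatterns ?? [];
  return categoryConfig.replacePatterns
    ? [...custom]
    : [...builtInPatterns, ...custom];
}

// Hypothetical helper: fill ${name} placeholders in a response template.
// Assumption: placeholders with no matching value are left as written.
function renderTemplate(template, values) {
  return template.replace(/\$\{(\w+)\}/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(values, name) ? String(values[name]) : match);
}
```

String patterns coming from JSON or TOML would still need compiling with new RegExp(); which flags to apply in that case is not specified here.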

Filtering Settings

  • ignorePatterns — glob patterns for file paths to skip when scanning (e.g., ['**/test/**', '*.spec.*'])
  • ignoreBlocks — additional code block markers to strip (beyond markdown fences)
  • evidenceMarkers — custom strings that count as "evidence nearby" for causality suppression (e.g., ['according to', 'as shown in'])
  • allowlist — phrases that should never trigger (exact match or regex)
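
One possible reading of the allowlist rule ("exact match or regex") is sketched below. The helper name isAllowlisted is hypothetical, and treating string entries as case-sensitive exact matches is an assumption.

```javascript
// Hypothetical helper: true when the matched text is covered by an allowlist entry.
// String entries must equal the matched text exactly; RegExp entries are tested against it.
function isAllowlisted(matchedText, allowlist = []) {
  return allowlist.some((entry) => {
    if (entry instanceof RegExp) {
      entry.lastIndex = 0; // avoid stale state on regexes with the g or y flag
      return entry.test(matchedText);
    }
    return entry === matchedText;
  });
}
```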

Response Settings

  • responseTemplates — override all response templates per category
  • includeContext — include surrounding text in trigger reports. Default: true
  • contextLines — number of surrounding lines to include. Default: 2
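
For includeContext and contextLines, a minimal sketch of extracting the surrounding text is shown below. The helper name extractContext is hypothetical, and interpreting contextLines as "lines on each side of the trigger" is an assumption.

```javascript
// Hypothetical helper: return the line containing `offset` plus
// `contextLines` lines before and after it, clamped to the text bounds.
function extractContext(text, offset, contextLines = 2) {
  const lines = text.split('\n');
  // The line index of the offset equals the number of newlines before it.
  const lineIndex = text.slice(0, offset).split('\n').length - 1;
  const start = Math.max(0, lineIndex - contextLines);
  const end = Math.min(lines.length, lineIndex + contextLines + 1);
  return lines.slice(start, end).join('\n');
}
```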

Implementation Notes

  • Config loading should happen once at startup, not per-invocation
  • Use require() for .cjs files (already CJS environment)
  • Parse TOML with a lightweight parser (no heavy dependencies — consider bundling a minimal parser or requiring @iarna/toml as optional peer dep)
  • JSON parsing is built-in
  • Deep merge configs from lowest to highest priority, so higher-priority sources override lower ones
  • Validate config schema and emit clear error messages for invalid config
  • Export loadConfig() and mergeConfig() for testability
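
A minimal sketch of what mergeConfig() could look like is given below. Only the name mergeConfig comes from this issue; the signature, the helpers, and the choice to replace arrays wholesale instead of concatenating them are assumptions.

```javascript
// True for object literals and null-prototype objects; false for arrays, RegExp, etc.
function isPlainObject(value) {
  if (value === null || typeof value !== 'object') return false;
  const proto = Object.getPrototypeOf(value);
  return proto === Object.prototype || proto === null;
}

// Recursively merge `override` into a copy of `base`.
// Assumption: arrays and other non-plain values are replaced, not combined.
function deepMerge(base, override) {
  const result = { ...base };
  for (const [key, value] of Object.entries(override)) {
    if (key === '__proto__') continue; // guard against prototype pollution from parsed JSON
    result[key] = isPlainObject(value) && isPlainObject(result[key])
      ? deepMerge(result[key], value)
      : value;
  }
  return result;
}

// Assumed signature: configs are passed lowest priority first, so later arguments win.
function mergeConfig(...configs) {
  return configs.reduce((merged, config) => deepMerge(merged, config), {});
}
```

With this shape, a call such as mergeConfig(defaults, userConfig, projectConfig, envConfig) would give the env-var config the final say, and none of the inputs is mutated.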

Acceptance Criteria

  • Zero-config still works identically to current behavior
  • .hallucination-detectorrc.cjs loaded from project root
  • pyproject.toml [tool.hallucination-detector] section parsed
  • project.json hallucination-detector key parsed
  • ~/.hallucination-detectorrc.cjs loaded as user defaults
  • HALLUCINATION_DETECTOR_CONFIG env var overrides all file sources
  • Custom patterns extend built-in patterns by default
  • replacePatterns: true replaces instead of extending
  • Per-category enable/disable works
  • Per-category severity override works
  • Custom response templates with variable interpolation
  • Allowlist prevents specific phrases from triggering
  • Config validation with clear error messages
  • Tests cover config loading, merging, and override behavior
  • Documentation updated with config file reference

Labels

  • [enhancement] New feature or request
  • [impact: infrastructure] Adds new subsystem (config, MCP, state). Medium risk.
  • [phase: 2-config-foundation] Phase 2: Config system (unlocks phases 4-8)
  • [risk: medium] Medium risk: new subsystem or structural change
  • [topic: config] Configuration system
