Skip to content

feat(hashline): Phase 1 — Core algorithm module (hash function + apply engine) #210

@randomm

Description

@randomm

Context

Part of the hashline EPIC (#209). This is Phase 1 — the foundation that Phases 2 and 3 depend on. You are building the pure algorithmic core with no tool wiring yet.

Read the EPIC (#209) first for the full picture.

What You Are Building

A single new file: packages/opencode/src/tool/hashline.ts

This module owns:

  1. Line hash computation — turns a line of text into a single CJK character hash
  2. Apply engine — takes file content + a list of edits, validates all anchors, applies them bottom-up
  3. Error types — structured errors the edit tool will surface to the model

No tool definition in this file. No imports from tool.ts. Pure logic only.

The Hashing Algorithm

// Normalize a line before hashing.
// Intentionally collapses all whitespace (including indentation) so re-indented
// lines produce the same hash. Do NOT use this form for display — only for hashing.
export function normalizeLine(line: string): string {
  return line.replace(/\r$/, "").replace(/\s+/g, " ").trim()
}

// Hash a single line to a CJK unicode character.
// Output: a single CJK character in range U+4E00–U+9FFF, e.g. "丐", "丑", "丒"
// Uses Bun's built-in xxHash32 — no npm package needed. Synchronous.
export function hashLine(line: string): string {
  const normalized = normalizeLine(line)
  const hash32 = Bun.hash.xxHash32(normalized, 0)
  return String.fromCharCode(0x4E00 + (hash32 % 20992))
}

Why CJK? 20,992 possible values vs 256 for 2-char hex — dramatically fewer collisions. CJK characters are visually distinct from ASCII so anchors stand out unambiguously in model output.

The Edit Operations

// A parsed line anchor: line number + CJK hash character
export type Anchor = { line: number; hashChar: string }

// Parse "14丐" → { line: 14, hashChar: "丐" }
// Throws descriptively if format is invalid
export function parseAnchor(ref: string): Anchor {
  const match = ref.match(/^(\d+)([\u4E00-\u9FFF])$/)
  if (!match) throw new Error(`Invalid anchor: "${ref}" — expected a line number followed by a single CJK character, e.g. "14丐"`)
  return { line: parseInt(match[1], 10), hashChar: match[2] }
}

export type HashlineEdit =
  | { op: "set_line"; anchor: string; new_text: string }
  | { op: "replace_lines"; start_anchor: string; end_anchor: string; new_text: string }
  | { op: "insert_after"; anchor: string; text: string }

The Apply Engine

// Apply a list of hashline edits to file content.
// Returns the new file content as a string.
// Synchronous — Bun.hash.xxHash32 is synchronous.
// Throws HashlineMismatchError if any anchor hash doesn't match.
// Throws HashlineNoOpError if edits produce no change.
export function applyHashlineEdits(content: string, edits: HashlineEdit[]): string

Implementation steps (must follow this exact order):

Step 1 — Split and build the line map

const lines = content.split("\n")
// NOTE: "a\nb\n".split("\n") → ["a", "b", ""] — a trailing empty string appears
// for files ending with \n. This is correct. lines.join("\n") restores the original.
// Build 1-indexed map: lineNumber → CJK hash char
const lineMap = new Map<number, string>()
lines.forEach((line, i) => lineMap.set(i + 1, hashLine(line)))

Also build a snapshot of currentLines immediately (used in error reporting):

const currentLines = lines.map((line, i) => `${i + 1}${hashLine(line)}${line}`).join("\n")

Step 2 — Validate ALL anchors before mutating anything

For each edit, for each anchor ref in that edit, call parseAnchor() then:

  • lineMap.get(line) === hashCharmatch, ok
  • No match but hash found at exactly one other line → relocated, record new line number
  • No match but hash found at multiple other lines → ambiguous, treat as mismatch
  • No match, hash found nowhere → mismatch, collect it

If ANY mismatches or ambiguous relocations exist, throw HashlineMismatchError. Never partially apply.

Step 3 — Resolve relocations

Update each relocated edit's effective line number to the new location before applying.

Step 4 — Sort edits bottom-up

Sort by resolved target line number, descending, so splicing line N doesn't invalidate higher lines:

function sortKey(edit: HashlineEdit): number {
  if (edit.op === "replace_lines") return parseAnchor(edit.end_anchor).line
  if (edit.op === "insert_after") return parseAnchor(edit.anchor).line
  return parseAnchor(edit.anchor).line  // set_line
}
edits.sort((a, b) => sortKey(b) - sortKey(a))

Note: replace_lines sorts by end_anchor (highest line it touches). Overlapping edits are not validated — behaviour is undefined if two edits target the same line.

Step 5 — Apply edits

// set_line
if (edit.new_text === "") {
  lines.splice(anchor.line - 1, 1)          // truly delete the line
} else {
  lines[anchor.line - 1] = edit.new_text    // replace content
}

// replace_lines
const replacement = edit.new_text === "" ? [] : edit.new_text.split("\n")
// "" → [] deletes the range; non-empty splits into replacement lines
lines.splice(start.line - 1, end.line - start.line + 1, ...replacement)

// insert_after
const inserted = edit.text.split("\n")
lines.splice(anchor.line, 0, ...inserted)   // insert after anchor.line (0-indexed: anchor.line, not -1)

Step 6 — Check for no-op

const result = lines.join("\n")
if (result === content) throw new HashlineNoOpError("edits produced identical content")
return result

Error Types

export class HashlineMismatchError extends Error {
  constructor(
    public readonly mismatches: Array<{
      expected: Anchor      // what the model sent
      actual: string        // the CJK char actually at that line number
      lineContent: string   // current content of that line
    }>,
    public readonly currentLines: string  // full file as "LINENUM꜀CONTENT\n..." (same format as hashline_read)
  ) {
    // Build human-readable message using → prefix for mismatched line numbers
    // Example:
    // "2 line(s) have changed since last read. Use the updated LINE꜀ references shown below (→ marks changed lines):\n\n1丐function hello() {\n→2丑  return 42;\n3丒}"
    super(/* your message here */)
  }
}

export class HashlineNoOpError extends Error {
  constructor(public readonly detail: string) {
    super(`No changes made. ${detail}`)
  }
}

Tests

Create file: packages/opencode/test/tool/hashline.test.ts

Pattern (pure logic — no Instance.provide needed):

import { describe, expect, test } from "bun:test"
import { normalizeLine, hashLine, parseAnchor, applyHashlineEdits, HashlineMismatchError, HashlineNoOpError } from "../../src/tool/hashline"

Required test cases (18 total):

  1. normalizeLine — strips trailing \r
  2. normalizeLine — collapses multiple internal spaces to single space
  3. normalizeLine — strips leading and trailing whitespace
  4. hashLine — returns a single character with charCodeAt(0) in range [0x4E00, 0x9FFF]
  5. hashLine — is stable: same input always produces same output
  6. parseAnchor — parses valid anchor "14丐"{ line: 14, hashChar: "丐" }
  7. parseAnchor — throws on colon-format "14:a3" (wrong format)
  8. parseAnchor — throws on string with no CJK char
  9. applyHashlineEditsset_line replaces the correct line
  10. applyHashlineEditsset_line with new_text: "" removes the line entirely (line count decreases by 1)
  11. applyHashlineEditsreplace_lines replaces a contiguous range correctly
  12. applyHashlineEditsreplace_lines with new_text: "" deletes the range entirely (not replace with blank line)
  13. applyHashlineEditsinsert_after inserts at the correct position
  14. applyHashlineEdits — two edits on different lines both apply correctly (bottom-up ordering)
  15. applyHashlineEdits — throws HashlineMismatchError when hash doesn't match
  16. applyHashlineEdits — relocates a line when hash found at different line number
  17. applyHashlineEdits — throws HashlineMismatchError (not relocates) when hash is ambiguous (multiple matches)
  18. applyHashlineEdits — throws HashlineNoOpError when edits produce identical content
  19. applyHashlineEdits — atomicity: no mutation when validation fails
  20. applyHashlineEdits — trailing newline preserved: editing a file ending with \n produces output also ending with \n

File Conventions

  • Prefer const over let; use ternary/early returns instead of reassignment
  • Avoid else — prefer early returns
  • No // TODO, // FIXME, as any, @ts-ignore
  • Use type inference — avoid explicit annotations unless needed for exports
  • Max 500 lines per file

Acceptance Criteria

  • packages/opencode/src/tool/hashline.ts created
  • Exports: hashLine, normalizeLine, parseAnchor, applyHashlineEdits, HashlineMismatchError, HashlineNoOpError, HashlineEdit, Anchor
  • All 20 tests pass: bun test test/tool/hashline.test.ts
  • bun run typecheck passes with 0 errors
  • No as any, @ts-ignore, or type suppressions
  • File is under 500 lines

Quality Gates (Non-Negotiable)

  • TDD: Write failing tests first, then implement
  • Coverage: Every exported function has tests
  • Local verification: bun run typecheck && bun test pass before completion

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions