JSLEEKR/mocklm

mocklm

LLM response mocking for tests — streaming SSE, tool_use blocks, fuzzy matching.

Why This Exists

Every AI agent framework needs tests. But calling real LLM APIs in CI is:

  • Slow — 1-5 seconds per call
  • Expensive — $0.01-0.10 per test run adds up fast
  • Non-deterministic — Same prompt, different response every time
  • Fragile — API rate limits, outages, model deprecations

Generic HTTP mocking tools (msw, nock) don't understand LLM protocol semantics — streaming SSE chunks, tool_use content blocks, multi-turn conversation state, finish reasons, or token counting. You end up writing 100+ lines of boilerplate per mock response.

mocklm gives you one-line mock definitions that speak LLM natively.

// Instead of 50 lines of raw HTTP mock setup...
mock.onMessage('What files exist?').respondWith({
  toolUse: { name: 'list_files', input: { path: '.' } },
  text: 'Here are the files in the current directory.',
});

Features

  • Provider-aware — Native Anthropic Messages API + OpenAI Chat Completions formats
  • Streaming SSE — Proper event types (message_start, content_block_delta, chat.completion.chunk)
  • Tool-use mocking — tool_use, tool_result, and tool_calls blocks with auto-generated IDs
  • Fuzzy matching — Exact, substring (case-insensitive), regex with capture groups
  • Record/Replay — Capture real API responses, replay deterministically
  • Fixture management — Compressed gzip storage with versioning and diffing
  • Assertions — toHaveReceivedMessage, toHaveRespondedWithTool, assertAllMatched
  • Test runner integration — Auto beforeEach/afterEach cleanup for Vitest and Jest
  • Zero dependencies — Only dev dependencies (vitest, typescript)

Installation

npm install --save-dev mocklm

Quick Start

Anthropic Mock

import { MockLM } from 'mocklm';

// Create a mock for Anthropic
const mock = new MockLM('anthropic');

// Register mock rules
mock.onMessage('hello').respondWithText('Hi there!');

mock.onMessage(/search for (.+)/).respondWith({
  toolUse: { name: 'web_search', input: { query: 'cats' } },
});

// Process a request (simulates what happens when your code calls the API)
const result = mock.processRequest(JSON.stringify({
  model: 'claude-sonnet-4-20250514',
  messages: [{ role: 'user', content: 'hello world' }],
}));

// result.body contains a proper Anthropic Messages API response
const response = JSON.parse(result.body);
console.log(response.content[0].text); // "Hi there!"

OpenAI Mock

import { MockLM } from 'mocklm';

const mock = new MockLM('openai');

mock.onMessage('explain').respondWithText('Here is the explanation...');

const result = mock.processRequest(JSON.stringify({
  model: 'gpt-4',
  messages: [{ role: 'user', content: 'explain quantum computing' }],
}));

const response = JSON.parse(result.body);
console.log(response.choices[0].message.content); // "Here is the explanation..."

API Reference

MockLM

The main class for creating and managing LLM mocks.

import { MockLM } from 'mocklm';

const mock = new MockLM('anthropic'); // or 'openai'

mock.onMessage(pattern: string | RegExp): RuleBuilder

Register a mock rule that matches requests containing the given pattern.

  • String — Case-insensitive substring match
  • RegExp — Full regex matching with capture groups
mock.onMessage('hello').respondWithText('Hi!');
mock.onMessage(/^search for (.+)/).respondWithText('Found it!');

mock.onAnyMessage(): RuleBuilder

Register a catch-all rule (lower priority than specific matches).

mock.onAnyMessage().respondWithText('Default response');

mock.onConversation(): ConversationBuilder

Build multi-turn conversation mocks.

mock.onConversation()
  .userSays('Plan a trip')
  .assistantResponds('Where would you like to go?')
  .userSays('Paris')
  .assistantResponds({
    text: 'Great choice!',
    toolUse: { name: 'search_flights', input: { dest: 'CDG' } },
  });

mock.processRequest(body: string): Result | null

Process a JSON request body and return the mock response.

const result = mock.processRequest(JSON.stringify({
  model: 'claude-sonnet-4-20250514',
  messages: [{ role: 'user', content: 'hello' }],
}));

// result.statusCode — 200 or 404
// result.headers — Content-Type headers
// result.body — JSON or SSE response body

mock.install() / mock.uninstall()

Install or uninstall the HTTP interceptor, which captures http.request and https.request calls so your code's API calls are answered by the mock.

mock.install();
// Your code makes API calls normally — mocklm intercepts them
mock.uninstall();

mock.reset()

Clear all rules, request logs, and state.

mock.getRules() / mock.getRequestLogs()

Access registered rules and request history.


RuleBuilder

Fluent builder returned by onMessage().

.respondWithText(text: string): MockLM

Respond with a simple text message.

.respondWith(def: MockResponseDef): MockLM

Respond with a full response definition.

mock.onMessage('find').respondWith({
  text: 'Let me search for that',
  toolUse: { name: 'search', input: { query: 'test' } },
  stopReason: 'tool_use',
  usage: { inputTokens: 50, outputTokens: 20 },
});

.respondWithStream(def: StreamResponseDef): MockLM

Respond with a streaming SSE response.

// Auto-chunked text
mock.onMessage('explain').respondWithStream({
  text: 'Here is the explanation...',
  chunkSize: 5,
});

// Manual chunks
mock.onMessage('greet').respondWithStream({
  chunks: ['Hello', ' there', '!'],
  delayMs: 50,
});

// Tool use in stream
mock.onMessage('search').respondWithStream({
  text: 'Searching...',
  toolUse: { name: 'search', input: { q: 'test' } },
});
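Under the hood, chunkSize-style auto-chunking is just fixed-width slicing of the text. A minimal standalone sketch of that behavior (illustrative only, not mocklm's actual chunker, which may split differently):

```typescript
// Hypothetical sketch of fixed-size text chunking, as suggested by the
// `chunkSize` option above. Not mocklm's actual implementation.
function chunkText(text: string, chunkSize: number): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += chunkSize) {
    chunks.push(text.slice(i, i + chunkSize));
  }
  return chunks;
}

// chunkText('Hello there!', 5) yields ['Hello', ' ther', 'e!']
```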

.times(n: number) / .once()

Limit how many times a rule can match.

mock.onMessage('hello').once().respondWithText('First time only');

.priority(p: number)

Set matching priority (higher = matched first).
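One way priority-based selection can work (a sketch under the assumption that higher numbers win and ties fall back to registration order; the `Rule` shape here is hypothetical):

```typescript
// Illustrative sketch of priority-based rule selection.
// The `Rule` interface is a simplified stand-in, not mocklm's internal type.
interface Rule {
  pattern: string | RegExp;
  priority: number;
  response: string;
}

function selectRule(message: string, rules: Rule[]): Rule | null {
  const matches = rules.filter((r) =>
    typeof r.pattern === 'string'
      ? message.toLowerCase().includes(r.pattern.toLowerCase())
      : r.pattern.test(message)
  );
  if (matches.length === 0) return null;
  // Higher priority wins; sort() is stable, so ties keep registration order.
  matches.sort((a, b) => b.priority - a.priority);
  return matches[0];
}
```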


MockResponseDef

interface MockResponseDef {
  text?: string;                    // Text content
  toolUse?: ToolUseBlock | ToolUseBlock[];  // Tool use blocks
  toolResult?: ToolResultBlock;     // Tool result block
  stopReason?: string;              // 'end_turn', 'tool_use', 'max_tokens'
  model?: string;                   // Override response model
  usage?: { inputTokens: number; outputTokens: number };
}

interface ToolUseBlock {
  name: string;
  input: Record<string, unknown>;
  id?: string;  // Auto-generated if not provided
}
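The auto-generated IDs in the response shapes below (e.g. "msg_000001", "toolu_000002") look like a zero-padded counter. A sketch of one such scheme (the exact format mocklm uses is an assumption):

```typescript
// Illustrative ID generator. The zero-padded counter format matches the
// example response shapes ("msg_000001", "toolu_000002"), but the exact
// scheme is an assumption, not mocklm's documented behavior.
let counter = 0;

function nextId(prefix: string): string {
  counter += 1;
  return `${prefix}_${String(counter).padStart(6, '0')}`;
}

// nextId('msg')   → "msg_000001"
// nextId('toolu') → "toolu_000002"
```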

Assertions

// Check received messages
mock.assertions.toHaveReceivedMessage('hello');
mock.assertions.toHaveReceivedNMessages(3);

// Check responses
mock.assertions.toHaveRespondedWithTool('web_search');
mock.assertions.toHaveRequestedModel('claude-sonnet-4-20250514');

// Verify all mocks were used
mock.assertAllMatched();

// Get raw data
const messages = mock.assertions.receivedMessages();
const tools = mock.assertions.respondedTools();

Record/Replay

Recording

import { Recorder } from 'mocklm';

const recorder = new Recorder({
  provider: 'anthropic',
  outputDir: './fixtures',
  compress: true,  // gzip compression (default)
});

recorder.start();

// Record a call
recorder.record({
  provider: 'anthropic',
  request: { model: 'claude-sonnet-4-20250514', messages: [...] },
  response: actualApiResponse,
  stream: false,
});

const filePath = recorder.stop(); // Saves fixtures

Replaying

import { Replayer, createStore, addEntry } from 'mocklm';

const replayer = new Replayer({
  fixtureDir: './fixtures',
  strict: true,  // Throw on unmatched requests
});

replayer.load('fixtures_anthropic.json.gz');

// Sequential replay
const entry = replayer.next();

// Pattern-based replay
const match = replayer.findMatch('claude-sonnet-4-20250514', 'hello');

// Coverage stats
console.log(replayer.stats()); // { total: 5, used: 2, remaining: 3 }

Fixture Store

import { createStore, addEntry, saveStore, loadStore, diffStores } from 'mocklm';

// Create and populate
let store = createStore('anthropic');
store = addEntry(store, {
  provider: 'anthropic',
  request: { model: 'claude-sonnet-4-20250514', messages: [...], stream: false },
  response: responseData,
  timestamp: Date.now(),
});

// Save (auto-compresses .gz files)
saveStore(store, './fixtures/data.json.gz');

// Load
const loaded = loadStore('./fixtures/data.json.gz');

// Diff two stores
const diff = diffStores(oldStore, newStore);
console.log(`Added: ${diff.added.length}, Removed: ${diff.removed.length}`);

Fuzzy Matching

import { createPattern, exactPattern, matchText, similarity } from 'mocklm';

// Substring match (case-insensitive)
matchText('Hello World', createPattern('hello')); // true

// Exact match
matchText('Hello', exactPattern('Hello')); // true
matchText('Hello World', exactPattern('Hello')); // false

// Regex match
matchText('search for cats', createPattern(/search for (.+)/)); // true

// Similarity score (Dice coefficient)
similarity('hello world', 'hello worlds'); // ~0.9
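A bigram Dice coefficient is one common way to implement a similarity score like this; the sketch below is illustrative, and mocklm's exact tokenization may differ:

```typescript
// Sketch of a bigram Dice coefficient: 2 * |shared bigrams| / (total bigrams).
// Illustrative only; mocklm's `similarity` may tokenize differently.
function diceSimilarity(a: string, b: string): number {
  const bigrams = (s: string): Map<string, number> => {
    const m = new Map<string, number>();
    for (let i = 0; i < s.length - 1; i++) {
      const bg = s.slice(i, i + 2);
      m.set(bg, (m.get(bg) ?? 0) + 1);
    }
    return m;
  };
  const aBigrams = bigrams(a);
  const bBigrams = bigrams(b);
  let overlap = 0;
  for (const [bg, count] of aBigrams) {
    overlap += Math.min(count, bBigrams.get(bg) ?? 0);
  }
  const total = (a.length - 1) + (b.length - 1);
  return total > 0 ? (2 * overlap) / total : 1;
}
```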

SSE Generation

import { generateAnthropicSSE, generateOpenAISSE, serializeAllSSE } from 'mocklm';

// Generate Anthropic SSE events
const events = generateAnthropicSSE({
  text: 'Hello!',
  chunkSize: 5,
});
// Returns: message_start, content_block_start, content_block_delta(s),
//          content_block_stop, message_delta, message_stop

// Generate OpenAI SSE events
const openaiEvents = generateOpenAISSE({
  text: 'Hello!',
  toolUse: { name: 'search', input: { q: 'test' } },
});

// Serialize to wire format
const body = serializeAllSSE(events, 'anthropic');
// "event: message_start\ndata: {...}\n\n..."
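The wire format differs by provider: Anthropic streams use named `event:` lines before each `data:` line, while OpenAI chunks are bare `data:` lines terminated by a `data: [DONE]` sentinel. A minimal serialization sketch (not mocklm's actual implementation; the `SSEEvent` shape is a simplification):

```typescript
// Minimal sketch of SSE wire serialization. Anthropic uses named `event:`
// lines; OpenAI uses plain `data:` lines ending with `data: [DONE]`.
// The SSEEvent interface here is a simplified stand-in.
interface SSEEvent {
  event?: string;
  data: unknown;
}

function serializeSSE(events: SSEEvent[], provider: 'anthropic' | 'openai'): string {
  const frames = events.map((e) => {
    const data = `data: ${JSON.stringify(e.data)}\n\n`;
    return provider === 'anthropic' && e.event ? `event: ${e.event}\n${data}` : data;
  });
  if (provider === 'openai') frames.push('data: [DONE]\n\n');
  return frames.join('');
}
```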

Test Runner Integration

import { setupMockLM } from 'mocklm/testing';

describe('my agent', () => {
  const mock = setupMockLM({ provider: 'anthropic' });
  // Auto beforeEach: install()
  // Auto afterEach: uninstall() + reset()

  it('searches when asked', () => {
    mock.onMessage(/find/).respondWith({
      toolUse: { name: 'search', input: { q: 'test' } },
    });

    // ... your test code calls the LLM API ...
    // mocklm intercepts and returns the mock response

    mock.assertions.toHaveRespondedWithTool('search');
  });
});

Provider Format Reference

Anthropic Response Shape

{
  "id": "msg_000001",
  "type": "message",
  "role": "assistant",
  "content": [
    { "type": "text", "text": "Hello!" },
    { "type": "tool_use", "id": "toolu_000002", "name": "search", "input": { "q": "test" } }
  ],
  "model": "claude-sonnet-4-20250514",
  "stop_reason": "tool_use",
  "usage": { "input_tokens": 25, "output_tokens": 10 }
}

OpenAI Response Shape

{
  "id": "chatcmpl_000001",
  "object": "chat.completion",
  "model": "gpt-4",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello!",
      "tool_calls": [{
        "id": "call_000002",
        "type": "function",
        "function": { "name": "search", "arguments": "{\"q\":\"test\"}" }
      }]
    },
    "finish_reason": "tool_calls"
  }],
  "usage": { "prompt_tokens": 25, "completion_tokens": 10, "total_tokens": 35 }
}
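Note the encoding difference between the two shapes: Anthropic's tool `input` is a plain JSON object, while OpenAI's `function.arguments` is a JSON-encoded string that consuming code must parse:

```typescript
// OpenAI tool-call arguments arrive as a JSON string and must be parsed;
// Anthropic's `input` is already an object. Shape taken from the mock
// response example above.
const response = {
  choices: [{
    message: {
      tool_calls: [{
        id: 'call_000002',
        type: 'function',
        function: { name: 'search', arguments: '{"q":"test"}' },
      }],
    },
  }],
};

const call = response.choices[0].message.tool_calls[0];
const args = JSON.parse(call.function.arguments);
// args.q === 'test'
```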

Architecture

src/
  index.ts                 Public API exports
  mock-server.ts           HTTP interceptor + MockLM class
  providers/
    types.ts               Shared type definitions
    anthropic.ts           Anthropic Messages API adapter
    openai.ts              OpenAI Chat Completions adapter
  matching/
    matcher.ts             Request matching engine
    fuzzy.ts               Fuzzy/regex/exact matching
  streaming/
    sse.ts                 SSE event generation
    chunker.ts             Text chunking for streams
  recording/
    recorder.ts            Record API calls to fixtures
    replayer.ts            Replay from fixtures
  fixtures/
    store.ts               Compressed fixture storage
  assertions/
    matchers.ts            Custom test assertions
  testing/
    setup.ts               Vitest/Jest integration

Development

# Install
npm install

# Run tests
npm test

# Type check
npx tsc --noEmit

# Watch mode
npm run test:watch

License

MIT -- JSLEEKR 2026
