# mocklm

LLM response mocking for tests — streaming SSE, `tool_use` blocks, fuzzy matching.

## Why

Every AI agent framework needs tests. But calling real LLM APIs in CI is:
- Slow — 1-5 seconds per call
- Expensive — $0.01-0.10 per test run adds up fast
- Non-deterministic — Same prompt, different response every time
- Fragile — API rate limits, outages, model deprecations
Generic HTTP mocking tools (msw, nock) don't understand LLM protocol semantics — streaming SSE chunks, `tool_use` content blocks, multi-turn conversation state, finish reasons, or token counting. You end up writing 100+ lines of boilerplate per mock response.
mocklm gives you one-line mock definitions that speak the LLM protocols natively.
```ts
// Instead of 50 lines of raw HTTP mock setup...
mock.onMessage('What files exist?').respondWith({
  toolUse: { name: 'list_files', input: { path: '.' } },
  text: 'Here are the files in the current directory.',
});
```

## Features

- Provider-aware — Native Anthropic Messages API + OpenAI Chat Completions formats
- Streaming SSE — Proper event types (`message_start`, `content_block_delta`, `chat.completion.chunk`)
- Tool-use mocking — `tool_use`, `tool_result`, `tool_calls` with auto-generated IDs
- Fuzzy matching — Exact, substring (case-insensitive), regex with capture groups
- Record/Replay — Capture real API responses, replay deterministically
- Fixture management — Compressed gzip storage with versioning and diffing
- Assertions — `toHaveReceivedMessage`, `toHaveRespondedWithTool`, `assertAllMatched`
- Test runner integration — Auto `beforeEach`/`afterEach` cleanup for Vitest and Jest
- Zero dependencies — Only dev dependencies (vitest, typescript)
## Installation

```bash
npm install --save-dev mocklm
```

## Quick start

### Anthropic

```ts
import { MockLM } from 'mocklm';

// Create a mock for Anthropic
const mock = new MockLM('anthropic');

// Register mock rules
mock.onMessage('hello').respondWithText('Hi there!');
mock.onMessage(/search for (.+)/).respondWith({
  toolUse: { name: 'web_search', input: { query: 'cats' } },
});

// Process a request (simulates what happens when your code calls the API)
const result = mock.processRequest(JSON.stringify({
  model: 'claude-sonnet-4-20250514',
  messages: [{ role: 'user', content: 'hello world' }],
}));

// result.body contains a proper Anthropic Messages API response
const response = JSON.parse(result.body);
console.log(response.content[0].text); // "Hi there!"
```
### OpenAI

```ts
import { MockLM } from 'mocklm';

const mock = new MockLM('openai');
mock.onMessage('explain').respondWithText('Here is the explanation...');

const result = mock.processRequest(JSON.stringify({
  model: 'gpt-4',
  messages: [{ role: 'user', content: 'explain quantum computing' }],
}));

const response = JSON.parse(result.body);
console.log(response.choices[0].message.content); // "Here is the explanation..."
```

## API

### `MockLM`

The main class for creating and managing LLM mocks.
```ts
import { MockLM } from 'mocklm';

const mock = new MockLM('anthropic'); // or 'openai'
```

#### `onMessage(pattern)`

Register a mock rule that matches requests containing the given pattern.
- String — Case-insensitive substring match
- RegExp — Full regex matching with capture groups
```ts
mock.onMessage('hello').respondWithText('Hi!');
mock.onMessage(/^search for (.+)/).respondWithText('Found it!');
```

#### `onAnyMessage()`

Register a catch-all rule (lower priority than specific matches).

```ts
mock.onAnyMessage().respondWithText('Default response');
```

#### `onConversation()`

Build multi-turn conversation mocks.

```ts
mock.onConversation()
  .userSays('Plan a trip')
  .assistantResponds('Where would you like to go?')
  .userSays('Paris')
  .assistantResponds({
    text: 'Great choice!',
    toolUse: { name: 'search_flights', input: { dest: 'CDG' } },
  });
```
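A usage sketch of the conversation above, assuming the builder matches turns against the request's `messages` array (the exact matching rule isn't spelled out here):

```ts
// Sketch: exercising the first turn of the conversation defined above.
// Assumption: the builder keys off the latest user message in the request.
const turn1 = mock.processRequest(JSON.stringify({
  model: 'claude-sonnet-4-20250514',
  messages: [{ role: 'user', content: 'Plan a trip' }],
}));
console.log(JSON.parse(turn1.body).content[0].text);
// "Where would you like to go?"
```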
#### `processRequest()`

Process a JSON request body and return the mock response.

```ts
const result = mock.processRequest(JSON.stringify({
  model: 'claude-sonnet-4-20250514',
  messages: [{ role: 'user', content: 'hello' }],
}));

// result.statusCode — 200 or 404
// result.headers    — Content-Type headers
// result.body       — JSON or SSE response body
```
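When no rule matches, the miss shows up in `result.statusCode` (200 or 404, per the fields above). A minimal guard, assuming nothing about the 404 body shape:

```ts
// Sketch: surfacing an unmatched request early in a test.
const miss = mock.processRequest(JSON.stringify({
  model: 'claude-sonnet-4-20250514',
  messages: [{ role: 'user', content: 'nothing matches this' }],
}));
if (miss.statusCode === 404) {
  throw new Error('No mock rule matched; register one or add onAnyMessage()');
}
```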
#### `install()` / `uninstall()`

Install/uninstall the HTTP interceptor to automatically intercept `http.request` and `https.request` calls.

```ts
mock.install();
// Your code makes API calls normally — mocklm intercepts them
mock.uninstall();
```

#### `reset()`

Clear all rules, request logs, and state.
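Putting the three together, a manual test lifecycle (the wiring that `setupMockLM` in the test runner section automates) might look like:

```ts
import { beforeEach, afterEach } from 'vitest';

// Manual wiring sketch; setupMockLM does this for you.
beforeEach(() => mock.install());
afterEach(() => {
  mock.uninstall();
  mock.reset(); // clear rules, request logs, and state between tests
});
```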
#### Rules and request history

Access registered rules and request history.

### Rule builder

Fluent builder returned by `onMessage()`.

#### `respondWithText()`

Respond with a simple text message.

#### `respondWith()`

Respond with a full response definition.

```ts
mock.onMessage('find').respondWith({
  text: 'Let me search for that',
  toolUse: { name: 'search', input: { query: 'test' } },
  stopReason: 'tool_use',
  usage: { inputTokens: 50, outputTokens: 20 },
});
```

#### `respondWithStream()`

Respond with a streaming SSE response.

```ts
// Auto-chunked text
mock.onMessage('explain').respondWithStream({
  text: 'Here is the explanation...',
  chunkSize: 5,
});

// Manual chunks
mock.onMessage('greet').respondWithStream({
  chunks: ['Hello', ' there', '!'],
  delayMs: 50,
});

// Tool use in stream
mock.onMessage('search').respondWithStream({
  text: 'Searching...',
  toolUse: { name: 'search', input: { q: 'test' } },
});
```

#### `once()`

Limit how many times a rule can match.

```ts
mock.onMessage('hello').once().respondWithText('First time only');
```

#### Priority

Set matching priority (higher = matched first).
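The builder method name for this is not shown above; assuming a hypothetical `priority(n)`, usage could look like:

```ts
// priority() is a hypothetical name; only the behavior (higher = matched
// first) is documented above.
mock.onMessage('hello').priority(10).respondWithText('Checked first');
mock.onAnyMessage().respondWithText('Fallback'); // catch-alls rank below specific rules
```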
## Types

```ts
interface MockResponseDef {
  text?: string;                           // Text content
  toolUse?: ToolUseBlock | ToolUseBlock[]; // Tool use blocks
  toolResult?: ToolResultBlock;            // Tool result block
  stopReason?: string;                     // 'end_turn', 'tool_use', 'max_tokens'
  model?: string;                          // Override response model
  usage?: { inputTokens: number; outputTokens: number };
}

interface ToolUseBlock {
  name: string;
  input: Record<string, unknown>;
  id?: string; // Auto-generated if not provided
}
```
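Because `toolUse` accepts a `ToolUseBlock[]`, one mock response can carry several tool calls; a sketch with illustrative tool names:

```ts
// Multiple tool_use blocks in a single response (ToolUseBlock[] above).
// Tool names and inputs are illustrative.
mock.onMessage('audit the repo').respondWith({
  text: 'Let me inspect the project first.',
  toolUse: [
    { name: 'list_files', input: { path: '.' } },
    { name: 'read_file', input: { path: 'package.json' } },
  ],
  stopReason: 'tool_use',
});
```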
## Assertions

```ts
// Check received messages
mock.assertions.toHaveReceivedMessage('hello');
mock.assertions.toHaveReceivedNMessages(3);

// Check responses
mock.assertions.toHaveRespondedWithTool('web_search');
mock.assertions.toHaveRequestedModel('claude-sonnet-4-20250514');

// Verify all mocks were used
mock.assertAllMatched();

// Get raw data
const messages = mock.assertions.receivedMessages();
const tools = mock.assertions.respondedTools();
```

## Record / Replay

### Recording

```ts
import { Recorder } from 'mocklm';

const recorder = new Recorder({
  provider: 'anthropic',
  outputDir: './fixtures',
  compress: true, // gzip compression (default)
});

recorder.start();

// Record a call
recorder.record({
  provider: 'anthropic',
  request: { model: 'claude-sonnet-4-20250514', messages: [...] },
  response: actualApiResponse,
  stream: false,
});

const filePath = recorder.stop(); // Saves fixtures
```
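A common workflow is to record only on demand and replay everywhere else; a sketch gated on an arbitrary environment variable:

```ts
// Sketch: opt-in recording. RECORD_FIXTURES is an arbitrary env var name.
if (process.env.RECORD_FIXTURES) {
  recorder.start();
  // ... run the suite against the real API ...
  console.log(`Fixtures written to ${recorder.stop()}`);
}
```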
### Replaying

```ts
import { Replayer, createStore, addEntry } from 'mocklm';

const replayer = new Replayer({
  fixtureDir: './fixtures',
  strict: true, // Throw on unmatched requests
});

replayer.load('fixtures_anthropic.json.gz');

// Sequential replay
const entry = replayer.next();

// Pattern-based replay
const match = replayer.findMatch('claude-sonnet-4-20250514', 'hello');

// Coverage stats
console.log(replayer.stats()); // { total: 5, used: 2, remaining: 3 }
```
### Fixture stores

```ts
import { createStore, addEntry, saveStore, loadStore, diffStores } from 'mocklm';

// Create and populate
let store = createStore('anthropic');
store = addEntry(store, {
  provider: 'anthropic',
  request: { model: 'claude-sonnet-4-20250514', messages: [...], stream: false },
  response: responseData,
  timestamp: Date.now(),
});

// Save (auto-compresses .gz files)
saveStore(store, './fixtures/data.json.gz');

// Load
const loaded = loadStore('./fixtures/data.json.gz');

// Diff two stores
const diff = diffStores(oldStore, newStore);
console.log(`Added: ${diff.added.length}, Removed: ${diff.removed.length}`);
```
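One use for `diffStores` is a drift check in CI that fails when recorded fixtures change unexpectedly; a sketch with hypothetical file names:

```ts
import { loadStore, diffStores } from 'mocklm';

// Sketch: fail the build on fixture drift. File names are hypothetical.
const baseline = loadStore('./fixtures/baseline.json.gz');
const current = loadStore('./fixtures/current.json.gz');
const drift = diffStores(baseline, current);
if (drift.added.length > 0 || drift.removed.length > 0) {
  throw new Error(`Fixture drift: +${drift.added.length} / -${drift.removed.length}`);
}
```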
## Matching utilities

```ts
import { createPattern, exactPattern, matchText, similarity } from 'mocklm';

// Substring match (case-insensitive)
matchText('Hello World', createPattern('hello')); // true

// Exact match
matchText('Hello', exactPattern('Hello'));       // true
matchText('Hello World', exactPattern('Hello')); // false

// Regex match
matchText('search for cats', createPattern(/search for (.+)/)); // true

// Similarity score (Dice coefficient)
similarity('hello world', 'hello worlds'); // ~0.9
```
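Since the Dice coefficient lands in [0, 1], `similarity()` can back a hand-rolled fuzzy gate; the 0.8 threshold below is arbitrary, not a mocklm default:

```ts
import { similarity } from 'mocklm';

// Hand-rolled fuzzy gate; the threshold is arbitrary.
function roughlyMatches(received: string, expected: string): boolean {
  return similarity(received, expected) > 0.8;
}

roughlyMatches('hello world', 'hello worlds'); // true (~0.9)
```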
## SSE generation

```ts
import { generateAnthropicSSE, generateOpenAISSE, serializeAllSSE } from 'mocklm';

// Generate Anthropic SSE events
const events = generateAnthropicSSE({
  text: 'Hello!',
  chunkSize: 5,
});
// Returns: message_start, content_block_start, content_block_delta(s),
//          content_block_stop, message_delta, message_stop

// Generate OpenAI SSE events
const openaiEvents = generateOpenAISSE({
  text: 'Hello!',
  toolUse: { name: 'search', input: { q: 'test' } },
});

// Serialize to wire format
const body = serializeAllSSE(events, 'anthropic');
// "event: message_start\ndata: {...}\n\n..."
```
## Test runner integration

```ts
import { setupMockLM } from 'mocklm/testing';

describe('my agent', () => {
  const mock = setupMockLM({ provider: 'anthropic' });
  // Auto beforeEach: install()
  // Auto afterEach:  uninstall() + reset()

  it('searches when asked', () => {
    mock.onMessage(/find/).respondWith({
      toolUse: { name: 'search', input: { q: 'test' } },
    });

    // ... your test code calls the LLM API ...
    // mocklm intercepts and returns the mock response

    mock.assertions.toHaveRespondedWithTool('search');
  });
});
```

## Response formats

### Anthropic

```json
{
  "id": "msg_000001",
  "type": "message",
  "role": "assistant",
  "content": [
    { "type": "text", "text": "Hello!" },
    { "type": "tool_use", "id": "toolu_000002", "name": "search", "input": { "q": "test" } }
  ],
  "model": "claude-sonnet-4-20250514",
  "stop_reason": "tool_use",
  "usage": { "input_tokens": 25, "output_tokens": 10 }
}
```
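`content` is a mixed array, so client code selects blocks by `type`; against the sample above:

```ts
// Picking the tool_use block out of a mixed content array.
const response = JSON.parse(result.body); // result from mock.processRequest(...)
const toolBlock = response.content.find((b: { type: string }) => b.type === 'tool_use');
// toolBlock?.name === 'search'; toolBlock?.input.q === 'test'
```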
"id": "chatcmpl_000001",
"object": "chat.completion",
"model": "gpt-4",
"choices": [{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello!",
"tool_calls": [{
"id": "call_000002",
"type": "function",
"function": { "name": "search", "arguments": "{\"q\":\"test\"}" }
}]
},
"finish_reason": "tool_calls"
}],
"usage": { "prompt_tokens": 25, "completion_tokens": 10, "total_tokens": 35 }
}src/
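One OpenAI-specific detail visible in the sample: `function.arguments` is a JSON string, not an object, so callers parse it:

```ts
// tool_call arguments arrive as a JSON string (see the sample above).
const response = JSON.parse(result.body); // result from mock.processRequest(...)
const call = response.choices[0].message.tool_calls[0];
const args = JSON.parse(call.function.arguments); // { q: 'test' }
```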
## Project structure

```
src/
  index.ts          Public API exports
  mock-server.ts    HTTP interceptor + MockLM class
  providers/
    types.ts        Shared type definitions
    anthropic.ts    Anthropic Messages API adapter
    openai.ts       OpenAI Chat Completions adapter
  matching/
    matcher.ts      Request matching engine
    fuzzy.ts        Fuzzy/regex/exact matching
  streaming/
    sse.ts          SSE event generation
    chunker.ts      Text chunking for streams
  recording/
    recorder.ts     Record API calls to fixtures
    replayer.ts     Replay from fixtures
  fixtures/
    store.ts        Compressed fixture storage
  assertions/
    matchers.ts     Custom test assertions
  testing/
    setup.ts        Vitest/Jest integration
```
## Development

```bash
# Install
npm install

# Run tests
npm test

# Type check
npx tsc --noEmit

# Watch mode
npm run test:watch
```

## License

MIT — JSLEEKR 2026