Show HN: mcp-recorder – VCR.py for MCP servers. Record, replay, verify
caballeto | Friday, March 06, 2026

Hi HN, I'm Vlad. I've been building MCP servers and related tooling for a while now, and I kept hitting a class of bug that no unit test caught: someone on the team renames a tool parameter or tweaks a tool description, all the tests pass, but the AI agent that was calling that tool silently breaks. This happens because the model reads tool descriptions and parameter schemas to decide which tool to call and how, so a renamed parameter or a reworded description isn't just a cosmetic change; it directly affects the model's behavior.
The MCP spec doesn't include tool versioning yet, and there's no static artifact describing what a server exposes. tools/list just returns whatever's in memory at runtime, and there's nothing to commit or diff against, so changes that break downstream workflows slip through without anyone noticing.
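For context, a tools/list response has the shape the MCP spec defines: a list of tools, each with a name, a description, and a JSON Schema for its input. The "search" tool below is a made-up example:

```json
{
  "tools": [
    {
      "name": "search",
      "description": "Search the knowledge base and return matching documents.",
      "inputSchema": {
        "type": "object",
        "properties": {
          "query": { "type": "string", "description": "Full-text search query" }
        },
        "required": ["query"]
      }
    }
  ]
}
```

Rename "query" to "q" or reword the description and every field the model reads has changed, yet nothing in a typical test suite looks at this payload.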
VCR.py solved the same problem for HTTP a long time ago, and I realized the same pattern works here. mcp-recorder captures the full MCP interaction sequence (initialize, tools/list, tools/call) into a JSON cassette file. Because it records complete protocol exchanges rather than just schema snapshots, you're testing actual behavior: if a tool call that used to return a specific format now returns something different, or a capability quietly disappears during the handshake, the cassette catches it. From that single recording you can replay it as a mock server (no API keys, fully deterministic), or verify your changed server against it and catch any diff:
Verifying golden.json against node dist/index.js
1. initialize [PASS]
2. tools/list [PASS]
3. tools/call [search] [FAIL]
$.result.content[0].text: "old output" != "new output"
4. tools/call [analyze] [PASS]
Result: 3/4 passed, 1 failed

Non-zero exit code on any mismatch, so it plugs straight into CI.
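Conceptually, verify replays each recorded request against the live server and diffs the responses, reporting the JSONPath of anything that changed. Here's a minimal sketch of that idea in plain Python; this is not mcp-recorder's actual code, and the function and field names are mine:

```python
import json

def diff_response(path, recorded, actual, mismatches):
    """Recursively compare a recorded JSON value against a live one,
    collecting JSONPath-style locations of any differences."""
    if isinstance(recorded, dict) and isinstance(actual, dict):
        for key in recorded.keys() | actual.keys():
            diff_response(f"{path}.{key}", recorded.get(key), actual.get(key), mismatches)
    elif isinstance(recorded, list) and isinstance(actual, list) and len(recorded) == len(actual):
        for i, (r, a) in enumerate(zip(recorded, actual)):
            diff_response(f"{path}[{i}]", r, a, mismatches)
    elif recorded != actual:
        mismatches.append(f"{path}: {json.dumps(recorded)} != {json.dumps(actual)}")

def verify(cassette_exchanges, send):
    """Replay each recorded (request, response) pair via `send`,
    print pass/fail per step, and return the number of failures."""
    failed = 0
    for i, (request, recorded_response) in enumerate(cassette_exchanges, 1):
        mismatches = []
        diff_response("$", recorded_response, send(request), mismatches)
        status = "PASS" if not mismatches else "FAIL"
        print(f"{i}. {request['method']} [{status}]")
        for m in mismatches:
            print(f"   {m}")
        failed += bool(mismatches)
    return failed  # non-zero means at least one step drifted
```

A real run exits with a non-zero status when this count is non-zero, which is what makes the CI integration work.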
You can try it right now with minimal setup; a public demo server and a scenarios file are included:
pip install mcp-recorder
mcp-recorder record-scenarios scenarios.yml
mcp-recorder verify --cassette cassettes/demo_walkthrough.json \
  --target https://mcp.devhelm.io
It works with both HTTP and stdio transports. Scenarios are defined in YAML so it works with MCP servers in any language, and there's a pytest plugin if you want tighter integration. Secret redaction and environment variable interpolation are built in.
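To give a feel for the scenario-driven recording, here's a sketch of what such a YAML file might look like. The field names below are hypothetical, not the tool's actual schema; the repo README has the real format:

```yaml
# Illustrative only: field names are hypothetical, see the
# mcp-recorder README for the actual scenario schema.
target: https://mcp.devhelm.io
scenarios:
  - name: demo_walkthrough
    steps:
      - tools/list
      - tools/call:
          name: search
          arguments:
            query: "mcp testing"
```

Because the scenarios only describe protocol-level requests, the server under test can be written in any language.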
To make sure this actually works on real codebases, I submitted several PRs to production MCP servers: monday.com's MCP server (https://github.com/mondaycom/mcp/pull/222), Tavily's MCP server (https://github.com/tavily-ai/tavily-mcp/pull/113), and Firecrawl's MCP server (https://github.com/firecrawl/firecrawl-mcp-server/pull/175). They went from zero schema coverage to full tool surface verification with a clean schema diff available on each tool change. One big benefit is that you can do verification and replay with no API keys — deterministic responses, no live requests to real servers.
I wrote up a deeper dive into the schema drift problem and the VCR pattern for MCP here: https://devhelm.io/blog/regression-testing-mcp-servers
mcp-recorder is MIT-licensed and on PyPI. Source is at https://github.com/devhelmhq/mcp-recorder — issues and PRs are welcome.
I'm building more tooling around MCP and agent reliability, so if you're dealing with similar problems, I'd genuinely like to hear what's been painful for you.