Show HN: XR2 – A/B test your LLM prompts and track which ones convert
xirukmfc · Wednesday, March 04, 2026

I built a prompt management platform after running an AI SaaS (148k users). The biggest pain wasn't the model — it was iterating on prompts without deploying code.
Existing tools (Langfuse, PromptLayer) are great for tracing LLM calls. But we needed something different: which prompt version leads to more signups? More purchases? What's the conversion rate per variant?
XR2 lets you:
- Store and version prompts outside your codebase
- Run A/B tests between prompt variants
- Track events (signup, purchase, etc.) and attribute them back to the prompt version
- Get statistical significance before picking a winner

REST API + SDKs for Python, TypeScript, n8n, Make. Free tier available.
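For readers curious what "statistical significance before picking a winner" means concretely: the standard approach for comparing two conversion rates is a two-proportion z-test. This is a minimal stdlib-only sketch of that test, not XR2's actual implementation — the function name and the example numbers are illustrative assumptions.

```python
import math

def two_proportion_ztest(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Two-sided z-test for the difference between two conversion rates.

    conv_* = number of conversions, n_* = number of exposures per variant.
    Returns (z statistic, two-sided p-value).
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the normal CDF, via the complementary error function
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical experiment: variant A converts 120/2000, variant B 90/2000.
z, p = two_proportion_ztest(120, 2000, 90, 2000)
print(f"z = {z:.2f}, p = {p:.4f}")  # declare a winner only if p < your alpha (e.g. 0.05)
```

The usual caveat with A/B testing prompts applies: peeking at the p-value repeatedly while traffic accumulates inflates the false-positive rate, so either fix the sample size in advance or use a sequential testing procedure.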
Built with Next.js, Supabase, deployed on Cloudflare. Solo founder.
Site: https://xr2.uk
Docs: https://docs.xr2.uk
Would love feedback — especially on what's missing for your use case.