Show HN: RAG Doctor – CLI tool to diagnose broken RAG pipelines

Hi HN,

I’ve been working with a lot of Retrieval-Augmented Generation (RAG) pipelines recently and keep running into the same debugging problem.

When a RAG system produces bad answers, people usually blame the LLM. But in many cases the issue is somewhere in the pipeline itself.

Things like:

- documents not chunked correctly
- embedding models mismatched
- retrieval not happening before generation
- context windows overflowing
- vector database configuration problems
- prompt injection exposure

These kinds of issues are surprisingly hard to detect in large codebases.

So I started building a small CLI tool called RAG Doctor that analyzes a project and tries to detect structural problems in RAG pipelines.

The idea is similar to ESLint, but for RAG architectures.

The tool parses the codebase, runs a rule engine, and reports potential issues in the pipeline.
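
To make the parse / rule engine / report flow concrete, here is a minimal sketch in Python. It is not RAG Doctor's actual code: the `Finding` type, the `analyze` function, the rule ID, and the assumption that the pipeline calls functions literally named `retrieve` and `generate` are all hypothetical, chosen only to illustrate one of the checks listed above (retrieval not happening before generation).

```python
import ast
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Finding:
    """One reported issue (hypothetical shape, for illustration only)."""
    rule_id: str
    line: int
    message: str


def call_lines(tree: ast.AST, name: str) -> List[int]:
    """Return sorted line numbers of calls to a function or method named `name`."""
    lines = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            func = node.func
            # Plain call `name(...)` or method call `obj.name(...)`.
            called = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
            if called == name:
                lines.append(node.lineno)
    return sorted(lines)


def rule_retrieval_before_generation(tree: ast.AST) -> List[Finding]:
    """Flag generate() calls that have no retrieve() call on an earlier line.

    This is a purely line-based heuristic; it does not follow control flow.
    """
    retrieve_lines = call_lines(tree, "retrieve")
    findings = []
    for line in call_lines(tree, "generate"):
        if not any(r < line for r in retrieve_lines):
            findings.append(Finding(
                rule_id="retrieval-before-generation",
                line=line,
                message="generate() is called with no earlier retrieve() call",
            ))
    return findings


# The "rule engine" here is just an ordered list of rule functions.
RULES: List[Callable[[ast.AST], List[Finding]]] = [rule_retrieval_before_generation]


def analyze(source: str) -> List[Finding]:
    """Parse the source, run every rule, and return findings in a stable order."""
    tree = ast.parse(source)
    findings = [f for rule in RULES for f in rule(tree)]
    return sorted(findings, key=lambda f: (f.line, f.rule_id))


if __name__ == "__main__":
    for finding in analyze("answer = generate(question)\n"):
        print(f"line {finding.line}: [{finding.rule_id}] {finding.message}")
```

A real analyzer would need to handle multiple files, other languages, and framework-specific call names; the point of the sketch is only that each rule is a plain function over a parsed tree.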

One design choice I made early on was to keep the analysis deterministic. AI is not used to generate findings, only to explain them in human language. This keeps the results reproducible and makes the tool usable in CI workflows.
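
As a rough illustration of why deterministic output matters for CI, the sketch below renders findings in a fixed order and derives an exit code from them. The field names (`file`, `line`, `rule_id`, `severity`) and both function names are assumptions made for this example, not RAG Doctor's real output format.

```python
import json
from typing import Dict, List


def render_report(findings: List[Dict[str, object]]) -> str:
    """Serialize findings so the same input set always yields the same text."""
    # Sort by location and rule so the order does not depend on traversal order.
    ordered = sorted(findings, key=lambda f: (f["file"], f["line"], f["rule_id"]))
    # sort_keys fixes the key order inside each JSON object as well.
    return json.dumps(ordered, sort_keys=True, indent=2)


def exit_code(findings: List[Dict[str, object]]) -> int:
    """Return 1 if any finding is an error, so a CI job can fail on it."""
    return 1 if any(f["severity"] == "error" for f in findings) else 0


if __name__ == "__main__":
    sample = [
        {"file": "pipeline.py", "line": 12, "rule_id": "retrieval-before-generation",
         "severity": "error"},
    ]
    print(render_report(sample))
    raise SystemExit(exit_code(sample))
```

Because the report is byte-for-byte stable for a given set of findings, two runs on the same commit can be diffed directly, and any change in output points to a change in the code or the rules.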

It’s still early, but I’m curious whether others have run into similar debugging problems when building RAG systems.

If you’ve been working on RAG infrastructure, I’d love to hear what kinds of issues you see most often.

Repo: https://github.com/NeuroForgeLabs/rag-doctor

Any feedback would be appreciated.
