Show HN: We built an OCR API to stop babysitting extraction pipelines

tchop2020 Tuesday, January 20, 2026

Hey HN — we built an OCR API for teams who are tired of constantly maintaining extraction pipelines.

If you’re running OCR in production, this is what it changes for you:

- You don’t have to maintain custom orchestration code across multiple OCR models. We run a managed, multi-model consensus engine instead.
- You don’t have to hand-tune prompts or schemas. The system auto-optimizes prompts and schemas against observed extraction failures.
- You don’t get silent failures. Every field comes with explicit error flags so downstream systems know what to trust.
- You don’t have to keep tweaking the pipeline as documents change. The extraction loop converges automatically toward higher accuracy over time.

Under the hood, we run extraction, analyze failure patterns, and adapt prompts and schemas without manual intervention. The result is fewer retries, fewer production surprises, and less ongoing maintenance.
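
To make the consensus and error-flag ideas concrete, here is a minimal sketch of per-field majority voting across several OCR model outputs. It is an illustration only, not the service's actual implementation or API: the names `FieldResult` and `consensus`, the `min_agreement` threshold, and the input shape are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class FieldResult:
    """Hypothetical per-field output: a value plus an explicit trust flag."""
    value: Optional[str]
    agreement: float      # share of models that returned the winning value
    needs_review: bool    # True when downstream code should not trust the value


def consensus(
    readings: Dict[str, List[Optional[str]]],
    min_agreement: float = 0.6,  # assumed threshold, not a documented default
) -> Dict[str, FieldResult]:
    """Pick the most common reading per field and flag weak agreement."""
    results: Dict[str, FieldResult] = {}
    for field, values in readings.items():
        # Ignore models that returned nothing for this field.
        present = [v.strip() for v in values if v and v.strip()]
        if not present:
            results[field] = FieldResult(None, 0.0, True)
            continue
        value, votes = Counter(present).most_common(1)[0]
        # Missing readings count against agreement, so divide by all models.
        agreement = votes / len(values)
        results[field] = FieldResult(value, agreement, agreement < min_agreement)
    return results


if __name__ == "__main__":
    sample = {
        "total": ["42.00", "42.00", "42.60"],
        "date": ["2026-01-20", None, "2026-01-28"],
    }
    for name, result in consensus(sample).items():
        print(name, result)
```

In this sketch a field is flagged whenever fewer than `min_agreement` of the models agree on one value, so a downstream system can route flagged fields to review instead of silently accepting them.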

It’s API-first and free to try with a single API call.

Docs: https://www.deepread.tech/docs?utm_source=hackernews&utm_med...

Happy to discuss the technical details, tradeoffs, and the joy (and pain) of building less-lousy OCR.