Show HN: I Stopped Hoping My LLM Would Cooperate

42 validation errors in one run. Claude apologising instead of writing HTML. OAuth tokens expiring mid-digest.

Then I fixed the constraints. Eight days, zero failures, zero intervention.

The secret wasn't better prompts... it was treating the LLM as a constrained function: schema-validated tool calls that reject malformed output and force retries, two-pass architecture separating editorial judgment from formatting, and boring DevOps (retry logic, rate limiting, structured logging).

The Claude invocation is ~30 lines in a 2000-line system. Most of the work is everything around it.

https://seanfloyd.dev/blog/llm-reliability https://github.com/SeanLF/claude-rss-news-digest

Story

Show HN: I Stopped Hoping My LLM Would Cooperate