
Show HN: I built a tamper-evident evidence system for AI agents

Slaine Wednesday, March 04, 2026

The demo loads two runs directly in your browser — no signup, no uploads, no network calls after page load.

Frank is a conservative agent; verification returns VALID. Phil is an aggressive agent with tampered evidence; verification returns INVALID and points to the exact line where the chain breaks.

The problem I was solving: when an AI agent does something unexpected in production, the post-mortem usually comes down to "trust our logs." I wanted evidence that could cross trust boundaries — from engineering to security, compliance, or regulators — without asking anyone to trust a dashboard.

How it works:

- Every action, policy decision, and state transition is recorded into a hash-chained NDJSON event log
- Logs are sealed into evidence packs (ZIP) with manifests and signatures
- A verifier (also in the demo) validates integrity offline and returns VALID / INVALID / PARTIAL with machine-readable reason codes
- The same inputs always produce the same artifacts — so diffs are meaningful and replay is deterministic
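The hash-chaining idea above can be sketched in a few lines: each event stores the hash of the previous log line, so editing any line breaks every subsequent link and the verifier can report exactly where. The field names and hashing details here are illustrative assumptions, not the project's actual format.

```python
import hashlib
import json

GENESIS = "0" * 64  # sentinel "previous hash" for the first event

def append_event(log_lines, payload):
    """Append one NDJSON event whose 'prev' field hashes the prior line."""
    prev = hashlib.sha256(log_lines[-1].encode()).hexdigest() if log_lines else GENESIS
    event = {"prev": prev, "payload": payload}
    # Canonical serialization keeps the hashes deterministic across runs.
    log_lines.append(json.dumps(event, sort_keys=True, separators=(",", ":")))

def verify_chain(log_lines):
    """Return ('VALID', None), or ('INVALID', line_number) at the first break."""
    prev = GENESIS
    for i, line in enumerate(log_lines, start=1):
        event = json.loads(line)
        if event["prev"] != prev:
            return "INVALID", i
        prev = hashlib.sha256(line.encode()).hexdigest()
    return "VALID", None

log = []
append_event(log, {"action": "tool_call", "agent": "frank"})
append_event(log, {"action": "state_transition", "agent": "frank"})
print(verify_chain(log))                   # intact chain: ('VALID', None)
log[0] = log[0].replace("frank", "phil")   # tamper with line 1
print(verify_chain(log))                   # ('INVALID', 2): link into line 2 breaks
```

Note that tampering with line 1 is detected at line 2, since line 2's stored `prev` no longer matches the hash of the altered line.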

The verifier and the UI are deliberately separated. The UI can be wrong. The verifier will still accept or reject based on cryptographic proof.

Built this before the recent public incidents around autonomous agents made it topical. Happy to answer questions about the architecture, the proof boundary design, or the gaps I'm still working on.

Demo: guardianreplay.pages.dev