Show HN: Hekate – A Zero-Copy ZK Engine Overcoming the Memory Wall
y00zzeek Sunday, January 18, 2026
Most ZK proving systems are optimized for server-grade hardware with massive RAM. When scaling to industrial-sized traces (2^20+ rows), they often hit a "Memory Wall" where allocation and data movement become a larger bottleneck than the actual computation.
I have been developing Hekate, a ZK engine written in Rust that utilizes a Zero-Copy streaming model and a hybrid tiled evaluator. To test its limits, I ran a head-to-head benchmark against Binius64 on an Apple M3 Max laptop using Keccak-256.
The results highlight a significant architectural divergence:
At 2^15 rows: Binius64 is faster (147ms vs 202ms), but Hekate is already roughly 9x more memory-efficient (44MB vs ~400MB).
At 2^20 rows: Binius64 hits 72GB of RAM usage, entering swap hell on a laptop. Hekate processes the same workload in 4.74s using just 1.4GB of RAM.
At 2^24 rows (~16.8M steps): Hekate finishes in 88s with a peak RAM of 21.5GB. Binius64 cannot complete the task on this hardware because it runs out of memory and falls into swap.
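As a rough sanity check on the figures above, peak memory can be normalized per trace row. The helper below is only illustrative arithmetic on the reported numbers (the name `bytes_per_row` is hypothetical, and GB is assumed to mean 10^9 bytes). Hekate's reported peaks work out to about 1.3KB per row at 2^15, 2^20, and 2^24 rows, while the Binius64 figures work out to about 12KB per row at 2^15 and about 69KB per row at 2^20.

```rust
/// Peak memory divided by the number of trace rows (2^log2_rows).
/// Illustrative arithmetic only; assumes decimal units (1 GB = 1e9 bytes).
fn bytes_per_row(peak_bytes: f64, log2_rows: u32) -> f64 {
    peak_bytes / (1u64 << log2_rows) as f64
}

/// Ratio of two peak-memory figures measured at the same trace size.
fn memory_ratio(larger_bytes: f64, smaller_bytes: f64) -> f64 {
    larger_bytes / smaller_bytes
}
```

For example, `memory_ratio(72e9, 1.4e9)` puts the gap at 2^20 rows at a bit over 51x, compared with about 9x at 2^15 rows.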
The core difference is "Materialization vs. Streaming". While many engines materialize and copy massive polynomials in RAM during Sumcheck and PCS operations, Hekate streams them through the CPU cache in tiles. This shifts the unit economics of ZK proving from $2.00/hour high-memory cloud instances to $0.10/hour commodity hardware or local edge devices.
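To make the "Materialization vs. Streaming" distinction concrete, here is a minimal sketch of the general idea. It is not Hekate's code: `row_value`, `sum_materialized`, and `sum_streamed` are hypothetical names, wrapping `u64` addition stands in for real field arithmetic, and a plain sum stands in for a Sumcheck round. The only point is that the streamed version reuses one fixed-size tile buffer, so its working set is bounded by the tile length rather than by the trace length.

```rust
/// Hypothetical stand-in for computing the value of trace row `i` on demand.
fn row_value(i: u64) -> u64 {
    i.wrapping_mul(0x9E37_79B9_7F4A_7C15).rotate_left(17)
}

/// Materializing approach: build the whole column in RAM, then reduce it.
/// Peak memory grows linearly with `rows`.
fn sum_materialized(rows: u64) -> u64 {
    let column: Vec<u64> = (0..rows).map(row_value).collect();
    column.iter().fold(0u64, |acc, &v| acc.wrapping_add(v))
}

/// Streaming approach: fill and reduce one reusable tile at a time.
/// Peak memory is `tile_len * 8` bytes regardless of `rows`.
fn sum_streamed(rows: u64, tile_len: usize) -> u64 {
    assert!(tile_len > 0, "tile_len must be non-zero");
    let mut tile = vec![0u64; tile_len];
    let mut acc = 0u64;
    let mut start = 0u64;
    while start < rows {
        // The last tile may be shorter than `tile_len`.
        let len = (rows - start).min(tile_len as u64) as usize;
        for (j, slot) in tile[..len].iter_mut().enumerate() {
            *slot = row_value(start + j as u64);
        }
        acc = tile[..len].iter().fold(acc, |a, &v| a.wrapping_add(v));
        start += len as u64;
    }
    acc
}
```

Both functions return the same result for any `rows`. With `tile_len = 4096`, the streamed buffer is 32KiB, small enough to stay in L1/L2 cache on most CPUs, whereas the materialized column for 2^20 rows already takes 8MiB for a single `u64` column.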
I am looking for feedback from the community, especially those working on binary fields, GKR, and memory-constrained SNARK/STARK implementations.