Post-transformer inference: 224× compression of Llama-70B with improved accuracy

anima-core Wednesday, December 10, 2025
25 points, 9 comments
zenodo.org