Post-transformer inference: 224× compression of Llama-70B with improved accuracy
Published: December 9, 2025 at 08:25 PM EST
Source: Hacker News
Article Information
- Article URL: https://zenodo.org/records/17873275
- Comments URL: https://news.ycombinator.com/item?id=46212969
- Points: 23
- Comments: 9