224× Compression of Llama-70B with Higher Accuracy (Paper and Code)
Source: Hacker News
Article URL: https://zenodo.org/records/17873275
Comments URL: https://news.ycombinator.com/item?id=46212969
Points: 14
We propose Cross-Attention-based Non-local Knowledge Distillation (CanKD), a novel feature-based knowledge distillation framework that leverages cross-attention...
Generative AI has rapidly evolved into one of the most disruptive technologies shaping today’s digital landscape. From automated content creation to intelligent...
Recent advances in diffusion transformers have empowered video generation models to generate high-quality video clips from texts or images. However, world model...
Novel View Synthesis (NVS) has traditionally relied on models with explicit 3D inductive biases combined with known camera parameters from Structure-from-Motion...