EUNO.NEWS
  • All (20931) +237
  • AI (3154) +13
  • DevOps (932) +6
  • Software (11018) +167
  • IT (5778) +50
  • Education (48)
  • Notice
  • 1 week ago · ai

    Sopro TTS: A 169M model with zero-shot voice cloning that runs on the CPU

    Article URL: https://github.com/samuel-vitorino/sopro Comments URL: https://news.ycombinator.com/item?id=46546113 Points: 33 Comments: 10...

    #text-to-speech #voice cloning #zero-shot #cpu inference #open-source #deep learning #speech synthesis #model compression #machine learning
  • 1 month ago · ai

    The OptiPFair Series #1: Forging the Future with Small Models — An Architectural Analysis with Pere Martra

    Originally published on Principia Agentica. The OptiPFair Series, Episode 1: a deep-dive exploration of Small Language Model (SLM) optimization. The AI race has...

    #small language models #model optimization #pruning #bias removal #efficiency #LLM #AI fairness #model compression
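
    Below is a minimal magnitude-pruning sketch in PyTorch, included only to illustrate the kind of pruning the series discusses; it is not the OptiPFair method, and the layer sizes are arbitrary.

        # Illustrative magnitude pruning (not the OptiPFair method): zero out the
        # 30% smallest-magnitude weights of each linear layer.
        import torch
        import torch.nn as nn
        import torch.nn.utils.prune as prune

        model = nn.Sequential(nn.Linear(512, 1024), nn.ReLU(), nn.Linear(1024, 512))

        for module in model.modules():
            if isinstance(module, nn.Linear):
                # L1 unstructured pruning masks the smallest-magnitude 30% of weights.
                prune.l1_unstructured(module, name="weight", amount=0.3)
                # Fold the mask into the weight tensor so the zeros become permanent.
                prune.remove(module, "weight")

        total = sum(p.numel() for p in model.parameters())
        zeros = sum((p == 0).sum().item() for p in model.parameters())
        print(f"overall sparsity: {zeros / total:.2%}")  # slightly under 30% (biases untouched)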
  • 1 month ago · ai

    AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders

    Introduction: AdaSPEC is a new method that speeds up large language models by using a small draft model for the initial generation pass, followed by verificatio...

    #speculative decoding #knowledge distillation #large language models #inference acceleration #draft model #AdaSPEC #AI efficiency #model compression
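
    For readers new to speculative decoding, the sketch below shows the generic draft-then-verify loop the summary refers to. It is not AdaSPEC itself (AdaSPEC changes how the draft model is distilled), and draft_step and target_logits are hypothetical stand-ins for real model forward passes.

        # Generic greedy draft-then-verify loop (not AdaSPEC itself, which concerns how
        # the draft model is distilled). `draft_step` and `target_logits` are hypothetical
        # stand-ins: draft_step(ids) -> one greedy token id; target_logits(ids) -> [len, vocab].
        import torch

        def speculative_decode(draft_step, target_logits, prompt_ids, k=4, max_new=64):
            ids = list(prompt_ids)
            while len(ids) - len(prompt_ids) < max_new:
                # 1) The cheap draft model proposes k tokens autoregressively.
                proposed, ctx = [], list(ids)
                for _ in range(k):
                    tok = draft_step(ctx)
                    proposed.append(tok)
                    ctx.append(tok)
                # 2) The target model scores the extended sequence in a single pass,
                #    amortizing its cost over k positions.
                logits = target_logits(ids + proposed)
                accepted = 0
                for i, tok in enumerate(proposed):
                    # Keep the draft token only if the target model agrees greedily.
                    target_tok = int(torch.argmax(logits[len(ids) + i - 1]))
                    if target_tok != tok:
                        # Accept the matching prefix, substitute the target's token, drop the rest.
                        ids.extend(proposed[:accepted] + [target_tok])
                        break
                    accepted += 1
                else:
                    ids.extend(proposed)  # all k draft tokens accepted
            return ids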
  • 1 month ago · ai

    Z-Image GGUF Practical Guide: Unlock Top-Tier AI Art with Consumer GPUs (Beginner Version)

    Introduction: Breaking the “GPU Anxiety” – Even 6 GB Can Run Large Models. In the world of AI art generation, higher‑quality models usually come with massive si...

    #AI art #GGUF quantization #ComfyUI #GPU optimization #model compression
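
    A rough sense of why 4-bit GGUF quants fit where FP16 does not: weight memory scales with bits per weight. The parameter count below is an assumed example, not a confirmed Z-Image spec, and the per-quant bit costs are approximate.

        # Back-of-envelope weight memory at different quantization levels. The 6B
        # parameter count is an assumed example (not a confirmed Z-Image spec); the
        # bits-per-weight figures are approximate GGUF block costs, and activations
        # plus framework overhead add more on top.
        params = 6e9
        bits_per_weight = {"FP16": 16, "Q8_0": 8.5, "Q4_K_M": 4.5}

        for name, bits in bits_per_weight.items():
            gib = params * bits / 8 / 2**30
            print(f"{name:7s} ~ {gib:4.1f} GiB of weights")
        # FP16 ~ 11.2 GiB, Q8_0 ~ 5.9 GiB, Q4_K_M ~ 3.1 GiB: only the 4-bit quant
        # leaves headroom on a 6 GB card.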
  • 1 month ago · ai

    224× Compression of Llama-70B with Higher Accuracy (Paper and Code)

    Article URL: https://zenodo.org/records/17873275 Comments URL: https://news.ycombinator.com/item?id=46212969 Points: 14 Comments: 5...

    #model compression #Llama-70B #quantization #deep learning #paper #code
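
    As a sanity check on the headline number, assuming an FP16 baseline (the paper's own baseline and accounting may differ):

        # What a 224x ratio implies for Llama-70B, assuming an FP16 baseline; the
        # paper's baseline and accounting may differ, so this is only a sanity check.
        params = 70e9
        fp16_gib = params * 2 / 2**30        # ~130 GiB of 16-bit weights
        compressed_gib = fp16_gib / 224      # ~0.58 GiB
        bits_per_weight = 16 / 224           # ~0.07 bits per weight on average
        print(f"{fp16_gib:.0f} GiB -> {compressed_gib:.2f} GiB ({bits_per_weight:.3f} bits/weight)")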
  • 1 month ago · ai

    [Paper] CanKD: Cross-Attention-based Non-local operation for Feature-based Knowledge Distillation

    We propose Cross-Attention-based Non-local Knowledge Distillation (CanKD), a novel feature-based knowledge distillation framework that leverages cross-attention...

    #knowledge distillation #cross-attention #computer vision #model compression #deep learning
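
    The sketch below illustrates the general idea of letting student features attend to teacher features before a feature-matching loss; it is only an illustration of cross-attention-based feature distillation, not the paper's exact formulation.

        # Illustrative cross-attention feature distillation (not CanKD's exact
        # formulation: the paper's projections, normalization, and loss may differ).
        import torch
        import torch.nn.functional as F

        def cross_attention_kd_loss(student_feat, teacher_feat):
            # Both feature maps: [B, C, H, W], assumed to have the same shape.
            B, C, H, W = student_feat.shape
            s = student_feat.flatten(2).transpose(1, 2)   # [B, HW, C] student tokens
            t = teacher_feat.flatten(2).transpose(1, 2)   # [B, HW, C] teacher tokens

            # Student tokens attend over teacher tokens (non-local, cross-model).
            attn = torch.softmax(s @ t.transpose(1, 2) / C ** 0.5, dim=-1)  # [B, HW, HW]
            refined = attn @ t                                              # [B, HW, C]

            # Pull student features toward the attention-refined teacher features.
            return F.mse_loss(s, refined)

        # Usage with random tensors standing in for backbone feature maps:
        loss = cross_attention_kd_loss(torch.randn(2, 64, 16, 16), torch.randn(2, 64, 16, 16))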
RSS GitHub © 2026