LLM optimization | EUNO.NEWS

4 days ago · ai

Beyond Benchmaxxing: Why the Future of AI Is Inference-Time Search

Article URL: https://adlrocha.substack.com/p/adlrocha-beyond-benchmaxxing-why · Comments URL: https://news.ycombinator.com/item?id=46486290 · Points: 3 · Comments: 0...

#inference-time search #AI performance #model benchmarking #LLM optimization #machine learning inference
6 days ago · ai

REFRAG and the Critical Dependence on Model Weights

Introduction: We have spent all of 2025 obsessed with the size of the context window: 128k, 1 million, 2 million tokens. Providers were selling us the...

#LLM optimization #context window #relevance verification #model weight dependency #token efficiency
2 weeks ago · ai

Two Efficient Technologies to Reduce AI Token Costs: TOON and Microsoft's LLMLingua-2

TOON Data Serialization and Microsoft's LLMLingua-2 Prompt Compressor

#token cost #prompt compression #LLMLingua-2 #TOON #LLM optimization #generative AI #cost reduction