Cohere launches a family of open multilingual models
Overview: Enterprise AI company Cohere launched a new family of multilingual models, called Tiny Aya, on the sidelines of the India AI Summit. The models are op...
Although learned representations underlie neural networks' success, their fundamental properties remain poorly understood. A striking example is the emergence o...
Diffusion language models are a promising alternative to autoregressive models due to their potential for faster generation. Among discrete diffusion approaches...
This paper proposes a novel method for Text Style Transfer (TST) based on parameter-efficient fine-tuning of Large Language Models (LLMs). Addressing the scarci...
News recommendation plays a critical role in online news platforms by helping users discover relevant content. Cross-domain news recommendation further requires...
We present a domain-grounded framework and benchmark for tool-aware plan generation in contact centers, where answering a query for business insights, our targe...
We present 'Testimole-conversational', a massive collection of discussion board messages in the Italian language. The large size of the corpus, more than 30B wo...
In recent years, state-tracking tasks, particularly permutation composition, have become a testbed for understanding the limits of sequence model architectures...
Large Language Models (LLMs) have achieved remarkable progress, with Parameter-Efficient Fine-Tuning (PEFT) emerging as a key technique for downstream task adap...
The Transformer architecture has become the foundation of modern deep learning, yet its core self-attention mechanism suffers from quadratic computational compl...
The entropy rate of printed English is famously estimated to be about one bit per character, a benchmark that modern large language models (LLMs) have only rece...
Video Language Models (VideoLMs) empower AI systems to understand temporal dynamics in videos. To fit within the maximum context window, current methods ...