[Paper] ARCADE: A City-Scale Corpus for Fine-Grained Arabic Dialect Tagging
The Arabic language is characterized by a rich tapestry of regional dialects that differ substantially in phonetics and lexicon, reflecting the geographic and c...
Despite continuous advances in medical technology, the global distribution of health care resources remains uneven. The development of large language models (LL...
While confidence estimation is a promising direction for mitigating hallucinations in Large Language Models (LLMs), current research predominantly focuses on singl...
We present a training-free method for detecting valid mathematical reasoning in large language models through spectral analysis of attention patterns. By treati...
Population-based cancer registries depend on pathology reports as their primary diagnostic source, yet manual abstraction is resource-intensive and contributes ...
Large Language Models (LLMs) have become a mainstay for many everyday applications. However, as data evolve, their knowledge quickly becomes outdated. Continual ...
Identifying relevant text spans is important for several downstream tasks in NLP, as it contributes to model explainability. While most span identification appr...
Ticket troubleshooting refers to the process of analyzing and resolving problems that are reported through a ticketing system. In large organizations offering a...
Language model (LM) probability is not a reliable quality estimator, as natural language is ambiguous. When multiple output options are valid, the model's proba...
Sequence modeling layers in modern language models typically face a trade-off between storage capacity and computational efficiency. While Softmax attention off...
Large Protein Language Models have shown strong potential for generative protein design, yet they frequently produce structural hallucinations, generating seque...
Large language models (LLMs) frequently produce contextual hallucinations, where generated content contradicts or ignores information explicitly stated in the p...