NLP — Page 4 | EUNO.NEWS

2 weeks ago · ai · - · -

[Paper] Finding Duplicates in 1.1M BDD Steps: cukereuse, a Paraphrase-Robust Static Detector for Cucumber and Gherkin

Behaviour-Driven Development (BDD) suites accumulate step-text duplication whose maintenance cost is established in prior work. Existing detection techniques re...

#research #paper #ai #nlp
2 weeks ago · ai · - · -

[Paper] Discovering a Shared Logical Subspace: Steering LLM Logical Reasoning via Alignment of Natural-Language and Symbolic Views

Large Language Models (LLMs) still struggle with multi-step logical reasoning. Existing approaches either purely refine the reasoning chain in natural language ...

#research #paper #ai #nlp
2 weeks ago · ai · - · -

[Paper] Epistemic orientation in parliamentary discourse is associated with deliberative democracy

The pursuit of truth is central to democratic deliberation and governance, yet political discourse reflects varying epistemic orientations, ranging from evidenc...

#research #paper #ai #nlp
2 weeks ago · ai · - · -

[Paper] An Answer is just the Start: Related Insight Generation for Open-Ended Document-Grounded QA

Answering open-ended questions remains challenging for AI systems because it requires synthesis, judgment, and exploration beyond factual retrieval, and users o...

#research #paper #ai #nlp
2 weeks ago · ai · - · -

[Paper] Exploring Language-Agnosticity in Function Vectors: A Case Study in Machine Translation

Function vectors (FVs) are vector representations of tasks extracted from model activations during in-context learning. While prior work has shown that multilin...

#research #paper #ai #nlp
2 weeks ago · ai · - · -

[Paper] Chat2Workflow: A Benchmark for Generating Executable Visual Workflows with Natural Language

At present, executable visual workflows have emerged as a mainstream paradigm in real-world industrial deployments, offering strong reliability and controllabil...

#research #paper #ai #machine-learning #nlp #computer-vision
2 weeks ago · ai · - · -

[Paper] Pause or Fabricate? Training Language Models for Grounded Reasoning

Large language models have achieved remarkable progress on complex reasoning tasks. However, they often implicitly fabricate information when inputs are incompl...

#research #paper #ai #nlp
2 weeks ago · ai · - · -

[Paper] The signal is the ceiling: Measurement limits of LLM-predicted experience ratings from open-ended survey text

An earlier paper (Hong, Potteiger, and Zapata 2026) established that an unoptimized GPT 4.1 prompt predicts fan-reported experience ratings within one point 67%...

#research #paper #ai #nlp
2 weeks ago · ai · - · -

[Paper] Micro Language Models Enable Instant Responses

Edge devices such as smartwatches and smart glasses cannot continuously run even the smallest 100M-1B parameter language models due to power and compute constra...

#research #paper #ai #nlp
2 weeks ago · ai · - · -

[Paper] SafetyALFRED: Evaluating Safety-Conscious Planning of Multimodal Large Language Models

Multimodal Large Language Models are increasingly adopted as autonomous agents in interactive environments, yet their ability to proactively address safety haza...

#research #paper #ai #machine-learning #nlp
2 weeks ago · ai · - · -

[Paper] The 'Small World of Words' German Free-Association Norms

Free-association norms provide essential empirical data for investigating linguistic, semantic, and cultural phenomena in the cognitive sciences. Although large...

#research #paper #ai #nlp
2 weeks ago · ai · - · -

[Paper] What Makes an LLM a Good Optimizer? A Trajectory Analysis of LLM-Guided Evolutionary Search

Recent work has demonstrated the promise of orchestrating large language models (LLMs) within evolutionary and agentic optimization systems. However, the mechan...

#research #paper #ai #nlp
3 weeks ago · ai · - · -

[Paper] Sessa: Selective State Space Attention

Modern sequence models are dominated by Transformers, where self-attention mixes information from the visible context in an input-dependent way. However, when r...

#research #paper #ai #machine-learning #nlp
3 weeks ago · ai · - · -

[Paper] A multimodal and temporal foundation model for virtual patient representations at healthcare system scale

Modern medicine generates vast multimodal data across siloed systems, yet no existing model integrates the full breadth and temporal depth of the clinical recor...

#research #paper #ai #machine-learning #nlp
3 weeks ago · ai · - · -

[Paper] Latent Phase-Shift Rollback: Inference-Time Error Correction via Residual Stream Monitoring and KV-Cache Steering

Large language models frequently commit unrecoverable reasoning errors mid-generation: once a wrong step is taken, subsequent tokens compound the mistake rather...

#research #paper #ai #machine-learning #nlp
3 weeks ago · ai · - · -

[Paper] Dual Alignment Between Language Model Layers and Human Sentence Processing

A recent study (Kuribayashi et al., 2025) has shown that human sentence processing behavior, typically measured on syntactically unchallenging constructions, ca...

#research #paper #ai #nlp
3 weeks ago · ai · - · -

[Paper] GSQ: Highly-Accurate Low-Precision Scalar Quantization for LLMs via Gumbel-Softmax Sampling

Weight quantization has become a standard tool for efficient LLM deployment, especially for local inference, where models are now routinely served at 2-3 bits p...

#research #paper #ai #machine-learning #nlp
3 weeks ago · ai · - · -

[Paper] FUSE: Ensembling Verifiers with Zero Labeled Data

Verification of model outputs is rapidly emerging as a key primitive for both training and real-world deployment of large language models (LLMs). In practice, t...

#research #paper #ai #machine-learning #nlp
3 weeks ago · ai · - · -

[Paper] ClawEnvKit: Automatic Environment Generation for Claw-Like Agents

Constructing environments for training and evaluating claw-like agents remains a manual, human-intensive process that does not scale. We argue that what is need...

#research #paper #ai #machine-learning #nlp
3 weeks ago · ai · - · -

[Paper] Transition-Matrix Regularization for Next Dialogue Act Prediction in Counselling Conversations

This paper studies how empirical dialogue-flow statistics can be incorporated into Next Dialogue Act Prediction (NDAP). A KL regularization term is proposed tha...

#research #paper #ai #machine-learning #nlp
3 weeks ago · ai · - · -

[Paper] Different Paths to Harmful Compliance: Behavioral Side Effects and Mechanistic Divergence Across LLM Jailbreaks

Open-weight language models can be rendered unsafe through several distinct interventions, but the resulting models may differ substantially in capabilities, be...

#research #paper #ai #machine-learning #nlp
3 weeks ago · ai · - · -

[Paper] MASS-RAG: Multi-Agent Synthesis Retrieval-Augmented Generation

Large language models (LLMs) are widely used in retrieval-augmented generation (RAG) to incorporate external knowledge at inference time. However, when retrieve...

#research #paper #ai #nlp
3 weeks ago · ai · - · -

Qwen3.6-Max-Preview: Smarter, Sharper, Still Evolving

Article URL: https://qwen.ai/blog?id=qwen3.6-max-preview Comments URL: https://news.ycombinator.com/item?id=47834565 Points: 38 Comments: 8...

#Qwen3.6 #large language model #LLM #AI research #deep learning #NLP #model preview
3 weeks ago · ai · - · -

[Paper] DeInfer: Efficient Parallel Inferencing for Decomposed Large Language Models

Existing works on large language model (LLM) decomposition mainly focus on improving performance on downstream tasks, but they ignore the poor parallel inferenc...

#research #paper #ai #nlp
3 weeks ago · ai · - · -

[Paper] VIBE: Voice-Induced open-ended Bias Evaluation for Large Audio-Language Models via Real-World Speech

Large Audio-Language Models (LALMs) are increasingly integrated into daily applications, yet their generative biases remain underexplored. Existing speech fairn...

#research #paper #ai #nlp
3 weeks ago · ai · - · -

[Paper] DORA Explorer: Improving the Exploration Ability of LLMs Without Training

Despite the rapid progress, LLMs for sequential decision-making (i.e., LLM agents) still struggle to produce diverse outputs. This leads to insufficient explora...

#research #paper #ai #machine-learning #nlp
3 weeks ago · ai · - · -

[Paper] A Multi-Agent Approach for Claim Verification from Tabular Data Documents

We present a novel approach for claim verification from tabular data documents. Recent LLM-based approaches either employ complex pretraining/fine-tuning or dec...

#research #paper #ai #nlp
3 weeks ago · ai · - · -

[Paper] Demystifying the unreasonable effectiveness of online alignment methods

Iterative alignment methods based on purely greedy updates are remarkably effective in practice, yet existing theoretical guarantees of O(log T) KL-regularize...

#research #paper #ai #machine-learning #nlp
3 weeks ago · ai · - · -

[Paper] Calibrating Model-Based Evaluation Metrics for Summarization

Recent advances in summary evaluation are based on model-based metrics to assess quality dimensions, such as completeness, conciseness, and faithfulness. Howeve...

#research #paper #ai #nlp
3 weeks ago · ai · - · -

[Paper] Learning to Reason with Insight for Informal Theorem Proving

Although most of the automated theorem-proving approaches depend on formal proof systems, informal theorem proving can align better with large language models' ...

#research #paper #ai #machine-learning #nlp
3 weeks ago · ai · - · -

[Paper] No Universal Courtesy: A Cross-Linguistic, Multi-Model Study of Politeness Effects on LLMs Using the PLUM Corpus

This paper explores the response of Large Language Models (LLMs) to user prompts with different degrees of politeness and impoliteness. The Politeness Theory by...

#research #paper #ai #nlp
3 weeks ago · ai · - · -

[Paper] VEFX-Bench: A Holistic Benchmark for Generic Video Editing and Visual Effects

As AI-assisted video creation becomes increasingly practical, instruction-guided video editing has become essential for refining generated or captured footage t...

#research #paper #ai #machine-learning #nlp #computer-vision
3 weeks ago · ai · - · -

[Paper] From Benchmarking to Reasoning: A Dual-Aspect, Large-Scale Evaluation of LLMs on Vietnamese Legal Text

The complexity of Vietnam's legal texts presents a significant barrier to public access to justice. While Large Language Models offer a promising solution for l...

#research #paper #ai #machine-learning #nlp
3 weeks ago · ai · - · -

[Paper] SwanNLP at SemEval-2026 Task 5: An LLM-based Framework for Plausibility Scoring in Narrative Word Sense Disambiguation

Recent advances in language models have substantially improved Natural Language Understanding (NLU). Although widely used benchmarks suggest that Large Language...

#research #paper #ai #nlp
3 weeks ago · ai · - · -

[Paper] Do Vision-Language Models Truly Perform Vision Reasoning? A Rigorous Study of the Modality Gap

Reasoning in vision-language models (VLMs) has recently attracted significant attention due to its broad applicability across diverse downstream tasks. However,...

#research #paper #ai #nlp #computer-vision
3 weeks ago · ai · - · -

[Paper] Detecting and Suppressing Reward Hacking with Gradient Fingerprints

Reinforcement learning with verifiable rewards (RLVR) typically optimizes for outcome rewards without imposing constraints on intermediate reasoning. This leave...

#research #paper #ai #machine-learning #nlp
3 weeks ago · ai · - · -

[Paper] BAGEL: Benchmarking Animal Knowledge Expertise in Language Models

Large language models have shown strong performance on broad-domain knowledge and reasoning benchmarks, but it remains unclear how well language models handle s...

#research #paper #ai #machine-learning #nlp
3 weeks ago · ai · - · -

[Paper] Optimizing Korean-Centric LLMs via Token Pruning

This paper presents a systematic benchmark of state-of-the-art multilingual large language models (LLMs) adapted via token pruning - a compression technique tha...

#research #paper #ai #nlp
3 weeks ago · ai · - · -

[Paper] Beyond Surface Statistics: Robust Conformal Prediction for LLMs via Internal Representations

Large language models are increasingly deployed in settings where reliability matters, yet output-level uncertainty signals such as token probabilities, entropy...

#research #paper #ai #machine-learning #nlp
3 weeks ago · ai · - · -

[Paper] JumpLoRA: Sparse Adapters for Continual Learning in Large Language Models

Adapter-based methods have become a cost-effective approach to continual learning (CL) for Large Language Models (LLMs), by sequentially learning a low-rank upd...

#research #paper #ai #machine-learning #nlp
3 weeks ago · ai · - · -

[Paper] AtManRL: Towards Faithful Reasoning via Differentiable Attention Saliency

Large language models (LLMs) increasingly rely on chain-of-thought (CoT) reasoning to solve complex tasks. Yet ensuring that the reasoning trace both contribute...

#research #paper #ai #machine-learning #nlp
3 weeks ago · ai · - · -

[Paper] On the Rejection Criterion for Proxy-based Test-time Alignment

Recent works proposed test-time alignment methods that rely on a small aligned model as a proxy that guides the generation of a larger base (unaligned) model. T...

#research #paper #ai #nlp
3 weeks ago · ai · - · -

[Paper] Sentiment Analysis of German Sign Language Fairy Tales

We present a dataset and a model for sentiment analysis of German sign language (DGS) fairy tales. First, we perform sentiment analysis for three levels of vale...

#research #paper #ai #machine-learning #nlp
3 weeks ago · ai · - · -

[Paper] LLMSniffer: Detecting LLM-Generated Code via GraphCodeBERT and Supervised Contrastive Learning

The rapid proliferation of Large Language Models (LLMs) in software development has made distinguishing AI-generated code from human-written code a critical cha...

#research #paper #ai #nlp
3 weeks ago · ai · - · -

[Paper] Why Fine-Tuning Encourages Hallucinations and How to Fix It

Large language models are prone to hallucinating factually incorrect statements. A key source of these errors is exposure to new factual information through sup...

#research #paper #ai #machine-learning #nlp
3 weeks ago · ai · - · -

Understanding Transformers Part 8: Shared Weights in Self-Attention

#transformers #self-attention #shared-weights #deep-learning #neural-networks #nlp
3 weeks ago · ai · - · -

[Paper] MM-WebAgent: A Hierarchical Multimodal Web Agent for Webpage Generation

The rapid progress of Artificial Intelligence Generated Content (AIGC) tools enables images, videos, and visualizations to be created on demand for webpage desi...

#research #paper #ai #machine-learning #nlp #computer-vision
3 weeks ago · ai · - · -

[Paper] Diagnosing LLM Judge Reliability: Conformal Prediction Sets and Transitivity Violations

LLM-as-judge frameworks are increasingly used for automatic NLG evaluation, yet their per-instance reliability remains poorly understood. We present a two-prong...

#research #paper #ai #machine-learning #nlp
