paper — Page 3 | EUNO.NEWS

4 days ago · software · - · -

[Paper] SiblingRepair: Sibling-Based Multi-Hunk Repair with Large Language Models

Developers often make similar mistakes across code locations implementing related functionalities. These locations, called siblings, share similar issues and re...

#research #paper #software
4 days ago · ai · - · -

[Paper] Teaching LLMs Program Semantics via Symbolic Execution Traces

We introduce an evaluation framework of 500 C verification tasks across five property types (memory safety, overflow, termination, reachability, data races) bui...

#research #paper #ai #machine-learning
4 days ago · software · - · -

[Paper] Modeling Dependency-Propagated Ecosystem Impact of Changes in Maintenance Activities: Evaluating Support Strategies in the PyPI Network

Background: Open source software ecosystems exhibit dense dependency networks in which maintenance degradation of structurally central packages can propagate wi...

#research #paper #software
4 days ago · ai · - · -

[Paper] Beyond Accuracy: Policy Invariance as a Reliability Test for LLM Safety Judges

LLM-as-a-Judge pipelines have become the de facto evaluator for agent safety, yet existing benchmarks treat their verdicts as ground-truth proxies without check...

#research #paper #ai #machine-learning
4 days ago · ai · - · -

[Paper] BUILD-AND-FIND: An Effort-Aware Protocol for Evaluating Agent-Managed Codebases

Most coding-agent benchmarks ask whether generated code behaves correctly. That remains essential, but repository-level engineering is increasingly agent-manage...

#research #paper #ai #machine-learning
4 days ago · software · - · -

[Paper] Breaking, Stale, or Missing? Benchmarking Coding Agents on Project-Level Test Evolution

As production code evolves, the test suite must co-evolve to remain effective. Existing benchmarks for test evolution operate at method-level granularity with p...

#research #paper #software
4 days ago · devops · - · -

[Paper] TACO: A Toolsuite for the Verification of Threshold Automata

We present TACO, a toolsuite for the development and automatic verification of fault-tolerant and threshold-based distributed algorithms. Our toolsuite implemen...

#research #paper #devops
4 days ago · devops · - · -

[Paper] Tackling the Data-Parallel Load Balancing Bottleneck in LLM Serving: Practical Online Routing at Scale

Data-parallel (DP) load balancing has emerged as a first-order bottleneck in large-scale LLM serving. When a model is sharded across devices via tensor parallel...

#research #paper #devops
4 days ago · ai · - · -

[Paper] VibeServe: Can AI Agents Build Bespoke LLM Serving Systems?

For years, we have built LLM serving systems like any other critical infrastructure: a single general-purpose stack, hand-tuned over many engineer-years, meant ...

#research #paper #ai #machine-learning
4 days ago · devops · - · -

[Paper] FalconGEMM: Surpassing Hardware Peaks with Lower-Complexity Matrix Multiplication

Peak-breaking matrix multiplication is a promising technique to improve the performance of DL, especially in LLM training and inference. We present FalconGEMM, ...

#research #paper #devops
4 days ago · ai · - · -

[Paper] Efficient event-driven retrieval in high-capacity kernel Hopfield networks

High-capacity associative memory models, such as Kernel Logistic Regression (KLR) Hopfield networks, have demonstrated strong storage capabilities but typically...

#research #paper #ai
4 days ago · ai · - · -

[Paper] MDN: Parallelizing Stepwise Momentum for Delta Linear Attention

Linear Attention (LA) offers a promising paradigm for scaling large language models (LLMs) to long sequences by avoiding the quadratic complexity of self-attent...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] Graph Normalization: Fast Binarizing Dynamics for Differentiable MWIS

We introduce Graph Normalization (GN), a principled dynamical system on graphs that serves as a differentiable approximation engine for the NP-hard Maximum Weig...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] Syn4D: A Multiview Synthetic 4D Dataset

Dense 3D reconstruction and tracking of dynamic scenes from monocular video remains an important open challenge in computer vision. Progress in this area has be...

#research #paper #ai #computer-vision
5 days ago · ai · - · -

[Paper] Taming Outlier Tokens in Diffusion Transformers

We study outlier tokens in Diffusion Transformers (DiTs) for image generation. Prior work has shown that Vision Transformers (ViTs) can produce a small number o...

#research #paper #ai #machine-learning #computer-vision
5 days ago · ai · - · -

[Paper] D-OPSD: On-Policy Self-Distillation for Continuously Tuning Step-Distilled Diffusion Models

The landscape of high-performance image generation models is currently shifting from the inefficient multi-step ones to the efficient few-step counterparts (e.g...

#research #paper #ai #computer-vision
5 days ago · ai · - · -

[Paper] S-LCG: Structured Linear Congruential Generator-Based Deterministic Algorithm for Search and Optimization

This study presents a novel deterministic optimization algorithm based on a special variant of the Linear Congruential Generator (LCG). While conventional algor...

#research #paper #ai
5 days ago · ai · - · -

[Paper] Implicit Representations of Grammaticality in Language Models

Grammaticality and likelihood are distinct notions in human language. Pretrained language models (LMs), which are probabilistic models of language fitted to max...

#research #paper #ai #nlp
5 days ago · ai · - · -

[Paper] Grokability in five inequalities

In this note, we report five mathematical discoveries made in collaboration with Grok, all of which have been subsequently verified by the authors. These includ...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] Almost-Orthogonality in Lp Spaces: A Case Study with Grok

Carbery proposed the following sharpened form of the triangle inequality for many functions: for any $p \ge 2$ and any finite sequence $(f_j)_j \subset L^p$ we have [ Big|s...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] LongSeeker: Elastic Context Orchestration for Long-Horizon Search Agents

Long-horizon search agents must manage a rapidly growing working context as they reason, call tools, and observe information. Naively accumulating all intermedi...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] Sharp Capacity Thresholds in Linear Associative Memory: From Winner-Take-All to Listwise Retrieval

How many key-value associations can a $d \times d$ linear memory store? We show that the answer depends not only on the $d^2$ degrees of freedom in the memory matrix,...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] LoViF 2026 The First Challenge on Holistic Quality Assessment for 4D World Model (PhyScore)

This paper reports on the LoViF 2026 PhyScore challenge, a competition on holistic quality assessment of world-model-generated videos across both 2D and 4D gene...

#research #paper #ai #computer-vision
5 days ago · ai · - · -

[Paper] OpenSearch-VL: An Open Recipe for Frontier Multimodal Search Agents

Deep search has become a crucial capability for frontier multimodal agents, enabling models to solve complex questions through active search, evidence verificat...

#research #paper #ai #computer-vision
5 days ago · ai · - · -

[Paper] Estimating the expected output of wide random MLPs more efficiently than sampling

By far the most common way to estimate an expected loss in machine learning is to draw samples, compute the loss on each one, and take the empirical average. Ho...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] Understanding In-Context Learning for Nonlinear Regression with Transformers: Attention as Featurizer

Pre-trained transformers are able to learn from examples provided as part of the prompt without any weight updates, a remarkable ability known as in-context lea...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] MRI-Eval: A Tiered Benchmark for Evaluating LLM Performance on MRI Physics and GE Scanner Operations Knowledge

Background: Existing MRI LLM benchmarks rely mainly on review-book multiple-choice questions, where top proprietary models already score highly, limiting discri...

#research #paper #ai #nlp
5 days ago · ai · - · -

[Paper] When Life Gives You BC, Make Q-functions: Extracting Q-values from Behavior Cloning for On-Robot Reinforcement Learning

Behavior Cloning (BC) has emerged as a highly effective paradigm for robot learning. However, BC lacks a self-guided mechanism for online improvement after demo...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] Design Conductor 2.0: An agent builds a TurboQuant inference accelerator in 80 hours

Driven by a rapid co-evolution of both harness and underlying models, LLM agents are improving at a dizzying pace. In our prior work (performed in Dec. 2025), w...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] The First Token Knows: Single-Decode Confidence for Hallucination Detection

Self-consistency detects hallucinations by generating multiple sampled answers to a question and measuring agreement, but this requires repeated decoding and ca...

#research #paper #ai #machine-learning #nlp
5 days ago · ai · - · -

[Paper] Direct From Darwin: Deriving Advanced Optimizers From Evolutionary First Principles

Evolutionary computation has long promised to deliver both high-performance optimization tools and rigorous scientific simulations of Darwinian evolution...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] Geometry-Aware State Space Model: A New Paradigm for Whole-Slide Image Representation

Accurate analysis of histopathological images is critical for disease diagnosis and treatment planning. Whole-slide images (WSIs), which digitize tissue specime...

#research #paper #ai #machine-learning #computer-vision
5 days ago · ai · - · -

[Paper] PhysForge: Generating Physics-Grounded 3D Assets for Interactive Virtual World

Synthesizing physics-grounded 3D assets is a critical bottleneck for interactive virtual worlds and embodied AI. Existing methods predominantly focus on static ...

#research #paper #ai #computer-vision
5 days ago · ai · - · -

[Paper] Wasserstein-Aligned Localisation for VLM-Based Distributional OOD Detection in Medical Imaging

Zero-shot anomaly localisation via vision-language models (VLMs) offers a compelling approach for rare pathology detection, yet its performance is fundamentally...

#research #paper #ai #computer-vision
5 days ago · ai · - · -

[Paper] PSK at SemEval-2026 Task 9: Multilingual Polarization Detection Using Ensemble Gemma Models with Synthetic Data Augmentation

We present our system for SemEval-2026 Task 9: Multilingual Polarization Detection, a binary classification task spanning 22 languages. Our approach fine-tunes ...

#research #paper #ai #machine-learning #nlp
5 days ago · ai · - · -

[Paper] Aes3D: Aesthetic Assessment in 3D Gaussian Splatting

As 3D Gaussian Splatting (3DGS) gains attention in immersive media and digital content creation, assessing the aesthetics of 3D scenes becomes important in help...

#research #paper #ai #machine-learning #computer-vision
5 days ago · ai · - · -

[Paper] Superposition Is Not Necessary: A Mechanistic Interpretability Analysis of Transformer Representations for Time Series Forecasting

Transformer architectures have been widely adopted for time series forecasting, yet whether the representational mechanisms that make them powerful in NLP actua...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] What Matters in Practical Learned Image Compression

One of the major differentiators unlocked by learned codecs relative to their hard-coded traditional counterparts is their ability to be optimized directly to a...

#research #paper #ai #machine-learning #computer-vision
5 days ago · devops · - · -

[Paper] Toward a Risk Assessment Framework for Institutional DeFi: A Nine-Dimension Approach

Decentralized finance (DeFi) protocols now intermediate over USD 100 billion in value, including regulated stablecoins and tokenized assets deployed as collater...

#research #paper #devops
5 days ago · ai · - · -

[Paper] Human-AI Co-Mentorship in Project-Based Learning: A Case Study in Financial Forecasting

This paper reflects on an AI research project carried out by a team of high-school and early-undergraduate students under the mentorship of graduate researchers ...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] Low-Cost Black-Box Detection of LLM Hallucinations via Dynamical System Prediction

Large Language Models (LLMs) frequently generate plausible but non-factual content, a phenomenon known as hallucination. While existing detection methods typica...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] Transformed Latent Variable Multi-Output Gaussian Processes

Multi-Output Gaussian Processes (MOGPs) provide a principled probabilistic framework for modelling correlated outputs but face scalability bottlenecks when appl...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] Beyond Semantics: An Evidential Reasoning-Aware Multi-View Learning Framework for Trustworthy Mental Health Prediction

Automated mental health prediction using textual data has shown promising results with deep learning and large language models. However, deploying these models ...

#research #paper #ai #nlp
5 days ago · ai · - · -

[Paper] Text Corpora as Concept Fields: Black-Box Hallucination and Novelty Measurement

We introduce the Concept Field of a text corpus: a local drift field with pointwise uncertainty, estimated in sentence-embedding space from the deltas betwe...

#research #paper #ai #machine-learning #nlp
5 days ago · ai · - · -

[Paper] Continual Knowledge Updating in LLM Systems: Learning Through Multi-Timescale Memory Dynamics

LLMs are trained once, then deployed into a world that never stops changing. External memory compensates for this, but most systems manage it explicitly rather ...

#research #paper #ai #machine-learning #nlp
5 days ago · ai · - · -

[Paper] Automatically Finding and Validating Unexpected Side-Effects of Interventions on Language Models

We present an automated, contrastive evaluation pipeline for auditing the behavioral impact of interventions on large language models. Given a base model M_1 an...

#research #paper #ai #machine-learning #nlp
5 days ago · ai · - · -

[Paper] The Pinocchio Dimension: Phenomenality of Experience as the Primary Axis of LLM Psychometric Differences

We administer 45 validated psychometric questionnaires to 50 large language models (LLMs) to identify the dimensions along which LLMs differ psychometrically. U...

#research #paper #ai #nlp
5 days ago · ai · - · -

[Paper] The Impossibility Triangle of Long-Context Modeling

We identify and prove a fundamental trade-off governing long-sequence models: no model can simultaneously achieve (i) per-step computation independent of sequen...

#research #paper #ai #machine-learning #nlp
