nlp — Page 3 | EUNO.NEWS

2 days ago · ai

[Paper] SWE-CI: Evaluating Agent Capabilities in Maintaining Codebases via Continuous Integration

Large language model (LLM)-powered agents have demonstrated strong capabilities in automating software engineering tasks such as static bug fixing, as evidenced...

#research #paper #ai #machine-learning #nlp

2 days ago · ai

[Paper] CONCUR: Benchmarking LLMs for Concurrent Code Generation

Leveraging Large Language Models (LLMs) for code generation has increasingly emerged as a common practice in the domain of software engineering. Relevant benchm...

#research #paper #ai #machine-learning #nlp

2 days ago · ai

[Paper] Using Learning Progressions to Guide AI Feedback for Science Learning

Generative artificial intelligence (AI) offers scalable support for formative feedback, yet most AI-generated feedback relies on task-specific rubrics authored ...

#research #paper #ai #nlp

2 days ago · ai

[Paper] Density-Guided Response Optimization: Community-Grounded Alignment via Implicit Acceptance Signals

Language models deployed in online communities must adapt to norms that vary across social, cultural, and domain-specific contexts. Prior alignment approaches r...

#research #paper #ai #machine-learning #nlp

2 days ago · ai

[Paper] Understanding and Mitigating Dataset Corruption in LLM Steering

Contrastive steering has been shown to be a simple and effective method for adjusting the generative behavior of LLMs at inference time. It uses examples of prompt res...

#research #paper #ai #machine-learning #nlp

2 days ago · ai

[Paper] Learning When to Act or Refuse: Guarding Agentic Reasoning Models for Safe Multi-Step Tool Use

Agentic language models operate in a fundamentally different safety regime than chat models: they must plan, call tools, and execute long-horizon actions where ...

#research #paper #ai #nlp

2 days ago · ai

[Paper] No Memorization, No Detection: Output Distribution-Based Contamination Detection in Small Language Models

CDD, or Contamination Detection via output Distribution, identifies data contamination by measuring the peakedness of a model's sampled outputs. We study the co...

#research #paper #ai #machine-learning #nlp

2 days ago · ai

[Paper] Code2Math: Can Your Code Agent Effectively Evolve Math Problems Through Exploration?

As large language models (LLMs) advance their mathematical capabilities toward the IMO level, the scarcity of challenging, high-quality problems for training an...

#research #paper #ai #nlp

2 days ago · ai

[Paper] ACE-Brain-0: Spatial Intelligence as a Shared Scaffold for Universal Embodiments

Universal embodied intelligence demands robust generalization across heterogeneous embodiments, such as autonomous driving, robotics, and unmanned aerial vehicl...

#research #paper #ai #nlp #computer-vision

2 days ago · ai

[Paper] BeyondSWE: Can Current Code Agent Survive Beyond Single-Repo Bug Fixing?

Current benchmarks for code agents primarily assess narrow, repository-specific fixes, overlooking critical real-world challenges such as cross-repository reaso...

#research #paper #ai #nlp

2 days ago · ai

[Paper] MoD-DPO: Towards Mitigating Cross-modal Hallucinations in Omni LLMs using Modality Decoupled Preference Optimization

Omni-modal large language models (omni LLMs) have recently achieved strong performance across audiovisual understanding tasks, yet they remain highly susceptibl...

#research #paper #ai #machine-learning #nlp #computer-vision

2 days ago · ai

[Paper] Type-Aware Retrieval-Augmented Generation with Dependency Closure for Solver-Executable Industrial Optimization Modeling

Automated industrial optimization modeling requires reliable translation of natural-language requirements into solver-executable code. However, large language m...

#research #paper #ai #machine-learning #nlp
