EUNO.NEWS
  • All (20993) +299
  • AI (3155) +14
  • DevOps (933) +7
  • Software (11054) +203
  • IT (5802) +74
  • Education (48)
  • Notice
  • 3 days ago · ai

    [Paper] ABC-Bench: Benchmarking Agentic Backend Coding in Real-World Development

    The evolution of Large Language Models (LLMs) into autonomous agents has expanded the scope of AI coding from localized code generation to complex, repository-l...

    #research #paper #ai #machine-learning #nlp
  • 4 days ago · ai

    [Paper] MatchTIR: Fine-Grained Supervision for Tool-Integrated Reasoning via Bipartite Matching

    Tool-Integrated Reasoning (TIR) empowers large language models (LLMs) to tackle complex tasks by interleaving reasoning steps with external tool interactions. H...

    #research #paper #ai #machine-learning #nlp
  • 4 days ago · ai

    [Paper] Grounding Agent Memory in Contextual Intent

    Deploying large language models in long-horizon, goal-oriented interactions remains challenging because similar entities and facts recur under different latent ...

    #research #paper #ai #machine-learning #nlp
  • 4 days ago · ai

    [Paper] LIBERTy: A Causal Framework for Benchmarking Concept-Based Explanations of LLMs with Structural Counterfactuals

    Concept-based explanations quantify how high-level concepts (e.g., gender or experience) influence model behavior, which is crucial for decision-makers in high-...

    #research #paper #ai #machine-learning #nlp
  • 4 days ago · ai

    [Paper] Detecting Winning Arguments with Large Language Models and Persuasion Strategies

    Detecting persuasion in argumentative text is a challenging task with important implications for understanding human communication. This work investigates the r...

    #research #paper #ai #nlp
  • 4 days ago · ai

    [Paper] Influential Training Data Retrieval for Explaining Verbalized Confidence of LLMs

    Large language models (LLMs) can increase users' perceived trust by verbalizing confidence in their outputs. However, prior work has shown that LLMs are often o...

    #research #paper #ai #nlp
  • 4 days ago · ai

    [Paper] Be Your Own Red Teamer: Safety Alignment via Self-Play and Reflective Experience Replay

    Large Language Models (LLMs) have achieved remarkable capabilities but remain vulnerable to adversarial "jailbreak" attacks designed to bypass safety guardrai...

    #research #paper #ai #nlp
  • 4 days ago · ai

    [Paper] Form and Meaning in Intrinsic Multilingual Evaluations

    Intrinsic evaluation metrics for conditional language models, such as perplexity or bits-per-character, are widely used in both mono- and multilingual settings....

    #research #paper #ai #nlp
  • 4 days ago · ai

    [Paper] Representation-Aware Unlearning via Activation Signatures: From Suppression to Knowledge-Signature Erasure

    Selective knowledge erasure from LLMs is critical for GDPR compliance and model safety, yet current unlearning methods conflate behavioral suppression with true...

    #research #paper #ai #machine-learning #nlp
  • 4 days ago · ai

    [Paper] Learning Latency-Aware Orchestration for Parallel Multi-Agent Systems

    Multi-agent systems (MAS) enable complex reasoning by coordinating multiple agents, but often incur high inference latency due to multi-step execution and repea...

    #research #paper #ai #machine-learning #nlp
  • 4 days ago · ai

    [Paper] Defending Large Language Models Against Jailbreak Attacks via In-Decoding Safety-Awareness Probing

    Large language models (LLMs) have achieved impressive performance across natural language tasks and are increasingly deployed in real-world applications. Despit...

    #research #paper #ai #machine-learning #nlp
  • 4 days ago · ai

    [Paper] Agent Skills in the Wild: An Empirical Study of Security Vulnerabilities at Scale

    The rise of AI agent frameworks has introduced agent skills, modular packages containing instructions and executable code that dynamically extend agent capabili...

    #research #paper #ai #machine-learning #nlp

EUNO.NEWS © 2026