EUNO.NEWS
  • All (2571) +248
  • AI (578) +19
  • DevOps (150) +2
  • Software (1091) +156
  • IT (746) +70
  • Education (6) +1
  • Notice
  • 5 days ago · software

    Go's Regexp is Slow. So I Built My Own - 3000x Faster

    I've been writing Go for years. Love the language. But there's one thing that always bothered me: regex performance. Recently I was processing large text files...

    #go #regex #performance #coregex #benchmark
  • 1 week ago · ai

    [Paper] EvilGenie: A Reward Hacking Benchmark

    We introduce EvilGenie, a benchmark for reward hacking in programming settings. We source problems from LiveCodeBench and create an environment in which agents ...

    #reward hacking #code generation #benchmark #LLM evaluation #AI safety
  • 1 week ago · ai

    [Paper] Beyond Accuracy: An Empirical Study of Uncertainty Estimation in Imputation

    Handling missing data is a central challenge in data-driven analysis. Modern imputation methods not only aim for accurate reconstruction but also differ in how ...

    #imputation #uncertainty estimation #calibration #deep generative models #benchmark
  • 1 week ago · ai

    [Paper] Bangla Sign Language Translation: Dataset Creation Challenges, Benchmarking and Prospects

    Bangla Sign Language Translation (BdSLT) has been severely constrained so far as the language itself is very low resource. Standard sentence level dataset creat...

    #sign-language #dataset #translation #computer-vision #benchmark
  • 1 week ago · ai

    [Paper] Can LLMs extract human-like fine-grained evidence for evidence-based fact-checking?

    Misinformation frequently spreads in user comments under online news articles, highlighting the need for effective methods to detect factually incorrect informa...

    #LLM #evidence extraction #fact-checking #multilingual dataset #benchmark
  • 1 week ago · ai

    [Paper] CodeFuse-CommitEval: Towards Benchmarking LLM's Power on Commit Message and Code Change Inconsistency Detection

    Version control relies on commit messages to convey the rationale for code changes, but these messages are often low quality and, more critically, inconsistent ...

    #LLM #benchmark #commit-message inconsistency #software engineering #code review
RSS GitHub © 2025