benchmark

2 days ago · it

Intel's upcoming Core Ultra 9 mobile CPU outperforms most desktop counterparts in new benchmark — Core Ultra 9 290HX Plus nearly matches flagship Core Ultra 9 285K in single-threaded performance

Intel's mainstream Arrow Lake lineup is due for a refresh later this year, and while we've already seen a lot about the desktop parts, the first mobile SKU has...

#Intel #Core Ultra 9 #Arrow Lake #mobile CPU #benchmark #PassMark #desktop vs mobile performance #CPU performance
3 days ago · software

d-engine: A Lightweight Distributed Coordination Engine for Rust

Overview: A lightweight Raft implementation designed for embedding into Rust applications — the consensus layer for building reliable distributed systems. Built...

#rust #raft #distributed-systems #consensus #embedded-engine #performance #benchmark #library
1 week ago · ai

Task-free intelligence testing of LLMs

Article URL: https://www.marble.onl/posts/tapping/index.html · Comments URL: https://news.ycombinator.com/item?id=46545587 · Points: 11 · Comments: 1...

#LLM #intelligence testing #evaluation #benchmark #language models
1 week ago · ai

DatBench: Discriminative, faithful, and efficient VLM evaluations

Article URL: https://arxiv.org/abs/2601.02316 · Comments URL: https://news.ycombinator.com/item?id=46515648 · Points: 6 · Comments: 0...

#vision-language models #VLM evaluation #benchmark #DatBench #discriminative evaluation #faithful metrics #efficient benchmarking #machine learning research #arXiv
2 weeks ago · ai

A beginner's guide to the Higgs-Audio-V2 model by Lucataco on Replicate

Overview: The higgs-audio-v2 model is an audio foundation model developed by Lucataco. It is trained on over 10 million hours of diverse audio data and is desig...

#text-to-speech #Higgs-Audio-V2 #audio generation #AI model #Replicate #Lucataco #EmergentTTS-Eval #benchmark
2 weeks ago · ai

AI sycophancy panic

Article URL: https://github.com/firasd/vibesbench/blob/main/docs/ai-sycophancy-panic.md · Comments URL: https://news.ycombinator.com/item?id=46488396 · Points: 38 · C...

#AI alignment #LLM behavior #sycophancy #AI safety #benchmark
2 weeks ago · ai

AI Sycophancy Panic

Article URL: https://github.com/firasd/vibesbench/blob/main/docs/ai-sycophancy-panic.md · Comments URL: https://news.ycombinator.com/item?id=46488396 · Points: 10 · C...

#AI safety #language model behavior #sycophancy #benchmark #research
2 weeks ago · it

MSI’s RTX 5090 Lightning shatters GPU records before launch — 40-phase VRAM and dual 12V-2x6 connectors turn heads on upcoming overclocking monster

Several benchmark records have appeared online, with users saying they're from the MSI RTX 5090 Lightning, which the company is expected to announce on January...

#MSI #RTX 5090 #Lightning #GPU #benchmark #CES 2026 #overclocking #40‑phase VRAM #dual 12V-2x6 connectors
3 weeks ago · it

Cinebench 2026 out and ready to hammer CPUs and graphics cards six times as hard — updated benchmark includes an SMT core test

#Cinebench 2026 #benchmark #CPU testing #GPU testing #performance #hardware #graphics cards #SMT core test
3 weeks ago · ai

RAID-AI: A Multi-Language Stress Test for Autonomous Agents

Introduction: We’ve all seen the demos: an LLM generates a clean React component or a Python script in seconds. But in the real world, engineering isn't just ab...

#benchmark #autonomous-agents #bug-fixing #multi-language #LLM #green-agents #java #python #javascript
3 weeks ago · software

Benchmark: easy-query vs jOOQ

JMH Benchmark Comparison: easy‑query vs jOOQ vs Hibernate...

#benchmark #jOOQ #easy-query #Hibernate #JMH #Java #performance #database #H2 #HikariCP
1 month ago · software

Your ESLint Security Plugin is Missing 80% of Vulnerabilities (I Have Proof)

Benchmarking ESLint Security Plugins: I ran a rigorous benchmark comparing the two major ESLint security plugins. This article covers the full methodology, test...

#eslint #security #static-analysis #vulnerability-detection #benchmark #javascript #plugins
