AI safety — Page 7

3 weeks ago · ai · - · -

Make Trust Irrelevant: A Gamer's Take on Agentic AI Safety

I wrote a short position paper arguing that current agentic AI safety failures are the confused deputy problem on repeat. We are handing agents ambient authorit...

#agentic AI #AI safety #confused deputy problem #trustless AI #hard authority #prompt engineering
3 weeks ago · ai · - · -

AI is getting scary

#AI safety #agentic AI #OpenClaw #automation #email cleanup incident #AI mishaps
3 weeks ago · ai · - · -

The Only Thing Standing Between Humanity and AI Apocalypse Is … Claude?

As AI systems grow more powerful, Anthropic’s resident philosopher says the startup is betting Claude itself can learn the wisdom needed to avoid disaster....

#AI safety #Anthropic #Claude #AI alignment #large language models
0 month ago · ai · - · -

Making AI work for everyone, everywhere: our approach to localization

OpenAI shares its approach to AI localization, showing how globally shared frontier models can be adapted to local languages, laws, and cultures without comprom...

#OpenAI #localization #multilingual AI #AI safety #global models
0 month ago · ai · - · -

Psychometric Jailbreaks Reveal Internal Conflict in Frontier Models

Article URL: https://arxiv.org/abs/2512.04124 Comments URL: https://news.ycombinator.com/item?id=46902855 Points: 8 Comments: 3...

#psychometric testing #jailbreak #frontier models #large language models #AI safety #model evaluation
0 month ago · it · - · -

Why Waymo is having a hard time stopping for school buses

For years, Alphabet-owned Waymo has tried to set itself apart from other self-driving startups by emphasizing a culture of caution and safety. Now, just ahead o...

#Waymo #autonomous vehicles #self-driving cars #school zones #road safety #Alphabet #AI safety
0 month ago · ai · - · -

Beyond the Chatbot: A Blueprint for Trustable AI

JAN. 29, 2026 · Ajeet Mirwani

#trustworthy AI #AI hallucination #real‑time AI #autonomous systems #AI safety #Google AI #developer experts
0 month ago · ai · - · -

Coordination Is the Substrate: What NVIDIA's Groq Acquisition Really Signals About AI Governance

Intelligence was never the threat. Coordination is. Every existing governance framework breaks at that point. The Real Shift: Coordination Over Intelligence For...

#NVIDIA #Groq #AI governance #multi-agent systems #coordination #deterministic execution #AI safety
1 month ago · ai · - · -

Beyond the Chatbot: A Blueprint for Trustable AI

JAN. 29, 2026

#trustworthy AI #AI hallucination #real‑time inference #autonomous driving #telemetry #Google AI #AI safety
1 month ago · ai · - · -

How does misalignment scale with model intelligence and task complexity?

Article URL: https://alignment.anthropic.com/2026/hot-mess-of-ai/ Comments URL: https://news.ycombinator.com/item?id=46864498 Points: 61 Comments: 14...

#AI alignment #model intelligence #task complexity #misalignment scaling #Anthropic #AI safety #large language models
1 month ago · ai · - · -

The Sora feed philosophy

Discover the Sora feed philosophy—built to spark creativity, foster connections, and keep experiences safe with personalized recommendations, parental controls,...

#OpenAI #Sora #personalized recommendations #parental controls #AI safety #guardrails #creative AI
1 month ago · ai · - · -

Beyond the Chatbot: A Blueprint for Trustable AI

JAN. 29, 2026

#AI trust #AI hallucination #real‑time AI #autonomous driving #telemetry #Google AI #AI safety
