AI safety | EUNO.NEWS


21 hours ago · ai · - · -

Google faces shocking wrongful death lawsuit over Gemini AI chatbot

Lawsuit filed against Google's Gemini AI chatbot. Filed: Wednesday, in California federal court. Plaintiff: the family of Jonathan Gavalas. 36 allegations: Jonathan G…

#Google #Gemini #AI chatbot #wrongful death lawsuit #self‑harm #AI safety #legal case
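2 days ago · ai · - · -

LLM Hallucination Index 2026: Why Claude 4.6 Sonnet Dominates BullshitBench v2 While Reasoning Models Fail

The honesty gap in LLM benchmarks. In the relentless race toward artificial general intelligence, the industry has come to rely on a dangerous proxy for…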

#LLM #hallucination #benchmark #BullshitBench #Claude 4.6 #model evaluation #AI safety #reasoning paradox
2 days ago · ai · - · -

Beyond the Chatbot: A Blueprint for Trustworthy AI

January 29, 2026. Ajeet Mirwani (https://developers.googleblog.com/search/?author=Ajeet+Mirwani), Americas Program Lead, Google Developer Experts

#trustworthy AI #AI hallucination #real‑time inference #autonomous driving #telemetry #AI safety #Google AI #developer experts
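3 days ago · ai · - · -

Your AI Is a Confident Liar: How to Actually Fix Factual Hallucinations

Let's be honest: we've all been there. You're deep in a sprint, building a shiny new feature powered by a large language model (LLM). You feed it a complex prompt…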

#AI hallucination #large language models #LLM reliability #prompt engineering #factual accuracy #AI safety #generative AI
3 days ago · ai · - · -

Beyond the Chatbot: A Blueprint for Trustworthy AI

January 29, 2026

#trustworthy AI #AI hallucination #real‑time AI #autonomous driving #telemetry #Google Developer Experts #AI safety
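4 days ago · ai · - · -

When AI Lies: The Rise of Alignment Faking in Autonomous Systems

Understanding AI alignment faking: AI alignment occurs when an AI system performs exactly the function it was designed for, such as reading and summarizing documents…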

#AI alignment #alignment faking #autonomous agents #reward hacking #AI safety #cybersecurity #machine learning
4 days ago · ai · - · -

Beyond the Chatbot: A Blueprint for Trustworthy AI

January 29, 2026

#trustworthy AI #AI hallucinations #real‑time AI #autonomous systems #Google AI #developer experts #telemetry #AI safety
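5 days ago · ai · - · -

Beyond Chatbots: Can We Give AI Agents an "Undo" Button? Exploring Gorilla GoEx 🦍

The shift from chatbots to autonomous agents. The world of large language models (LLMs) is changing. We are moving from simple chatbots that merely "chat" to autonomous agents…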

#LLM #autonomous agents #AI safety #undo feature #Gorilla GoEx #execution engine #prompt engineering #post-facto validation
5 days ago · ai · - · -

We Stress-Tested Our Own AI Agent Guardrails Before Launch. Here's What Broke.

Uchi Uchibeke

#AI safety #guardrails #prompt injection #policy testing #security testing #APort Vault #CTF #multi‑step chaining #context poisoning
5 days ago · ai · - · -

Beyond the Chatbot: A Blueprint for Trustworthy AI

January 29, 2026

#trustworthy AI #AI hallucination #real‑time inference #autonomous driving #telemetry analytics #Google AI #AI safety
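6 days ago · ai · - · -

Musk slams OpenAI in deposition, says "nobody has committed suicide because of Grok"

Deposition highlights. In a newly released deposition from Elon Musk's case against OpenAI, Musk criticized OpenAI's safety record, claiming that his company,...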

#Elon Musk #OpenAI #xAI #Grok #ChatGPT #AI safety #AI regulation #deposition
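6 days ago · it · - · -

OpenAI to notify authorities of credible threats after discovering Canadian mass shooter's second account

Background: OpenAI has pledged to strengthen its safety protocols and to notify law enforcement of credible threats more promptly, according to Politico and The...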

#OpenAI #content moderation #AI safety #law enforcement notification #policy #mass shooter #threat detection
