AI safety — Page 3

1 week ago · ai · - · -

Ask HN: Have top AI research institutions just given up on the idea of safety?

Discussion: I understand there's a difference between the stated values and actual values of individuals and organizations, and so I want to ask this in the mos...

#AI safety #research institutions #AI ethics #AI governance #Hacker News discussion
1 week ago · ai · - · -

What are the best coping mechanisms for AI Fatalism?

Your kids forwarded you Matt Shumer's Something Big Happened article. Your feed exploded with the Citrini 2028 Global Intelligence Crisis and its artful, immuta...

#AI fatalism #psychological coping #AI safety #AI policy #mental health
1 week ago · ai · - · -

Why your AI keeps ignoring your safety constraints (and how we fixed it by engineering 'Intent')

If you’ve spent any time prompting LLMs, you’ve probably run into this frustrating scenario: you tell the AI to prioritize “safety, clarity, and conciseness.” W...

#AI safety #LLM prompting #intent engineering #value hierarchies #prompt engineering
1 week ago · ai · - · -

SelectStar, an AI Data and Reliability Evaluation Specialist, to Host "Global AI Red Team Challenge" at MWC 2026

#AI safety #LLM evaluation #red team #MWC 2026 #model bias #AI security #telecom AI #SelectStar
1 week ago · ai · - · -

Anthropic Drops Flagship Safety Pledge

Anthropic’s Shift on Its Flagship Safety Policy: Anthropic, the wildly successful AI company that has cast itself as the most safety‑conscious of the top resear...

#Anthropic #AI safety #Responsible AI #AI policy #large language models #risk mitigation
1 week ago · ai · - · -

What is an Interpretable LLM and Why It Matters?

#interpretable LLM #explainable AI #large language models #model transparency #AI safety
1 week ago · ai · - · -

Beyond the Chatbot: A Blueprint for Trustable AI

Jan. 29, 2026

#AI trust #AI hallucination #real‑time inference #autonomous driving #telemetry #AI safety #Google AI
1 week ago · ai · - · -

Beyond the Chatbot: A Blueprint for Trustable AI

Jan. 29, 2026 · Ajeet Mirwani, Americas Program Lead, Google Developer Experts

#trustworthy AI #AI hallucination #real‑time AI #autonomous driving #AI safety #Google AI #AI reliability
1 week ago · ai · - · -

We Built Iron Dome for AI Agents 🛡️

Your AI Agent Is Brilliant – But It Trusts Anyone Who Can Write Text. It reads emails, processes webhooks, calls APIs, drafts responses, and manages data. Yet i...

#AI agents #prompt injection #AI security #behavioral defense #Iron Dome #prompt injection mitigation #AI safety
1 week ago · ai · - · -

Beyond the Chatbot: A Blueprint for Trustable AI

Jan. 29, 2026

#AI trust #AI hallucination #real‑time inference #autonomous driving #telemetry #AI safety #Google Developer Experts
1 week ago · it · - · -

Tesla loses bid to overturn $243M Autopilot verdict

Background: A jury awarded a $243 million verdict against Tesla for its role in a 2019 fatal crash in Florida that killed Naibel Benavides and critically injure...

#Tesla #Autopilot #autonomous driving #legal verdict #product liability #AI safety #automotive technology
1 week ago · it · - · -

AI Safety Meets the War Machine

Anthropic doesn’t want its AI used in autonomous weapons or government surveillance. Those carve‑outs could cost it a major military contract....

#Anthropic #AI safety #autonomous weapons #military contracts #government surveillance #AI policy
