The Mute Agent: Why Your AI Needs to Shut Up and Listen to the Graph
We are building agents wrong. The current industry standard for agentic AI is the Chatty Generalist. You give an LLM a list of tools, a system prompt that says...
Why Memory Poisoning is the New Frontier in AI Security
Learn the critical security risks of the Model Context Protocol (MCP) and how to protect your AI agents from tool poisoning, supply-chain attacks, and more. If yo...
AI Reliability Overview: AI systems frequently fail to meet performance expectations, producing inaccurate results, behaving unpredictably, or experiencing oper...
Rethinking Hallucination: I used to think hallucinations were a knowledge problem, AI making things up because it didn't know the answer. After months of working...
OpenAI introduces a new framework and evaluation suite for chain-of-thought monitorability, covering 13 evaluations across 24 environments. Our findings show th...
Beyond Accuracy: The 73+ Dimensions of AI Agent Quality
What mattered: robust agents, glass-box reasoning, and red-team resilience. The post Multi-Agent Arena: Insights from London Great Agent Hack 2025 appeared first...