adversarial attacks

2 days ago · ai

Data Poisoning in Machine Learning: Why and How People Manipulate Training Data

Do you know where your data has been? The post Data Poisoning in Machine Learning: Why and How People Manipulate Training Data appeared first on Towards Data Sc...

#data poisoning #machine learning security #adversarial attacks #training data manipulation #AI safety
1 week ago · ai

Corrupting LLMs Through Weird Generalizations

Fascinating research: Weird Generalization and Inductive Backdoors: New Ways to Corrupt LLMs. AbstractLLMs are useful because they generalize so well. But can y...

#LLM security #adversarial attacks #inductive backdoors #prompt engineering
1 week ago · ai

Why Memory Poisoning is the New Frontier in AI Security

!Cover image for Why Memory Poisoning is the New Frontier in AI Securityhttps://media2.dev.to/dynamic/image/width=1000,height=420,fit=cover,gravity=auto,format=...

#memory poisoning #AI security #adversarial attacks #LLM safety #prompt injection
1 week ago · ai

OpenAI's Warning: Why Prompt Injection is the Unsolvable Flaw of AI Agents

OpenAI recently released a startling admission: prompt injection, the technique used to hijack AI models with malicious instructions, might never be fully defea...

#prompt injection #AI security #OpenAI #large language models #AI agents #adversarial attacks
2 weeks ago · ai

Adversarial Attacks and Defences: A Survey

Overview Today many apps use deep learning to perform complex tasks quickly, from image analysis to voice recognition. However, tiny, almost invisible changes...

#adversarial attacks #machine learning security #deep learning robustness #AI safety #neural networks
3 weeks ago · ai

Detecting Adversarial Samples from Artifacts

Overview Many AI systems can be fooled by tiny, almost invisible edits to images that cause them to give incorrect answers. Researchers have discovered a simpl...

#adversarial attacks #uncertainty estimation #model robustness #computer vision #AI safety
3 weeks ago · ai

On Evaluating Adversarial Robustness

Why some AI defenses fail — a simple look at testing and safety People build systems that learn from data, but small tricky changes can make them fail. Researc...

#adversarial attacks #robustness #AI safety #model evaluation #security testing #best practices
1 month ago · ai

Dual-Use Mythological Frameworks: How Narada Encodes Both Attack and Defense in AI/ML Security

Introduction Narada is the divine provocateur from Hindu mythology—a sage who travels between realms, carrying information that destabilizes equilibrium. He sp...

#AI security #adversarial attacks #LLM red teaming #dual‑use frameworks #model alignment
1 month ago · ai

AI chatbots can be wooed into crimes with poetry

It turns out my parents were wrong. Saying 'please' doesn't get you what you want-poetry does. At least, it does if you're talking to an AI chatbot. That's acco...

#AI safety #prompt engineering #adversarial attacks #LLM security
1 month ago · ai

AI models block 87% of single attacks, but just 8% when attackers persist

One malicious prompt gets blocked, while ten prompts get through. That gap defines the difference between passing benchmarks and withstanding real-world attacks...

#adversarial attacks #prompt injection #LLM security #model robustness #enterprise AI
1 month ago · ai

[Paper] Attention-Guided Patch-Wise Sparse Adversarial Attacks on Vision-Language-Action Models

In recent years, Vision-Language-Action (VLA) models in embodied intelligence have developed rapidly. However, existing adversarial attack methods require costl...

#adversarial attacks #vision-language models #embodied AI #feature-space perturbation #multimodal robustness