Corrupting LLMs Through Weird Generalizations
Source: Schneier on Security
Fascinating research:
Weird Generalization and Inductive Backdoors: New Ways to Corrupt LLMs
Abstract
LLMs are useful because they generalize so well. But can you have too much of a good thing? We show that a small amount of finetuning in narrow contexts can dramatically shift behavior outside those…