How to Backdoor Large Language Models

Microsoft Develops Scanner to Detect Backdoors in Open-Weight Large Language Models

Microsoft develops a lightweight scanner that detects backdoors in open-weight LLMs using three behavioral signals, improving ...

Microsoft

Detecting backdoored language models at scale

Learn how Microsoft research uncovers backdoor risks in language models and introduces a practical scanner to detect tampering and strengthen AI security.

The Register on MSN

Three clues that your LLM may be poisoned with a sleeper-agent back door

It's a threat straight out of sci-fi, and fiendishly hard to detect. Sleeper agent-style backdoors in AI large language models ...

4d on MSN

Microsoft just built a scanner that exposes hidden LLM backdoors

Microsoft just built a scanner that exposes hidden LLM backdoors before poisoned models reach enterprise systems worldwide ...
