Interesting Paper Exploring Prompt Injection

AI & Deepfakes 2026-06-25 07:23 Source: Schneier on Security

This is a fascinating exploration of how LLMs fall for prompt injection attacks. It turns out that they learn to recognize the style of text in different role/instruction blocks, and not just the tags. Their conclusion: Role tags were a formatting trick that became the security architecture and the cognitive scaffolding of modern LLMs.
