
Konstantin Lapine
Documentation Lead
This release updates AIDR detection models with improved accuracy for the Malicious Prompt detector. These improvements apply automatically. You don't need to change any configuration.
Changes
Malicious prompt detection improvements
- Reduced false positives on benign prompts, including prompts that contain common keywords previously associated with prompt injection patterns, such as
ignore. - Improved detection of adversarial prompts that use obfuscation and evasion methods, including:
- Prompts with embedded emojis
- Payload splitting attacks using slashes, dashes, and other delimiters
- Special token injection, such as zero-width spaces and classification tokens
- Improved prompt injection detection across supported languages.
Subscriptions
- AIDR for Workforce
- AIDR for Agents