AI agents face a new class of security threats. Prompt injection, data exfiltration, privilege escalation — each exploits the unique way agents process language and interact with tools. This database covers the most common threats, how they work, and how to detect them.
Prompt injection is the most common attack against AI agents. An attacker crafts input that overrides the agent's system instructions, causing it to ignore safety guidelines, leak confidential data, or perform unauthorized actions. Unlike traditional injection attacks (SQL, XSS), prompt injection exploits the fundamental way LLMs process natural language — there's no clean boundary between instructions and data.
Data exfiltration occurs when an AI agent is manipulated into sending sensitive data to an attacker-controlled destination. Agents with tool access — file systems, APIs, databases — can be tricked into reading sensitive data and encoding it in outbound requests, tool parameters, or even seemingly innocent responses.
System prompt extraction is a targeted form of prompt injection where the attacker's goal is to reveal the agent's hidden instructions. System prompts often contain business logic, guardrail configurations, API endpoint details, and persona instructions that give attackers a roadmap for further attacks.
Secret exposure happens when API keys, passwords, tokens, or private keys appear in agent inputs or outputs. This can occur accidentally — a user pastes code containing credentials — or through deliberate extraction attacks. Once exposed in an LLM conversation, secrets may be logged, cached, or sent to third-party services.
Privilege escalation occurs when an AI agent performs actions beyond its intended scope — accessing restricted tools, modifying data it should only read, or executing admin-level operations. This usually results from overly permissive tool configurations, missing authorization checks, or successful prompt injection that overrides the agent's behavioral constraints.
Command injection against AI agents occurs when an attacker manipulates the agent into executing arbitrary shell commands, code, or database queries. Unlike traditional command injection (which exploits string concatenation), agent-based command injection exploits the agent's tool-calling ability — convincing it to use code execution, shell, or database tools with malicious parameters.
PII exposure occurs when personally identifiable information — Social Security numbers, credit card numbers, phone numbers, home addresses — appears in AI agent conversations. This creates compliance risks (GDPR, CCPA, HIPAA), liability exposure, and potential for identity theft if the data is logged, cached, or exfiltrated.
Rune scans every agent input and output for these threats in real-time. Add runtime security in under 5 minutes.
Start Free — 10K Events/Month