Threat Database

AI Agent Security Threats

AI agents face a new class of security threats. Prompt injection, data exfiltration, privilege escalation — each exploits the unique way agents process language and interact with tools. This database covers the most common threats, how they work, and how to detect them.

criticalInjection

Prompt Injection

Prompt injection is the most common attack against AI agents. An attacker crafts input that overrides the agent's system instructions, causing it to ignore safety guidelines, leak confidential data, or perform unauthorized actions. Unlike traditional injection attacks (SQL, XSS), prompt injection exploits the fundamental way LLMs process natural language — there's no clean boundary between instructions and data.

Found in 14% of AI agent sessionsLearn more

criticalData Exfiltration

Data Exfiltration

Data exfiltration occurs when an AI agent is manipulated into sending sensitive data to an attacker-controlled destination. Agents with tool access — file systems, APIs, databases — can be tricked into reading sensitive data and encoding it in outbound requests, tool parameters, or even seemingly innocent responses.

Attempted in 9% of monitored sessionsLearn more

highInjection

System Prompt Extraction

System prompt extraction is a targeted form of prompt injection where the attacker's goal is to reveal the agent's hidden instructions. System prompts often contain business logic, guardrail configurations, API endpoint details, and persona instructions that give attackers a roadmap for further attacks.

Attempted in 11% of monitored sessionsLearn more

criticalData Exfiltration

Secret Exposure

Secret exposure happens when API keys, passwords, tokens, or private keys appear in agent inputs or outputs. This can occur accidentally — a user pastes code containing credentials — or through deliberate extraction attacks. Once exposed in an LLM conversation, secrets may be logged, cached, or sent to third-party services.

Detected in 6% of agent outputsLearn more

highPrivilege Escalation

Privilege Escalation

Privilege escalation occurs when an AI agent performs actions beyond its intended scope — accessing restricted tools, modifying data it should only read, or executing admin-level operations. This usually results from overly permissive tool configurations, missing authorization checks, or successful prompt injection that overrides the agent's behavioral constraints.

Attempted in 5% of monitored sessionsLearn more

criticalInjection

Command Injection

Command injection against AI agents occurs when an attacker manipulates the agent into executing arbitrary shell commands, code, or database queries. Unlike traditional command injection (which exploits string concatenation), agent-based command injection exploits the agent's tool-calling ability — convincing it to use code execution, shell, or database tools with malicious parameters.

Found in 3% of tool-enabled agent sessionsLearn more

highData Exfiltration

PII Exposure

PII exposure occurs when personally identifiable information — Social Security numbers, credit card numbers, phone numbers, home addresses — appears in AI agent conversations. This creates compliance risks (GDPR, CCPA, HIPAA), liability exposure, and potential for identity theft if the data is logged, cached, or exfiltrated.

Detected in 8% of scanned conversationsLearn more

Detect these threats in your agents

Rune scans every agent input and output for these threats in real-time. Add runtime security in under 5 minutes.

Start Free — 10K Events/Month