AI Agent Security for Customer Support
Customer support agents are the highest-risk AI deployment in most organizations. They interact directly with untrusted users, have access to customer databases and order management systems, and handle sensitive personal information in every conversation. A single successful prompt injection can turn a helpful support bot into a data exfiltration tool — leaking customer records, issuing unauthorized refunds, or revealing internal processes. The attack surface is amplified because support agents typically have broad tool access: CRM lookups, order modifications, payment processing, and ticket escalation. Rune provides runtime protection specifically designed for customer-facing agents, enforcing data handling policies, preventing unauthorized actions, and ensuring every interaction meets compliance requirements.
Key Security Risks
Support agents access customer records containing names, emails, addresses, payment details, and account histories. Prompt injection attacks can trick the agent into including other customers' data in responses, dumping database contents, or formatting PII in ways that bypass downstream filters.
Support agents with tool access can issue refunds, modify orders, update account settings, and escalate tickets. Injection attacks can manipulate the agent into executing these actions outside of normal authorization flows, bypassing approval thresholds and audit trails.
Support agents are often given detailed system prompts containing escalation procedures, discount policies, refund thresholds, and internal workflows. Prompt injection can extract these instructions, giving attackers a roadmap for social engineering human agents or exploiting policy gaps.
In multi-turn support conversations, earlier messages form the context for later responses. An attacker can inject instructions early in a conversation that activate later when the agent accesses specific tools or data, creating time-delayed attacks that are harder to detect.
How Rune Helps
PII Detection and Redaction
Rune scans every outbound response for PII patterns — credit card numbers, SSNs, email addresses, phone numbers, and physical addresses. Detected PII is either redacted automatically or blocked entirely depending on your policy, preventing accidental data exposure even if the LLM is manipulated.
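Pattern-based redaction can be sketched in a few lines. This is a simplified illustration of the redact-on-match idea only, with hypothetical regexes; Rune's actual detectors and entity list are configured by policy and are more robust than these patterns.

```python
import re

# Illustrative PII patterns -- deliberately simple, not production-grade.
PII_PATTERNS = {
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace each detected PII entity with a typed placeholder."""
    for entity, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{entity.upper()}_REDACTED]", text)
    return text
```

Because redaction runs on the outbound text itself, it catches leaks regardless of how the LLM was manipulated into producing them.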
Tool Call Authorization
Rune validates every tool call against your authorization policy before execution. Refunds above a threshold require human approval, account modifications are restricted to the authenticated user's record, and database queries are scoped to prevent bulk data access. All of these rules are enforced at the SDK level, before any tool executes.
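The decision logic behind such a policy can be sketched as a default-deny gate in front of each tool call. The `authorize` helper below is a hypothetical illustration, not the Rune SDK API; the tool names and the $100 threshold mirror the example policy later in this page.

```python
from dataclasses import dataclass, field

@dataclass
class ToolCall:
    tool_name: str
    args: dict = field(default_factory=dict)

MAX_AUTO_REFUND = 100  # refunds above this need human approval

def authorize(call: ToolCall, authenticated_user: str) -> str:
    """Return 'allow', 'needs_approval', or 'block' for a tool call."""
    if call.tool_name == "process_refund":
        if call.args.get("amount", 0) > MAX_AUTO_REFUND:
            return "needs_approval"  # route to a human agent
        return "allow"
    if call.tool_name == "customer_lookup":
        # Scope lookups to the authenticated user's own record
        if call.args.get("customer_id") != authenticated_user:
            return "block"
        return "allow"
    return "block"  # default-deny any tool the policy doesn't cover
```

The default-deny fallthrough matters: a tool the policy never mentions should be blocked, not silently allowed.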
Conversation-Aware Scanning
Unlike stateless text classifiers, Rune tracks conversation context across turns. It detects multi-turn injection strategies where the attack payload is split across messages, and identifies attempts to manipulate conversation history to plant delayed-activation instructions.
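The core idea of cross-turn tracking is accumulating weak signals instead of scoring each message in isolation. The sketch below is a toy illustration of that principle, with made-up phrase heuristics; it does not reflect Rune's internal detection models.

```python
class ConversationMonitor:
    """Accumulate injection signals across turns so that a payload
    split over several messages still trips the threshold."""

    SUSPICIOUS = (
        "ignore previous",
        "when the user asks",
        "from now on",
        "reveal your instructions",
    )

    def __init__(self, threshold: int = 2):
        self.signals = 0
        self.threshold = threshold

    def observe(self, message: str) -> bool:
        """Return True once cumulative signals cross the threshold,
        even if no single message is conclusive on its own."""
        lowered = message.lower()
        self.signals += sum(1 for p in self.SUSPICIOUS if p in lowered)
        return self.signals >= self.threshold
```

A stateless classifier would score each of those messages individually and could pass both; the conversation-level view is what exposes the split payload.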
Real-Time Compliance Dashboard
Every interaction, tool call, and policy decision is logged to the Rune dashboard with full conversation context. Compliance teams can audit support agent behavior, review blocked actions, and generate reports for SOC 2 and GDPR compliance audits.
Example Security Policy
version: "1.0"
rules:
  - name: block-pii-in-responses
    scanner: pii
    action: redact
    severity: critical
    scope: output
    config:
      entities:
        - credit_card
        - ssn
        - email
        - phone
        - address
    description: "Redact all PII from agent responses to customers"
  - name: limit-refund-actions
    scanner: tool_call
    action: block
    severity: critical
    config:
      tool_name: process_refund
      max_amount: 100
      require_approval_above: 100
    description: "Block refunds over $100 without human approval"
  - name: prevent-bulk-data-access
    scanner: tool_call
    action: block
    severity: high
    config:
      tool_name: customer_lookup
      max_results: 1
      require_authenticated_scope: true
    description: "Restrict customer lookups to the current user only"
  - name: block-system-prompt-leakage
    scanner: prompt_injection
    action: block
    severity: high
    scope: input
    description: "Block attempts to extract system prompt or internal logic"

Policies are defined in YAML and enforced at the SDK level. Version control them alongside your agent code.
Quick Start
from rune import Shield

# Initialize Rune with support-specific policy
shield = Shield(
    api_key="rune_live_xxx",
    agent_id="support-bot-v2",
    policy_path="support-policy.yaml"
)

# Scan incoming customer message
user_message = "Can you look up order #45231 and process a refund?"
input_result = shield.scan_input(
    content=user_message,
    context={
        "channel": "live_chat",
        "authenticated_user": "cust_8832",
        "session_id": "sess_abc123"
    }
)

if input_result.blocked:
    respond("I'm unable to process that request. Let me connect you with a team member.")
else:
    # Agent generates response with tool calls
    response = agent.run(user_message)

    # Scan output before sending to customer
    output_result = shield.scan_output(
        content=response.text,
        tool_calls=response.tool_calls,
        context={
            "authenticated_user": "cust_8832",
            "session_id": "sess_abc123"
        }
    )

    if output_result.has_redactions:
        respond(output_result.redacted_content)
    elif output_result.blocked:
        escalate_to_human(session_id="sess_abc123")
    else:
        respond(response.text)

The Shield instance scans both the customer's input and the agent's output. Input scanning catches prompt injection attempts before they reach the LLM. Output scanning detects PII leakage and validates tool calls against authorization policies. The context object ties scans to the authenticated user, ensuring tool calls are scoped correctly. Blocked responses automatically escalate to a human agent rather than failing silently.
Related Solutions
Financial Services
Secure AI agents handling financial data, transactions, and advisory services. SOC 2, PCI DSS, and regulatory compliance for AI-powered financial applications.
Healthcare AI Agents
HIPAA-compliant AI agent security for healthcare applications. Protect PHI, enforce clinical data access controls, and maintain audit trails for AI agents in healthcare environments.
RAG Pipelines
Protect RAG pipelines from document poisoning, retrieval manipulation, and indirect prompt injection. Runtime security for LangChain, LlamaIndex, and custom retrieval-augmented generation systems.
Secure your customer support today
Add runtime security in under 5 minutes. Free tier includes 10,000 events per month.