Checklist · March 2026 · 6 min read

From Prototype to Production:
An AI Agent Security Checklist

Your agent works in development. It passed your test suite. Now you need to ship it to production — where real users send unexpected inputs, retrieved documents contain hidden instructions, and tool calls chain in ways you never anticipated. Here are 8 things to check before you deploy.

Scan all inputs before they reach the LLM

Every user message, retrieved document, and API response entering your agent is a potential attack vector. Scan inputs for prompt injection, social engineering, and manipulation patterns before the LLM processes them.
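
To make the idea concrete, here is a minimal sketch of an input gate. It is not Rune's API: `scan_text`, `build_prompt`, and the pattern list are hypothetical names chosen for illustration, and a handful of regexes is far narrower than what a real scanner needs to catch. The point is the shape: every untrusted source goes through the same check before a prompt is assembled.

```python
import re
from dataclasses import dataclass

# Hypothetical, deliberately tiny pattern list (an assumption for illustration).
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.IGNORECASE),
    re.compile(r"disregard (the |your )?system prompt", re.IGNORECASE),
    re.compile(r"reveal (the |your )?(system prompt|hidden instructions)", re.IGNORECASE),
]


@dataclass
class ScanResult:
    blocked: bool
    matches: list[str]


def scan_text(text: str) -> ScanResult:
    """Flag text that matches any of the known injection phrases."""
    matches = [p.pattern for p in INJECTION_PATTERNS if p.search(text)]
    return ScanResult(blocked=bool(matches), matches=matches)


def build_prompt(user_message: str, retrieved_docs: list[str]) -> str:
    """Scan the user message and every retrieved document, then assemble the prompt."""
    for source in [user_message, *retrieved_docs]:
        result = scan_text(source)
        if result.blocked:
            raise ValueError(f"blocked input, matched: {result.matches}")
    return "\n\n".join([*retrieved_docs, user_message])
```

Note that retrieved documents are scanned exactly like the user message. Treating them as trusted is how hidden instructions get through.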

Shield.scan_input() docs

Scan all outputs before they reach the user

Your agent might leak API keys, credentials, PII, or internal data in its responses. Scan every output for sensitive information before it leaves the system — especially when the agent has access to databases or file systems.
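
A rough sketch of what an output check can look like, assuming a simple redact-before-send step. `redact_output` and the pattern table are hypothetical, not Rune's implementation, and two regexes are nowhere near full coverage of credentials or PII. The AWS pattern relies on the common `AKIA` prefix for access key IDs.

```python
import re

# Illustrative patterns only; real coverage needs many more.
SENSITIVE_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def redact_output(text: str) -> tuple[str, list[str]]:
    """Replace sensitive matches with a placeholder and report which kinds were found."""
    findings = []
    for label, pattern in SENSITIVE_PATTERNS.items():
        text, count = pattern.subn(f"[REDACTED:{label}]", text)
        if count:
            findings.append(label)
    return text, findings
```

Returning the list of findings alongside the cleaned text lets you log or alert on what was caught without ever storing the secret itself.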

Shield.scan_output() docs

Restrict tool access per agent

Apply the principle of least privilege to your agents. A customer support agent doesn't need write access to your production database. A research agent doesn't need the ability to send emails. Define which tools each agent can call.
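
In code, least privilege can be as small as an allowlist checked on every tool call. The sketch below is an assumption about how you might wire this yourself; the agent names, tool names, and `call_tool` helper are hypothetical and do not reflect Rune's policy format.

```python
from typing import Any, Callable

# Hypothetical agents and tools, used only for illustration.
TOOL_ALLOWLIST: dict[str, frozenset[str]] = {
    "support_agent": frozenset({"search_kb", "read_order"}),
    "research_agent": frozenset({"web_search", "read_file"}),
}


class ToolNotAllowed(PermissionError):
    """Raised when an agent requests a tool outside its allowlist."""


def call_tool(
    agent: str,
    tool: str,
    registry: dict[str, Callable[..., Any]],
    **kwargs: Any,
) -> Any:
    """Run a tool only if the agent is explicitly allowed to use it."""
    allowed = TOOL_ALLOWLIST.get(agent, frozenset())  # unknown agents get no tools
    if tool not in allowed:
        raise ToolNotAllowed(f"{agent} may not call {tool}")
    return registry[tool](**kwargs)
```

The default for an unknown agent is an empty set, so forgetting to register an agent fails closed rather than open.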

Policy configuration docs

Monitor tool call sequences, not just individual calls

Individual tool calls can look benign while the sequence constitutes an attack. Reading a config file is fine. Sending its contents to an external API is not. Monitor the chain of tool calls across each agent session.
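
The config-file example can be sketched as a per-session monitor that remembers earlier calls. This is a simplified illustration, not how Rune's behavioral analysis works: the tool names and the `SessionMonitor` class are hypothetical, and it flags only one pattern (a sensitive read followed later by an external send).

```python
from dataclasses import dataclass, field

# Hypothetical tool names, grouped by the role they play in a chain.
SENSITIVE_READS = {"read_file", "query_database"}
EXTERNAL_SINKS = {"http_post", "send_email"}


@dataclass
class SessionMonitor:
    """Tracks the tool calls made within one agent session."""

    history: list[str] = field(default_factory=list)

    def record(self, tool: str) -> bool:
        """Record a call; return True if it completes a read-then-send chain."""
        suspicious = tool in EXTERNAL_SINKS and any(
            earlier in SENSITIVE_READS for earlier in self.history
        )
        self.history.append(tool)
        return suspicious
```

Neither `read_file` nor `http_post` is flagged on its own. Only the combination within the same session is.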

Behavioral analysis

Run in monitor mode in staging first

Don't go straight to enforce mode in production. Run Rune in monitor mode in your staging environment first. Observe what it detects. Tune your policies and thresholds. Then flip to enforce mode in production with confidence.
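
The difference between the two modes comes down to one decision: log and allow, or log and block. Here is a minimal sketch of that switch, assuming a hypothetical `Mode` enum and `handle_detection` helper rather than Rune's actual configuration.

```python
import logging
from enum import Enum

logger = logging.getLogger("agent.security")


class Mode(Enum):
    MONITOR = "monitor"  # record detections, never block
    ENFORCE = "enforce"  # record detections and block


def handle_detection(mode: Mode, detection: str) -> bool:
    """Log the detection and return True if the request may proceed."""
    logger.warning("detection: %s (mode=%s)", detection, mode.value)
    return mode is Mode.MONITOR
```

Because both modes log the same event, what you observe in staging is exactly what enforce mode would have blocked.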

Getting started guide

Set up real-time alerts

When Rune blocks a threat or detects an anomaly, your team needs to know immediately. Route alerts to Slack, email, or webhooks. Set severity thresholds so critical alerts wake someone up and low-severity alerts get reviewed in the morning.
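
Severity thresholds boil down to a small routing function. The sketch below is illustrative only: the severity levels and destination names (`page_on_call`, `slack_channel`, `daily_digest`) are assumptions, not Rune's alert configuration.

```python
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def route_alert(severity: Severity, page_threshold: Severity = Severity.CRITICAL) -> str:
    """Pick a destination: page for urgent alerts, chat for the middle, digest for the rest."""
    if severity >= page_threshold:
        return "page_on_call"
    if severity >= Severity.MEDIUM:
        return "slack_channel"
    return "daily_digest"
```

Keeping the paging threshold as a parameter makes it easy to tighten or loosen during an incident without touching the routing logic.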

Alert configuration

Define policies as code

Write your security policies in YAML and check them into version control alongside your agent code. This makes policies reviewable, testable, and auditable. When the policy changes, the PR shows exactly what changed.
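
"Testable" can mean something as simple as a check that runs in CI on every policy change. The sketch below assumes the YAML has already been parsed into a dictionary (for example with a YAML parser such as PyYAML). The schema shown is invented for illustration and is not taken from Rune's policy reference; `policy_violations` and the `FORBIDDEN` table are hypothetical.

```python
# Hypothetical policy structure, as it might look after parsing a YAML file.
POLICY = {
    "agents": {
        "support_agent": {"mode": "enforce", "tools": ["search_kb", "read_order"]},
        "research_agent": {"mode": "monitor", "tools": ["web_search", "send_email"]},
    }
}

# Tools each agent must never be granted, whatever the policy file says.
FORBIDDEN = {
    "support_agent": {"write_database"},
    "research_agent": {"send_email"},
}


def policy_violations(policy: dict, forbidden: dict[str, set[str]]) -> list[str]:
    """List every forbidden tool that the policy grants to an agent."""
    violations = []
    for agent, config in policy["agents"].items():
        for tool in sorted(set(config["tools"]) & forbidden.get(agent, set())):
            violations.append(f"{agent} must not have {tool}")
    return violations
```

Run against the example policy, the check reports that `research_agent` has been granted `send_email`, which is the kind of regression a PR review can miss.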

Policy YAML reference

Review your agent's risk score weekly

Rune assigns a risk score to each agent based on threat frequency, severity, and tool access patterns. Review scores weekly to catch trends — an increasing score means your agent is seeing more threats or exhibiting riskier behavior.
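
If you export the weekly scores, spotting a sustained increase takes only a few lines. This sketch assumes you have one score per week in chronological order; `rising_trend` is a hypothetical helper, and the three-week window is an arbitrary choice rather than a recommendation from Rune.

```python
def rising_trend(weekly_scores: list[float], weeks: int = 3) -> bool:
    """Return True if the score rose in each of the last `weeks` week-over-week comparisons."""
    recent = weekly_scores[-(weeks + 1):]
    if len(recent) < weeks + 1:
        return False  # not enough history to judge
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))
```

Requiring several consecutive increases filters out a single noisy week, at the cost of reacting a little later to a real change.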

Dashboard overview

The pattern

Every item on this checklist follows the same principle: don't trust, verify. Don't trust user inputs — scan them. Don't trust agent outputs — scan them. Don't trust tool calls — intercept them. Don't trust that testing caught everything — monitor in production. The goal isn't to make your agent slower or more restrictive. It's to make it trustworthy.

Ship agents you can trust

Rune handles items 1-8 on this checklist with one integration. Three lines of code. Free plan includes 10K events/mo.

