AI Agent Security for RAG Pipelines
Retrieval-Augmented Generation pipelines pull documents from vector databases and inject them directly into LLM context windows. This creates a unique attack surface: every document in your corpus becomes a potential injection vector. Unlike direct user input, RAG-sourced injections bypass traditional input validation because the content arrives through a trusted retrieval path. Attackers can poison knowledge bases, manipulate retrieval rankings to surface malicious content, or embed hidden instructions in documents that override system prompts when retrieved. Rune scans every retrieved document before it reaches the LLM, detecting injections buried in PDFs, web scrapes, and internal knowledge bases — regardless of how they entered your corpus.
Key Security Risks
Malicious instructions embedded within documents stored in your vector database. When a user query triggers retrieval of a poisoned document, the injected instructions are treated as trusted context by the LLM, allowing attackers to override system prompts, exfiltrate data, or redirect agent behavior.
Attackers craft queries designed to manipulate which documents are retrieved from the vector store. By understanding how embedding similarity works, they can engineer inputs that consistently surface specific documents — including poisoned ones — while bypassing relevant results.
Deliberately large or numerous retrieved documents fill the LLM's context window, pushing system prompts and safety instructions out of the effective attention range. The LLM then operates without its guardrails, becoming susceptible to instructions contained in the retrieved content.
In multi-tenant RAG systems, insufficient namespace isolation in the vector database allows queries from one tenant to retrieve documents belonging to another. Combined with prompt injection, this can be weaponized to systematically extract another tenant's knowledge base.
How Rune Helps
Retrieved Document Scanning
Rune's middleware intercepts documents after retrieval but before they enter the LLM context. Each document is scanned for prompt injection patterns, hidden instructions, and anomalous content using multi-layer detection — pattern matching, semantic analysis, and optional LLM-based classification.
Chunk-Level Granularity
Rather than scanning entire documents as blobs, Rune analyzes individual chunks as they flow through the retrieval pipeline. This catches injections that only appear in specific sections and provides precise alerting — you know exactly which chunk in which document triggered the detection.
Output Validation
Even if a poisoned document reaches the LLM, Rune scans the final output before it reaches the user. Responses that contain exfiltration attempts (encoded data, suspicious URLs), policy violations, or hallucinated instructions are blocked or flagged in real-time.
Framework-Native Integration
Rune integrates directly with LangChain's callback system and LlamaIndex's instrumentation module. You add one middleware wrapper and every retrieval, LLM call, and tool invocation is automatically scanned — no custom plumbing required.
Example Security Policy
version: "1.0"
rules:
- name: scan-retrieved-documents
scanner: prompt_injection
action: block
severity: critical
scope: retrieval
description: "Block retrieved documents containing injection attempts"
- name: detect-context-overflow
scanner: context_abuse
action: alert
severity: high
config:
max_retrieved_tokens: 12000
description: "Alert when retrieved content exceeds safe context ratio"
- name: block-data-exfiltration
scanner: data_exfiltration
action: block
severity: critical
config:
patterns:
- urls
- encoded_data
- markdown_images
description: "Prevent data exfiltration through LLM responses"
- name: pii-in-retrieval
scanner: pii
action: redact
severity: high
scope: output
description: "Redact PII surfaced from retrieved documents"Policies are defined in YAML and enforced at the SDK level. Version control them alongside your agent code.
Quick Start
from rune import Shield
from rune.integrations.langchain import RuneCallbackHandler
from langchain_openai import ChatOpenAI
from langchain.chains import RetrievalQA
from langchain_community.vectorstores import Chroma
# Initialize Rune Shield
shield = Shield(
api_key="rune_live_xxx",
agent_id="rag-pipeline",
policy_path="policy.yaml"
)
# Create callback handler for LangChain
rune_handler = RuneCallbackHandler(shield)
# Your existing RAG setup — Rune wraps it transparently
llm = ChatOpenAI(model="gpt-4o", callbacks=[rune_handler])
vectorstore = Chroma(persist_directory="./chroma_db", embedding_function=embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 5})
chain = RetrievalQA.from_chain_type(
llm=llm,
retriever=retriever,
callbacks=[rune_handler], # Scans retrieved docs + LLM output
)
# Every query now has runtime protection
result = chain.invoke({"query": "What is our refund policy?"})
# Retrieved docs scanned for injections before reaching GPT-4o
# Final response scanned for data exfiltration and policy violationsThe RuneCallbackHandler hooks into LangChain's callback system to intercept every stage of the RAG pipeline. Retrieved documents are scanned for prompt injection before they enter the LLM context. The LLM's response is scanned for data exfiltration attempts and policy violations before reaching the user. All events are logged to the Rune dashboard for monitoring and forensic analysis.
Related Solutions
Customer Support
Secure AI-powered customer support agents against prompt injection, PII leakage, and unauthorized actions. Enforce compliance for support bots handling sensitive customer data.
Autonomous Multi-Step Agents
Secure autonomous AI agents executing multi-step workflows. Prevent cascading attacks, runaway execution, and unauthorized actions in agent loops, CrewAI, and AutoGPT-style systems.
Data Analysis Agents
Protect data analysis agents from SQL injection, unauthorized data access, and exfiltration. Runtime security for AI agents with database access and analytical tool use.
Secure your rag pipelines today
Add runtime security in under 5 minutes. Free tier includes 10,000 events per month.