AI SecurityJune 14, 2026 · 11 min read

RAG security: protecting retrieval-augmented generation systems

Retrieval-augmented generation gives models access to your data, and that data becomes part of the attack surface. This is how RAG systems get attacked and how to secure them.
Written by
R
Raptoric AI Security
Share
LinkedInX / TwitterCopy link

RAG security is the practice of protecting retrieval-augmented generation systems, the AI architecture that lets a model answer using your data by retrieving relevant documents and feeding them into the model's context. RAG is one of the most common ways to build useful AI products, because it grounds the model in your knowledge base instead of relying only on what it learned in training. But the moment a model retrieves and acts on your data, that data becomes part of the attack surface, and the retrieval pipeline introduces risks that a standalone model does not have.

The core security problem with RAG is that retrieved content reaches the model as input, and the model does not reliably distinguish data from instructions. If an attacker can get content into the knowledge base, or influence what gets retrieved, they can influence what the model does. Combined with the access RAG systems often have to sensitive internal data, this makes RAG a priority for AI security. This article explains how RAG systems get attacked and how to secure them, drawing on our AI security service.

What is retrieval-augmented generation?

Retrieval-augmented generation is an architecture where, instead of relying solely on the model's trained knowledge, the application retrieves relevant information from a data source and provides it to the model as context for its answer. A typical pipeline converts documents into embeddings stored in a vector database, finds the most relevant chunks for a given query, and inserts them into the prompt. The model then answers grounded in that retrieved content.

RAG is popular because it lets AI products use current, proprietary, or domain-specific information without retraining the model, and it reduces, though does not eliminate, the model inventing answers. But it also means the model is now reading content from a data source, and the security of the system depends on the trustworthiness of that content and the controls around retrieval.

How RAG systems get attacked

RAG introduces several distinct risks, most of which stem from the fact that retrieved content is treated as input the model trusts.

  • Indirect prompt injection, where an attacker plants instructions in a document that later gets retrieved, and the model follows them as if they were legitimate.
  • Knowledge base poisoning, where an attacker introduces malicious or misleading content into the data source so the model retrieves and repeats it.
  • Data leakage across users or tenants, where retrieval returns content the requesting user should not be allowed to see.
  • Sensitive data exposure, where the knowledge base contains secrets or personal data that the model surfaces in its answers.
  • Context manipulation, where an attacker influences which documents are retrieved to steer the model toward a desired output.
  • Embedding and similarity attacks, where crafted content is designed to be retrieved for queries it should not match.
In a RAG system, your knowledge base is part of your prompt. Anything an attacker can write into it, the model may later read as an instruction.

How to secure a RAG system

Securing RAG means controlling what goes into the knowledge base, who can retrieve what, and how retrieved content is treated. The key controls include the following.

  • Enforce access control at retrieval time, so the system only returns content the requesting user is authorized to see, respecting tenancy and permissions.
  • Treat retrieved content as untrusted input, and design so that instructions found in documents are never executed as commands.
  • Control what enters the knowledge base, validating and sanitizing ingested content to limit poisoning.
  • Separate data from instructions in the prompt as clearly as the architecture allows, and constrain what the model is permitted to act on.
  • Keep secrets and sensitive data out of the knowledge base unless access is strictly controlled and necessary.
  • Log retrieval and generation, so leakage and manipulation can be detected and investigated.

Access control at retrieval time is the control most often missing. Many RAG systems retrieve from a shared index without enforcing per-user permissions, so the model can surface content the user was never entitled to. We test for this and the other RAG risks through AI penetration testing and AI red teaming.

RAG, agents, and the wider AI attack surface

RAG rarely exists in isolation. It is often part of an agentic system that can also take actions, which compounds the risk: a poisoned document could carry an instruction that an agent then acts on through its tools. That is why RAG security, agent security, and prompt injection are best understood together. We cover the action side in securing AI agents and the manipulation side in prompt injection is not a prompt problem.

Frequently asked questions

What is RAG security?

RAG security is the protection of retrieval-augmented generation systems, which let a model answer using retrieved data. It addresses risks such as indirect prompt injection through retrieved documents, knowledge base poisoning, data leakage across users, and exposure of sensitive content.

What is the biggest RAG security risk?

Two stand out: indirect prompt injection, where instructions hidden in retrieved documents steer the model, and missing access control at retrieval time, where the system returns content the user should not see. Both stem from treating retrieved content as trusted.

How do you prevent prompt injection in RAG?

You cannot make the model immune, so you contain the impact: treat retrieved content as untrusted, control what enters the knowledge base, enforce retrieval-time access control, and constrain what the model is allowed to act on. Defenses sit in the architecture, not the prompt.

Can a RAG system leak data between users?

Yes, if retrieval does not enforce per-user permissions. A common flaw is retrieving from a shared index without respecting tenancy or access rights, so the model surfaces content the requesting user was not authorized to see. Retrieval-time access control prevents this.

RAG makes AI genuinely useful by grounding it in your data, and in doing so it makes your data part of the attack surface. If you are building RAG into your products, see our AI security service and book a scoping call to discuss testing and securing it.

Want this tested on your own systems?
Our team will scope it with you on a 30-minute call.
Book a scoping call