AI SecurityJune 14, 2026 · 11 min read

RAG security: protecting retrieval-augmented generation systems

Retrieval-augmented generation gives models access to your data, and that data joins the attack surface. See how RAG gets attacked and how to secure it.

A security engineer reviewing a retrieval pipeline and vector database on a dashboard.

Written by

Alen Bosanac

Offensive Security

RAG security is the practice of protecting retrieval-augmented generation systems, the AI architecture that lets a model answer using your data by retrieving relevant documents and feeding them into the model's context. RAG is one of the most common ways to build a useful AI product, because it grounds the model in your knowledge base instead of relying only on what it learned during training. A support assistant that reads your help center, an internal tool that answers questions over your wiki, a copilot that cites your contracts: these are all RAG. The pattern is popular for good reason. It lets you ship current, proprietary, domain-specific answers without retraining a model, and it reduces the rate at which the model invents facts.

The moment a model retrieves and acts on your data, that data becomes part of the attack surface. A standalone model only knows its training data. A RAG system reads live content from a store you control, and the security of the whole system now depends on that content and the controls around retrieval. For a regulated company in finance, healthcare, or critical infrastructure, this matters in two ways. The knowledge base often holds sensitive or personal data, so a retrieval flaw can become a data breach under GDPR. And the retrieval pipeline is software that sits in front of that data, so it falls inside the scope of your security program under NIS2, DORA, and ISO/IEC 27001. RAG does not get a pass because it is AI.

What retrieval-augmented generation actually is

Retrieval-augmented generation is an architecture where, instead of relying only on the model's trained knowledge, the application fetches relevant information from a data source and gives it to the model as context for the answer. A typical pipeline ingests documents, splits them into chunks, converts each chunk into an embedding (a numeric vector that captures meaning), and stores those vectors in a vector database. At query time the system embeds the user's question, finds the chunks whose vectors are most similar, and inserts the top results into the prompt. The model then answers grounded in that retrieved text.

This design buys you accuracy and freshness without the cost of retraining, which is why RAG is the default pattern for enterprise AI. It also changes the security model in a way teams often miss. The model is now reading content from a store, and that content arrives in the same channel as the system instructions: the prompt. The model treats the retrieved passage and your carefully written system prompt as one stream of tokens. It has no reliable way to know that one came from you and the other came from a document an attacker may have written.

Why RAG expands the attack surface

A plain chatbot has one untrusted input: the user's message. A RAG system has at least three. It has the user's query, the retrieved documents, and the ingestion path that fills the knowledge base in the first place. Each of these is a place an attacker can push content toward the model. The retrieved document is the most dangerous, because users assume documents are passive data, when to the model they are just more text in the prompt.

RAG also inherits everything that was already wrong with the systems feeding it. If the ingestion job scrapes public web pages, an attacker controls part of your knowledge base. If it indexes a shared SharePoint or wiki, anyone who can write to that wiki can write to the model's context. If the retriever queries a database without applying the user's permissions, the model can read rows the user never could. The attack surface is the union of the model, the pipeline, and every source the pipeline trusts.

The main RAG risks

Most RAG risks come from one root cause: retrieved content is treated as input the model trusts. The OWASP Top 10 for Large Language Model Applications catalogs these under prompt injection, sensitive information disclosure, and data and model poisoning.¹ The concrete failure modes look like this.

Indirect prompt injection, where an attacker plants instructions inside a document, and the model follows them when that document is later retrieved, treating the hidden text as a command rather than data.
Knowledge base poisoning, where an attacker introduces malicious or misleading content into the data source so the model retrieves and repeats it as if it were authoritative.
Access-control bleed, where the retriever returns content the requesting user is not authorized to see because retrieval runs against a shared index without enforcing per-user permissions.
Sensitive data exposure, where the knowledge base contains secrets, credentials, or personal data and the model surfaces them in an answer, which can become a reportable breach under GDPR.
Over-permissioned retrieval, where the retrieval service authenticates as a high-privilege account and can read far more than any single user should, so any injection or bug exposes the whole store.
Context manipulation and similarity attacks, where an attacker crafts content designed to be retrieved for queries it should not match, steering the model toward a chosen output.

Indirect prompt injection is the risk that surprises teams most, because it does not require access to your servers. An attacker only needs to get text into something you will later index. A poisoned support ticket, a comment on a public page, a PDF attached to an email. We cover the mechanism in depth in prompt injection is not a prompt problem, and the data side in AI data and model poisoning.

In a RAG system, your knowledge base is part of your prompt. Anything an attacker can write into it, the model may later read as an instruction.

Risks and defenses at a glance

No single control fixes RAG. Each risk needs a specific defense placed at the right layer of the pipeline. The table maps the main risks to the control that contains them.

Risk	Primary defense
Indirect prompt injection via retrieved documents	Treat retrieved content as untrusted, sanitize on ingest, and constrain what the model is allowed to act on
Knowledge base poisoning	Validate and govern the ingestion path, restrict who can write to indexed sources, track provenance
Access-control bleed between users or tenants	Enforce per-user access control at retrieval time, filter the index by the caller's identity
Sensitive data exposure in answers	Keep secrets and personal data out of the index unless strictly necessary, apply output filtering
Over-permissioned retrieval	Give the retriever least privilege, scope queries to the user rather than a service account
Context manipulation and similarity attacks	Monitor retrieval, rank and re-rank with guardrails, log what was retrieved for each answer

RAG risks and the control that addresses each one.

How to secure a RAG pipeline

Securing RAG means controlling what goes into the knowledge base, who can retrieve what, and how retrieved content is treated once it reaches the model. The work splits into a clear sequence. Treat it as architecture, not as prompt wording, because no system prompt survives contact with a determined injection.

01
Sanitize and govern ingestion
Validate every document before it enters the index. Strip or neutralize content that looks like instructions, control characters, or hidden text in markup and metadata. Restrict who and what can write to indexed sources, and never index untrusted web content into a store that high-privilege users query.
02
Enforce per-user access control at retrieval
Run retrieval as the user, not as a shared service account. Filter the vector index by the caller's identity, tenant, and permissions so the retriever can only return chunks that user is entitled to see. This is the single most commonly missing control.
03
Apply least privilege to the retriever
Scope the retrieval service's own access to the minimum it needs. If it reads from a database, it should read only what the application requires, so a bug or injection cannot expose the entire store.
04
Handle output before it reaches the user or other systems
Filter generated answers for secrets, personal data, and leaked instructions. If the answer feeds another system or an agent action, validate it there too, because the model output is now untrusted input downstream.
05
Track provenance
Record which documents produced each answer, and surface citations. Provenance lets you trace a bad answer back to a poisoned source, audit what the model saw, and prove control to an assessor under ISO/IEC 27001 or SOC 2.
06
Monitor retrieval and generation
Log queries, retrieved chunks, and outputs. Watch for anomalous retrieval patterns and for answers that quote content a user should not have reached. Monitoring is how you detect poisoning and access-control bleed that slipped past prevention.

Access control at retrieval time is the control most often missing in the systems we test. Many RAG products retrieve from one shared index without applying per-user permissions, so the model can surface content the requesting user was never entitled to see. We test for this and the other RAG risks through AI penetration testing and AI red teaming, and we set the controls in context in securing LLM applications.

Classic application security still applies

RAG is a software system. It has APIs, a database, authentication, authorization, dependencies, and infrastructure. Every classic flaw still bites. A vector database left open to the internet leaks the whole index. A retrieval API with broken object-level authorization lets one user query another tenant's documents, which is access-control bleed by a different name. Injection, secrets in code, weak authentication, and unpatched dependencies all apply unchanged.

The mistake is to focus only on the novel AI risks and skip the basics. The OWASP Top 10 for Large Language Model Applications sits on top of ordinary application security, it does not replace it. The fastest path into a RAG system is often the boring one: an exposed admin endpoint or an over-permissioned service account, not a clever prompt. Treat the pipeline as an application first, then layer the AI-specific controls on top.

How to test a RAG system

You cannot prove a RAG system is safe by reading the prompt. You test it by attacking it the way an adversary would, across all three input paths. Testing starts with indirect prompt injection: plant marked instructions in a document the system will index, then ask a normal question and check whether the model obeys the hidden text. If it does, the architecture, not the wording, needs to change.

The other essential test is access control. Authenticate as a low-privilege user and try to retrieve content belonging to another user or tenant, probing the retriever directly and through crafted queries. Then test ingestion: can an attacker who can write to an indexed source poison answers, and is that content quarantined or trusted on arrival. Finally, test the plumbing with a standard application assessment of the APIs, the vector store, and the infrastructure. This combination of AI-specific and classic testing is what AI red teaming delivers, and it aligns with the risk framing in the NIST AI Risk Management Framework² and the layered controls in the ENISA AI cybersecurity framework.³

RAG, agents, and compliance

RAG rarely exists alone. It is often part of an agentic system that can also take actions, which compounds the risk. A poisoned document can carry an instruction that an agent then executes through its tools, turning a text injection into a real-world action. That is why RAG security, agent security, and prompt injection are best understood together. We cover the action side in securing AI agents and the broader model risks in the OWASP LLM Top 10.

There is also a regulatory dimension. If your RAG system makes or supports consequential decisions, the EU AI Act may classify it as high-risk and require risk management, data governance, logging, and human oversight, all of which map cleanly onto the controls above. We explain the obligations in our EU AI Act compliance guide. Provenance, monitoring, and access control are not just security hygiene, they are evidence you will need for an assessor.

How Raptoric helps

Raptoric is an independent, vendor-neutral firm, so our advice is not tied to any AI platform or product. We assess and test RAG systems end to end: indirect prompt injection through retrieved documents, knowledge base poisoning, access-control bleed at retrieval, sensitive data exposure, and the classic application flaws in the pipeline around them. If you are building RAG into a product, see our AI security service and book a scoping call to discuss testing and securing it before it reaches production.

Frequently asked questions

What is RAG security?

RAG security is the protection of retrieval-augmented generation systems, which let a model answer using retrieved data. It addresses risks such as indirect prompt injection through retrieved documents, knowledge base poisoning, access-control bleed between users, exposure of sensitive content, and over-permissioned retrieval, alongside the classic application security of the pipeline itself.

Why does RAG expand the attack surface?

A plain chatbot has one untrusted input, the user's message. A RAG system adds two more: the documents it retrieves and the ingestion path that fills the knowledge base. Retrieved content reaches the model in the same channel as your instructions, and the model cannot reliably tell which came from you and which from an attacker.

What is the biggest RAG security risk?

Two stand out. Indirect prompt injection, where instructions hidden in a retrieved document steer the model, and missing access control at retrieval time, where the system returns content the user should not see. Both come from treating retrieved content and a shared index as trusted when they are not.

How do you prevent prompt injection in RAG?

You cannot make the model immune, so you contain the impact. Sanitize content on ingest, treat retrieved text as untrusted, enforce retrieval-time access control, and constrain what the model is allowed to act on. The defenses live in the architecture and the pipeline, not in the wording of the system prompt.

Can a RAG system leak data between users?

Yes, if retrieval does not enforce per-user permissions. A common flaw is querying a shared index, or running retrieval as a high-privilege service account, without respecting tenancy and access rights. The model then surfaces content the requesting user was never authorized to see. Per-user access control at retrieval prevents this.

How do you test a RAG system?

You attack all three input paths. Plant marked instructions in an indexed document to test indirect prompt injection, attempt cross-user and cross-tenant retrieval to test access control, and try to poison answers through the ingestion path. Then run a classic application assessment of the APIs, vector store, and infrastructure around the model.

Sources

1OWASP. OWASP Top 10 for Large Language Model Applications. Open Worldwide Application Security Project, 2025. Link
2NIST. Artificial Intelligence Risk Management Framework (AI RMF 1.0). National Institute of Standards and Technology, 2023. Link
3ENISA. Multilayer Framework for Good Cybersecurity Practices for AI. European Union Agency for Cybersecurity, 2023. Link

Related service

AI Security

→

Want this tested on your own systems?

Our team will scope it with you on a 30-minute call.

Book a scoping call

Keep reading

All insights →

01AI Security

AI penetration testing: how to test LLM apps, agents, and RAG

Read →8 min read

02AI Security

AI red teaming: a practical guide for security teams

Read →7 min read

03AI Security

Securing AI agents: the new attack surface of agentic AI

Read →6 min read