AI SecurityMay 28, 2026 · 7 min read
Prompt injection is not a prompt problem
Teams keep trying to patch injection with better system prompts. The fix lives in the architecture, not the wording.
Written by
R
Raptoric AI Security
Share
LinkedInX / TwitterCopy link

Every few weeks a team shows us a new system prompt. It is longer than the last one, full of capital letters and the word NEVER. They are sure this version finally stops the model from leaking data or calling the wrong tool. It does not, and it never will, because they are solving the wrong problem.

Why wording cannot win

A language model treats every token it reads as input. It does not have a separate, privileged channel for instructions and a lower-class channel for data. When your application pastes a web page, a support ticket, or a PDF into the context, the model reads the attacker’s text with exactly the same trust it gives your own rules.

That means an instruction buried in retrieved content can override the one you wrote, no matter how forcefully you wrote it. You are not in an argument the model can referee. You handed both sides the same microphone.

If untrusted text and trusted instructions share one context, you have already lost. The only question is how much.

Where the real controls live

The durable fixes are structural, and they sit outside the prompt:

  • Treat every tool the model can call as an attack surface. Scope each one to the minimum it needs, and require confirmation for anything that moves money, data, or state.
  • Put a hard boundary between retrieved content and instructions. Tag untrusted text, and never let it expand the model’s permissions.
  • Validate outputs the way you validate any other untrusted input — before they reach a database, a shell, or another service.
  • Log the full chain: what was retrieved, what the model decided, what it called. You cannot investigate what you did not record.

How we test it

When we red-team an AI system, we do not grade the system prompt. We map the trust boundaries, then attack across them: indirect injection through retrieved documents, tool-call hijacking, and data exfiltration through the model’s own outputs. The findings we hand back are architectural, because that is where the fix has to happen.

Better wording buys you a day. Better structure buys you the year.

Want this tested on your own systems?
A senior engineer will scope it with you on a 30-minute call.
Book a scoping call
Stay current
Subscribe to the Raptoric briefing.
Monthly intelligence digest. Disclosure highlights, threat-actor activity, and engagement field notes from our practitioners.
name@company.com
Subscribe
Issued monthly · unsubscribe anytime · PGP available
RRaptoric
A technical cybersecurity services firm. Engineering-grade rigor across five practice lines. Engaged by 140+ organizations in financial services, healthcare, technology, and government.
L
X
G
Y
Services
Offensive SecurityApplication & CloudDetection & ResponseProgram & RiskAI SecurityView all services →
Industries
Financial ServicesHealthcareTechnology & SaaSGovernment & DefenseAI PlatformsCritical Infrastructure
Research
2026 Adversary ReportDisclosures & CVEsThreat IntelligenceEngineering Blog
Company
AboutCareersNewsroomContactResponsible AI
Engage
Book a scoping callPGP keyshello@raptoric.com
SOC 2 Type II
ISO 27001:2022
CREST
CHECK
PCI QSA
NIST 800-171
Audited annually · references on request
© 2026 Raptoric Security, Inc. · All rights reserved · Delaware C-Corp
PrivacyTermsResponsible disclosureModern slavery statementTrust center