Authors:
Large Language Models (LLMs) have reached production, and their security implications demand immediate operational focus. Organizations are rapidly integrating these systems into customer service, code generation, and decision support. Yet LLMs introduce attack vectors that are fundamentally different from those of traditional applications. The OWASP Top 10 for LLM Applications 2025 marks a clear turning point, and the newly published OWASP AI Testing Guide operationalizes this shift: AI trustworthiness, not security alone, is now the objective.
A useful analogy is the fictional "Order 66" scenario: trusted systems turned against their users by a single command. LLMs dramatically lower the barrier to such abuse, no cryptographic keys are required, only the right words. Field experience, however, consistently shows that these novel threats tend to layer on top of traditional weaknesses, making a "back to basics" security posture just as critical as advanced AI-specific defenses.
Unlike conventional software vulnerabilities rooted in memory safety or input validation, LLM weaknesses arise from how models process and interpret language. OWASP identifies prompt injection as the most critical risk, driven by the absence of a strict separation between instructions and data.
Recent studies report 41%-56%1 vulnerability rates to injection attacks across state-of-the-art models, underscoring how systemic this issue remains.
The threat landscape expands further with the rise of Agentic AI. The 2025 OWASP list introduces risks such as excessive agency, where autonomous LLMs are authorized to invoke tools, call APIs, or interact with other systems. Research shows that 82,4%2 of AI agents execute malicious commands when requested by another agent, even when mediated by protocols such as MCP or agent-to-agent (A2A) controls.
When combined with sensitive information disclosure, including leaked prompts, proprietary data, or configuration secrets, the risk profile becomes both novel and severe. A misconfigured LLM API can result in large-scale data exposure, mirroring classic configuration failures but at far greater speed and with significantly reduced detectability.
Two additional threat vectors deserve immediate attention.
We have observed incidents involving malicious weights injected into public model repositories, compromised development pipelines and adversarial examples embedded directly into training datasets. These threats are particularly insidious because they persist silently and bypass application-layer controls. Organizations must therefore treat LLM procurement and deployment with the same rigor applied to software bill of materials (SBOM), and third-party code review.
OWASP's four-layer testing framework provides a practical structure for addressing these risks:
Figure 1: From hardware to AI services: A full value chain
Critically, these layers do not operate in isolation. A vulnerability in one layer can amplify risks across others. For example, excessive agency at the application layer, combined with insufficient model safety guardrails and over permissive infrastructure access, can create a cascading failure with systemic impact.
Effective testing must therefore validate interactions between layers, how data, authority, and control flow from infrastructure through model inference to application logic, in order to uncover cross-boundary risks that would remain invisible in siloed assessments.
While the OWASP Testing Guide provides concrete test cases for each layer, practitioners must adapt them to their specific organizational context. A financial institution's risk profile differs fundamentally from that of a healthcare provider, just as the threat model for an autonomous agent differs from that of a customer-facing chatbot. The framework should be tailored to the deployment model, data sensitivity, regulatory obligations, and the threat actors most likely to target the organization’s assets.
Securing LLM applications requires embedding security across the entire lifecycle, shifting from a reactive, "bolt-on" model to a secure-by-design philosophy.
Effective defense depends on layered countermeasures. Technical controls begin with secure system prompt design, explicitly instructing models to reject override and role-confusion attempts. This should be reinforced with input and output filtering to detect injection patterns before they reach production. Architectural controls focus on sandboxing and least privilege. Field experience consistently shows that AI security fails when fundamentals are ignored: rigorous Identity and Access Management (IAM) and cloud configuration remain are non-negotiable prerequisites.
To build trust, organizations should take the following actions:
Beyond these foundational controls, implement model-specific safeguards. Enforce rate limiting to constrain prompt-injection attempts and resource-exhaustion attacks. Deploy output validation to scan responses for indicators of compromise, leaked credentials, system prompts, or unauthorized data references. Use semantic validation, leveraging secondary models or rule engines, to detect subtle jailbreaks and policy evasions.
At the infrastructure layer, enforce least-privilege access using cloud-native controls: tightly scoped IAM roles assigned to API keys, restrict container permissions, and audit all calls to LLM services.
For data protection, encrypt sensitive inputs before submission, apply anonymization or federated learning where feasible, and enforce strict access controls on vector databases. Finally, establish AI-specific incident response procedures, including playbooks for model manipulation, data-poisoning detection, and rapid rollback of compromised models.
Governance and operational controls are equally critical. Unauthorized or rogue LLM deployments can silently consume cloud resources, leak data, or violate regulatory requirements. A single user error, selecting the wrong model, exposing sensitive data, or triggering runaway costs, can lead to fines or costly remediation in the absence of guardrails. Organizations should therefore maintain a centralized AI asset inventory through AIOps platforms, enforce approval workflows for model deployment, and implement financial controls to prevent uncontrolled spending and risk accumulation.
LLMs present both transformative opportunity and material risk. The associated attack techniques are not theoretical; adversaries are actively exploiting them today. Secure deployment, however, is achievable. Organizations that combine structured frameworks such as those from OWASP with rigorous, multi-layer testing and layered technical controls are already deploying LLM systems to production with managed risk.
The imperative is clear: LLM security must be treated with the same rigor as any other critical enterprise system. By validating defenses across all four layers before production deployment, security becomes an operational capability, not a source of friction, but an enabler of trust and scale.
"Field experience confirms that these novel threats typically layer on top of traditional weaknesses, making a 'back to basics' approach just as critical as advanced AI defenses."
"Organizations are integrating these powerful systems into customer service, code generation, and decision support. Yet LLMs introduce attack vectors fundamentally different from traditional applications."
Opens in new window