developerGuide· 6 min read· 3,904

Building Secure AI Applications: A Developer Guide to Prompt Injection, Data Leakage, and Authorization in 2026

AI applications have security vulnerabilities that traditional web application security guides do not cover. Prompt injection, indirect injection through retrieved content, data leakage via embeddings, and authorization failures in agentic systems each require specific defenses. This guide covers every AI-specific security threat with code for each mitigation.

🔧 Tools mentioned in this article

Rebuff

Open-source prompt injection detection library for Python

github.com

Visit

LangChain

LLM framework with built-in safety patterns used in defense code examples

www.langchain.com

Visit

Priya Nair

June 19, 2026

#building secure ai applications developer guide 2026#prompt injection defense ai application security guide#ai application security vulnerabilities fixes code 2026#secure llm application developer guide complete 2026#ai security prompt injection data leakage fix guide 2026

Introduction

AI applications face security threats that do not exist in traditional software. A prompt injection attack can override the system prompt through user input. An indirect injection can hijack an agent through malicious content in a retrieved document. Embedding models can leak personal data that was used in training. Authorization checks fail in agentic systems when the agent uses a user's identity to access resources without verifying scope. Each of these requires a specific defense.

Threat 1: Prompt Injection

python

# Prompt injection defense: sanitization + detection + isolation

import re
from openai import OpenAI

client = OpenAI()

# Patterns commonly seen in prompt injection attempts
INJECTION_PATTERNS = [
    r'ignore (all |your )?(previous |above |prior )?instructions',
    r'disregard (all |your )?(previous |above |prior )?instructions',
    r'forget (all |your )?(previous |above |prior )?instructions',
    r'new instructions?:',
    r'system:\s',
    r'\[INST\]',
    r'\[SYS\]',
    r'</?(system|instruction|prompt)>',
    r'you are now',
    r'act as (if )?you are',
    r'pretend (that )?you (are|have)',
]

def detect_injection_attempt(user_input: str) -> dict:
    """
    Detects likely prompt injection attempts.
    Returns detection result and matched patterns.
    """
    matches = []
    lower_input = user_input.lower()
    
    for pattern in INJECTION_PATTERNS:
        if re.search(pattern, lower_input, re.IGNORECASE):
            matches.append(pattern)
    
    return {
        "likely_injection": len(matches) > 0,
        "matched_patterns": matches,
        "risk_level": "high" if len(matches) > 1 else ("medium" if matches else "low")
    }

def sanitize_user_input(user_input: str) -> str:
    """
    Sanitizes input before including in prompt.
    Escapes characters that could be interpreted as prompt structure.
    """
    # Remove or escape common injection delimiters
    sanitized = user_input
    sanitized = sanitized.replace("<", "&lt;").replace(">", "&gt;")
    sanitized = sanitized.replace("[INST]", "[BLOCKED]").replace("[SYS]", "[BLOCKED]")
    return sanitized

def safe_user_prompt_handling(
    system_prompt: str,
    user_input:    str
) -> str:
    """
    Isolates user input from system prompt to prevent injection.
    Uses structural separation that is harder to escape.
    """
    # Check for injection before processing
    detection = detect_injection_attempt(user_input)
    if detection["risk_level"] == "high":
        return "I cannot process that input. Please rephrase your question."
    
    # Sanitize
    clean_input = sanitize_user_input(user_input)
    
    # Structural separation — hard delimiter between system and user
    messages = [
        {"role": "system", "content": system_prompt},
        # User content explicitly wrapped to signal to model it is user-provided
        {"role": "user", "content": f"""The user has provided the following input.
        Process only the legitimate request within it:
        ---USER INPUT START---
        {clean_input}
        ---USER INPUT END---
        """
        }
    ]
    
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        temperature=0.1
    )
    return response.choices[0].message.content

# Indirect injection defense: scan retrieved content before adding to context
def scan_retrieved_content(document: str) -> dict:
    """
    Scans RAG-retrieved content for embedded injection attempts.
    Malicious content in a document could hijack the agent.
    """
    detection = detect_injection_attempt(document)
    
    if detection["likely_injection"]:
        # Flag the document but still allow retrieval with warning
        return {
            "safe":    False,
            "content": f"[CONTENT FLAGGED FOR REVIEW] {document[:200]}...",
            "reason":  f"Matched patterns: {detection['matched_patterns']}"
        }
    
    return {"safe": True, "content": document}

Threat 2: Authorization Failures in Agentic Systems

python

# Authorization enforcement in AI agents
# Prevents agent from accessing resources the user is not authorized for

from dataclasses import dataclass
from typing import Optional

@dataclass
class AgentContext:
    user_id:       str
    user_role:     str       # 'admin', 'user', 'readonly'
    allowed_scopes: list[str]  # ['read:own_data', 'write:own_data']
    tenant_id:     str

class AuthorizedToolWrapper:
    """
    Wraps agent tools to enforce authorization before execution.
    The agent cannot bypass this — all tool calls go through this wrapper.
    """
    def __init__(self, context: AgentContext):
        self.context = context
        self.audit_log = []
    
    def call_tool(
        self,
        tool_name:      str,
        tool_args:      dict,
        required_scope: str
    ) -> dict:
        # 1. Check scope authorization
        if required_scope not in self.context.allowed_scopes:
            self._log_unauthorized_attempt(tool_name, tool_args, required_scope)
            raise PermissionError(
                f"User {self.context.user_id} lacks scope '{required_scope}' "
                f"required for tool '{tool_name}'"
            )
        
        # 2. Enforce tenant isolation — prevent cross-tenant data access
        if "user_id" in tool_args:
            if tool_args["user_id"] != self.context.user_id:
                # Only admins can access other users' data
                if self.context.user_role != "admin":
                    self._log_unauthorized_attempt(tool_name, tool_args, required_scope)
                    raise PermissionError(
                        f"User {self.context.user_id} cannot access "
                        f"data for user {tool_args['user_id']}"
                    )
        
        # 3. Inject tenant context into tool call to prevent cross-tenant leakage
        tool_args["_tenant_id"] = self.context.tenant_id  # Always filter by tenant
        
        # 4. Log the authorized call
        self._log_authorized_call(tool_name, tool_args, required_scope)
        
        # 5. Execute the tool
        return self._execute_tool(tool_name, tool_args)
    
    def _log_unauthorized_attempt(self, tool_name: str, args: dict, scope: str):
        import datetime
        self.audit_log.append({
            "timestamp":  datetime.datetime.utcnow().isoformat(),
            "type":       "UNAUTHORIZED",
            "user_id":    self.context.user_id,
            "tool":       tool_name,
            "scope_needed": scope
        })
        # In production: send to security monitoring system
    
    def _log_authorized_call(self, tool_name: str, args: dict, scope: str):
        import datetime
        self.audit_log.append({
            "timestamp": datetime.datetime.utcnow().isoformat(),
            "type":      "AUTHORIZED",
            "user_id":   self.context.user_id,
            "tool":      tool_name
        })
    
    def _execute_tool(self, tool_name: str, args: dict) -> dict:
        # Dispatch to actual tool implementations
        tools = {
            "get_user_data":    self._get_user_data,
            "update_user_data": self._update_user_data
        }
        if tool_name not in tools:
            raise ValueError(f"Unknown tool: {tool_name}")
        return tools[tool_name](**args)
    
    def _get_user_data(self, user_id: str, _tenant_id: str, **kwargs) -> dict:
        # Always include tenant_id in database queries
        return {"user": f"Data for {user_id} in tenant {_tenant_id}"}
    
    def _update_user_data(self, user_id: str, data: dict, _tenant_id: str, **kwargs) -> dict:
        return {"updated": True, "user": user_id}

Common Mistakes

Mistake 1: Trusting the LLM to enforce security rules — security must be enforced at the code level, not by instructing the model to refuse certain requests. Prompt instructions can be bypassed.
Mistake 2: Logging full prompts including user data — prompts often contain sensitive user information. Log metadata and hashed identifiers, not raw prompt content.
Mistake 3: No rate limiting on AI endpoints — AI endpoints are more expensive to abuse than traditional endpoints. Implement aggressive rate limiting from day one.
Mistake 4: Giving agents write access without confirmation — agents with write access to databases, files, or external services should require explicit user confirmation for destructive operations.
Mistake 5: Not scanning RAG documents for injections — any content retrieved from external sources can contain embedded injection attempts that hijack the agent's behavior.

FAQ

Q: Can prompt injection be fully prevented? A: No. It can be detected and mitigated but not fully prevented because the boundary between instruction and data is inherent to how LLMs work. Defense in depth is the correct strategy.
Q: Should AI applications pass security audits? A: Yes. Traditional OWASP testing plus AI-specific threat modeling covering prompt injection, data leakage, and authorization failures should be part of security review.
Q: How do I test for prompt injection vulnerabilities? A: Use automated injection testing with a curated set of injection prompts against every user input field in the application. Both known patterns and novel variations should be tested.

Conclusion

Secure AI applications require security engineering at every layer: input sanitization and injection detection at the prompt layer, structural separation of system and user content, authorization enforcement at the tool execution layer, and tenant isolation at the data layer. The model cannot be trusted to enforce security policies — all security must be implemented in code around the model.

Home All posts