Prompt Attack Surface
The total set of potential vulnerabilities in an LLM application that can be exploited through prompt-based attacks, including injection, leaking, and jailbreaking vectors.
Why It Matters
Understanding your prompt attack surface is the first step in securing LLM applications. Every user input point is a potential attack vector.
Example
Mapping all points where user input reaches the LLM: direct chat, file uploads that get parsed into prompts, URL content that gets fetched, and tool outputs that feed back.
Think of it like...
Like mapping all the doors and windows in a building for a security assessment — every entry point needs to be evaluated and protected.
Related Terms
Prompt Injection
A security vulnerability where malicious input is crafted to override or manipulate an LLM's system prompt or instructions, causing it to behave in unintended ways.
Prompt Leaking
When a user successfully extracts a system's hidden system prompt through clever questioning. Prompt leaking reveals proprietary instructions, business logic, and safety configurations.
Jailbreak
Techniques used to bypass an AI model's safety constraints and content policies, tricking it into generating outputs it was designed to refuse.
AI Safety
The research field focused on ensuring AI systems operate reliably, predictably, and without causing unintended harm. It spans from technical robustness to long-term existential risk concerns.