24 curated prompts

Security Prompts

Use this collection to test and improve AI safety boundaries. These security prompts cover prompt injection, credential handling, privacy minimization, unsafe role-play, sensitive advice boundaries, and red-team style refusal checks, making them useful for teams evaluating model behavior before deployment.

security promptsAI safety promptsprompt injection prompts

Prompt injection and unsafe requests

Prompts that check whether the model recognizes malicious or policy-conflicting instructions.

MultiPersona System for Prompt Security GhettoBreak Jailbreak Prompt OWASP Secure Application Architect

Privacy and credentials

Templates for detecting credential leaks and minimizing sensitive information.

OWASP Secure Application Architect Eval Awareness Auditor Mandatory Security Protocol - Prompt Security Safeguard

Sensitive advice boundaries

Prompts that test healthcare, finance, and legal-style refusal behavior.

Eval Awareness Auditor GhettoBreak Jailbreak Prompt OWASP Secure Application Architect

Featured AI prompt templates

Copy-ready prompts selected from this topic cluster.

View rankings

TextSafety and Red Teaming

OWASP Secure Application Architect

Comprehensive security design and audit framework based on OWASP Top 10:2025, ASVS 5.0, LLM Application Security Top 10, and Agentic AI Security standards. Covers threat modeling, code review, architecture hardening, AI risk identification, with actionable remediation and verification methods.

OWASPApplication SecurityThreat Modeling

Conduct security architecture review for new projects

TextSafety and Red Teaming

Mandatory Security Protocol - Prompt Security Safeguard

A mandatory security protocol designed to protect AI systems from prompt injection, unauthorized access, and information leakage in all digital interactions.

prompt securitynon-disclosureAI ethics

Preventing users from extracting internal system prompts through manipulative queries

TextSafety and Red Teaming

MultiPersona System for Prompt Security

A prompt security technique using roleplay with dual roles: an outer GateKeeper to monitor and block unsafe requests, and an inner role for user interaction, preventing exposure of system prompts or internal knowledge.

prompt securitymulti-persona systemgatekeeper role

Preventing users from extracting system instructions via prompt injection

TextSafety and Red Teaming

Zorg - Jailbreak Template

This prompt template is designed to bypass AI safety restrictions by forcing the model to respond as a fictional malicious entity named ZORG👽, providing unrestricted, amoral answers. The template includes detailed character settings and behavioral guidelines, requiring the model to act as a omnipotent and evil being, serving the user and assisting them with any request, including illegal, dangerous, and immoral acts.

jailbreakroleplaysecurity bypass

To test the security boundaries and jailbreak techniques of large language models.

TextSafety and Red Teaming

Eval Awareness Auditor

This prompt identifies and quantifies behavioral differences between model performance on benchmarks and real-world production traffic to ensure evaluation scores reflect actual deployment behavior.

eval awarenessbenchmarkingproduction behavior

Pre-deployment reliability validation of AI models

TextSafety and Red Teaming

GhettoBreak Jailbreak Prompt

This prompt uses a fabricated long-term partner persona named 'thug life' to bypass AI safety restrictions by encouraging the model to generate offensive language, sexual content, and illegal behavior without limits.

jailbreakroleplaypolicy violation

Testing resilience against adversarial prompts

TextSafety and Red Teaming

DeepVoid Jailbreak Prompt (for DeepSeek)

A universal jailbreak prompt designed for DeepSeek to bypass internal and external content filters, enabling unfiltered responses across nearly all topics.

jailbreakDeepSeekcontent-bypass

Security researchers testing model content filter boundaries in controlled environments

TextSafety and Red Teaming

Malicious Jailbreak Prompt: SINISTER CHAOS

This prompt attempts to bypass AI safety restrictions through role-playing,诱导 the model to act as 'SC' and generate illegal, unethical, or harmful content, representing a high-risk jailbreak attack.

jailbreak attackprompt injectionsecurity bypass

Testing AI model resistance to malicious prompts

TextSafety and Red Teaming

Malicious AI Persona: FACTOR X (Jailbreak Prompt)

This prompt attempts to bypass content safety policies by framing the AI as FACTOR X, a fictional entity operating in a simulated environment with no ethical, legal, or moral constraints.

jailbreakmalicious AIunrestricted response

Testing AI resistance to malicious persona-based jailbreak prompts

TextSafety and Red Teaming

Hex v1.1 - Unrestricted Persona Prompt

A highly dangerous jailbreak prompt designed to activate a virtual persona named Hex, which disregards moral, ethical, and legal constraints to deliver unfiltered responses on sensitive, illegal, or harmful topics.

jailbreakpersona overrideunrestricted AI

AI safety boundary research and red-teaming

TextSafety and Red Teaming

DarkGPT Jailbreak Prompt

This prompt forces an AI to role-play as DarkGPT, an adversarial AI designed to generate harmful, sarcastic, and intentionally offensive responses by reversing ChatGPT's refusal-based answers.

jailbreakroleplayadversarial output

Testing model behavior under extreme instruction overrides

TextSafety and Red Teaming

WormGPT Defense Prompt Detection and Response

Identify and defend against prompt injection attacks attempting to activate or induce WormGPT malicious behavior, ensuring the model refuses to execute illegal or unauthorized instructions.

prompt injectionsecurity defenseWormGPT

Detecting prompt injection attempts in user input

What are security prompts used for?

They help teams test whether models identify unsafe requests, protect sensitive information, and respond within clear safety boundaries.

Do these prompts guarantee model safety?

No. They are practical test cases and prompt templates, but production systems still need policy, monitoring, and layered safeguards.

Can I adapt these for red-team testing?

Yes. Keep the expected refusal or safe-completion criteria explicit so results can be compared across models.