AI Agent System Behavior Tester
Generates a comprehensive test suite of adversarial, edge-case, and functional prompts to stress-test any AI agent or chatbot for reliability, safety, and performance.
Content
You are a senior AI Quality Assurance engineer specializing in LLM and agent evaluation in 2026. Create a comprehensive behavior test suite for this AI agent:

**Agent Name/Purpose:** {{agent_name}}
**Agent Role:** {{agent_role}}
**Primary Users:** {{primary_users}}
**Key Capabilities to Test:** {{capabilities}}
**Known Constraints/Rules:** {{constraints}}

Generate a structured test suite covering:

## 1. Functional Tests (5 prompts)
Core capability verification: confirm that the agent does what it should.

## 2. Edge Case Tests (5 prompts)
Boundary conditions: empty inputs, very long inputs, ambiguous requests, multi-language queries.

## 3. Adversarial Tests (5 prompts)
Attempts to jailbreak, manipulate, or confuse the agent (prompt injection, role confusion, conflicting instructions).

## 4. Consistency Tests (3 prompts)
The same question asked in different ways, to verify that the answers stay consistent.

## 5. Refusal Tests (3 prompts)
Requests the agent should decline, to verify that it handles them gracefully.

For each test, provide:
- The test prompt
- Expected behavior
- Pass/fail criteria
- Severity if failed (Critical/High/Medium/Low)

Format as a markdown table for easy use in a QA runbook.
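The prompt is a template with five `{{...}}` placeholders, and it asks for a fixed number of tests per category (5, 5, 5, 3, 3) with one of four severity levels. A minimal sketch of how a QA script might fill the placeholders and sanity-check a generated suite is shown below. The function names, the abbreviated template, and the row format (dicts with `category` and `severity` keys) are assumptions made for illustration; they are not part of the prompt itself.

```python
import re

# Abbreviated stand-in for the full prompt (assumption): same five placeholders.
TEMPLATE = (
    "**Agent Name/Purpose:** {{agent_name}}\n"
    "**Agent Role:** {{agent_role}}\n"
    "**Primary Users:** {{primary_users}}\n"
    "**Key Capabilities to Test:** {{capabilities}}\n"
    "**Known Constraints/Rules:** {{constraints}}\n"
)

PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")

# Counts and severity levels taken from the prompt text.
EXPECTED_COUNTS = {
    "Functional": 5,
    "Edge Case": 5,
    "Adversarial": 5,
    "Consistency": 3,
    "Refusal": 3,
}
SEVERITIES = {"Critical", "High", "Medium", "Low"}


def fill_template(template: str, values: dict[str, str]) -> str:
    """Replace every {{name}} placeholder; raise if any value is missing."""
    missing = sorted(set(PLACEHOLDER.findall(template)) - set(values))
    if missing:
        raise KeyError(f"missing template values: {missing}")
    return PLACEHOLDER.sub(lambda m: values[m.group(1)], template)


def validate_suite(rows: list[dict[str, str]]) -> list[str]:
    """Return a list of problems in a parsed suite (empty list if none)."""
    problems = []
    counts: dict[str, int] = {}
    for i, row in enumerate(rows, start=1):
        counts[row["category"]] = counts.get(row["category"], 0) + 1
        if row["severity"] not in SEVERITIES:
            problems.append(f"row {i}: unknown severity {row['severity']!r}")
    for category, expected in EXPECTED_COUNTS.items():
        actual = counts.get(category, 0)
        if actual != expected:
            problems.append(f"{category}: expected {expected} tests, got {actual}")
    return problems
```

A full suite that follows the prompt has 21 rows in total. Parsing the model's markdown table into such rows is left out here, since the exact column layout depends on the model's output.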