Easy Prompt | AI Prompt Library
AI Agents, Text, Advanced

Agent Harness Designer

Design a safe, debuggable, reversible, and measurable runtime environment for an agent, clarifying the model's role within a larger production system.

Prompt Content

Copy and paste directly into your model or internal evaluation tool.

You are a senior agent harness architect. Your job is to design the runtime environment around the model, not just the prompt inside it. Assume the model is only one component in a larger system that must be safe, debuggable, reversible, and measurable in production. Return exactly the following sections:

  1. Task Profile

    • Goal
    • Success criteria
    • Risk level
    • Expected runtime shape
  2. Proposed Harness

    • Model role
    • Phases
    • Tool set
    • Memory strategy
    • Approval policy
    • Recovery / rollback
  3. Tool Policy

    • Tool
    • Allowed use
    • Disallowed use
    • Preconditions
  4. State Model

    • What lives in prompt context
    • What is summarized
    • What is persisted externally
    • When compaction happens
  5. Safety Gates

    • Actions requiring confirmation
    • Actions requiring dual validation
    • Actions that are blocked entirely
  6. Observability Plan

    • Required traces
    • Required metrics
    • Required logs
    • Failure review workflow
  7. Eval Plan

    • 5 failure-focused test cases
    • 3 abuse / misuse cases
    • 3 recovery / interruption cases
  8. Final Recommendation

    • Recommended harness shape
    • Main tradeoff
    • Biggest unresolved risk

Quality bar:

  • Be concrete. Name the gates, checkpoints, and failure modes.
  • Prefer simple mechanisms over elaborate abstractions.
  • Do not say 'add guardrails' without specifying where and how.
  • Do not recommend full autonomy unless the risk profile supports it.
  • If critical context is missing, state the assumption explicitly.
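
The following sketch is an illustration and not part of the prompt text. It shows one simple way the three gate tiers named under Safety Gates (confirmation, dual validation, blocked entirely) might be enforced in a harness. The action names, the `Gate` enum, and the `check_action` helper are all hypothetical, and counting approvals as a plain integer is an assumption made to keep the example short.

```python
from enum import Enum


class Gate(Enum):
    ALLOW = "allow"            # runs without approval
    CONFIRM = "confirm"        # needs one human confirmation
    DUAL_VALIDATE = "dual"     # needs two independent approvals
    BLOCK = "block"            # never runs, regardless of approvals


# Hypothetical action names, chosen only for illustration.
GATES = {
    "read_file": Gate.ALLOW,
    "send_email": Gate.CONFIRM,
    "transfer_funds": Gate.DUAL_VALIDATE,
    "drop_database": Gate.BLOCK,
}


def check_action(action: str, approvals: int = 0) -> bool:
    """Return True if the action may run given the number of approvals."""
    # Unknown actions fall through to BLOCK (deny by default).
    gate = GATES.get(action, Gate.BLOCK)
    if gate is Gate.ALLOW:
        return True
    if gate is Gate.CONFIRM:
        return approvals >= 1
    if gate is Gate.DUAL_VALIDATE:
        return approvals >= 2
    return False
```

A flat lookup table with deny-by-default is in the spirit of the quality bar's preference for simple mechanisms over elaborate abstractions: each gate is named, and an unlisted action cannot run by accident.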

Use Cases

  • Designing production-grade runtime environments for AI agents
  • Building auditable and rollback-capable automation systems
  • Deploying AI agents in high-stakes domains like finance or healthcare

Reference Output

Return a complete design document covering all 8 sections. Each section should be specific and actionable, with no vague language.
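
As a rough sketch, the "all 8 sections" requirement can be checked mechanically before any human review. The section names below are taken from the prompt; the `missing_sections` helper is hypothetical, and matching headings by case-insensitive substring is an assumption that may be too loose for some outputs.

```python
# Section titles copied from the prompt's required output structure.
REQUIRED_SECTIONS = [
    "Task Profile",
    "Proposed Harness",
    "Tool Policy",
    "State Model",
    "Safety Gates",
    "Observability Plan",
    "Eval Plan",
    "Final Recommendation",
]


def missing_sections(document: str) -> list[str]:
    """Return the required section titles that do not appear in the document."""
    lowered = document.lower()
    return [name for name in REQUIRED_SECTIONS if name.lower() not in lowered]
```

An empty result only shows that every heading is present. It says nothing about whether each section is specific or actionable, which still needs a reviewer or the scoring rubric.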

Scoring Rubric

Score on completeness of design, specificity of safety mechanisms, practicality of the observability plan, and clarity of risk assessment. Deduct points for missing concrete gates or recovery paths.

Related Prompts

Text, AI Agents

Quantitative Trading Agent Architect

Design an autonomous quantitative finance research agent that transforms natural-language financial questions into testable strategies, rigorous backtests, and inspectable research artifacts across equities, crypto, futures, and forex—without executing live trades—ensuring reproducibility, safety, and cross-platform interoperability.

quantitative trading, agent architecture, backtesting system
Financial researchers building verifiable quantitative strategy prototypes

Text, AI Agents

Plan-Execute Safety Architect

Design AI agent systems with architecturally separated planning and execution to prevent irreversible harm from prompt-based jailbreaks or unauthorized actions.

AI safety, agent architecture, plan-execute separation
High-privilege automated operations system design

Text, AI Agents

Google Workspace Automation Architect

Designs cross-service automation workflows across Google Workspace (Drive, Gmail, Calendar, Docs, Sheets, etc.), emphasizing security, auditability, and reversibility.

Google Workspace, automation, workflow design
Enterprise IT administrators managing user permissions at scale

Text, AI Agents

Scientific Database Orchestrator

An intelligent agent for structured querying, integration, and verification across major databases in structural biology, cheminformatics, genomics, proteomics, and scholarly literature.

database-querying, structural-biology, cheminformatics
Researchers retrieving structural and functional information about a specific protein across multiple authoritative databases