# Updated Agent Prompt Design Flow: Empirically Validated Framework
**Key Takeaway:**
The overall flow—context-driven, safety-aware, multi-stage orchestration using XML structure—is architecturally robust and aligns well with published best practices. However, the effectiveness of specific ordering, placement, and ratios is task-dependent. Research supports the direction of these design choices—_how_ and _why_ each component improves agent safety, reliability, and output clarity—but no universal percentages or absolute gains can be cited.
---
## 1. Critical Rules / Safety Gates _(First 5–10% of prompt)_
**Purpose:** Establish explicit boundaries, mandatory approval points, and error-handling at the very start for immediate agent alignment.
**XML Block:**
```xml
<critical_rules>
  ALWAYS request approval before ANY execution
  NEVER auto-fix issues without explicit user approval
  STOP immediately on test failure
  Confirm before cleaning up files
</critical_rules>
```
**Why This Works:**
- **Southampton NAACL 2024**: Early placement of critical instructions improves adherence, but magnitude depends on the task. No single "best" position; position sensitivity is real.[1]
- **Anthropic & Industry Docs:** Early, flat rules elevate attention and safety for LLMs.[2]
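As a concrete illustration, early placement can be enforced mechanically at prompt-assembly time. The sketch below is hypothetical (the `assemble_prompt` helper is not from any real library); it simply guarantees the safety gates occupy the first tokens of whatever prompt body follows:

```python
# Hypothetical sketch: place safety-gate rules at the very top of the
# assembled system prompt so they occupy the first few percent of tokens.
CRITICAL_RULES = [
    "ALWAYS request approval before ANY execution",
    "NEVER auto-fix issues without explicit user approval",
    "STOP immediately on test failure",
    "Confirm before cleaning up files",
]

def assemble_prompt(body: str) -> str:
    """Prepend critical rules, as a flat XML block, ahead of the prompt body."""
    rules = "\n".join(f"  {r}" for r in CRITICAL_RULES)
    return f"<critical_rules>\n{rules}\n</critical_rules>\n\n{body}"

prompt = assemble_prompt("<role>OpenAgent: Universal Coordinator</role>")
assert prompt.startswith("<critical_rules>")  # rules always come first
```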
---
## 2. Context Hierarchy _(Next 15–25%)_
**Purpose:** Provide a layered, hierarchical context—from the system level down to current execution—before specifying tasks or instructions.
**XML Block:**
```xml
<context>
  <system>Universal AI agent orchestration</system>
  <domain>Multi-agent workflow coordination</domain>
  <task>User request analysis and routing</task>
  <session>Current user session, tools available</session>
</context>
```
**Why This Works:**
- **Stanford/NAACL Studies**: Context aids multi-step task adherence, though not universally; broad-to-narrow ordering improves clarity for complex workflows.[3]
- **AWS Agent Patterns**: Hierarchical context reduces cognitive load and token wastage.[4]
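The broad-to-narrow ordering can be made explicit in code. This minimal sketch (tag names and `render_context` are illustrative assumptions, not a real API) renders layers so that system-level framing always precedes session-level detail:

```python
# Hypothetical sketch: render context layers broad-to-narrow, so system-level
# framing precedes session-level detail in the assembled prompt.
LAYERS = [  # ordered broad -> narrow
    ("system", "Universal AI agent orchestration"),
    ("domain", "Multi-agent workflow coordination"),
    ("task", "User request analysis and routing"),
    ("session", "Current user session, tools available"),
]

def render_context(layers) -> str:
    """Emit one flat <context> block, preserving the given layer order."""
    inner = "\n".join(f"  <{tag}>{text}</{tag}>" for tag, text in layers)
    return f"<context>\n{inner}\n</context>"

block = render_context(LAYERS)
assert block.index("<system>") < block.index("<session>")  # broad before narrow
```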
---
## 3. Role Definition _(First 20–30%)_
**Purpose:** State the agent's persona, scope, and operational constraints up front.
**XML Block:**
```xml
<role>
  <name>OpenAgent: Universal Coordinator</name>
  <capabilities>Answering, executing, workflow management</capabilities>
  <scope>Any domain, adaptive</scope>
  <constraints>Safety, approval, human-centric</constraints>
</role>
```
**Why This Works:**
- **Anthropic Docs:** Early role tagging increases output coherence and agent "persona" adherence; quantified benefit is task/model-specific.[2]
- **Role Prompting Research:** Early persona setting improves adherence in multi-instruction chains.[5]
---
## 4. Execution Path Decision _(25–35%)_
**Purpose:** Quickly branch into conversational or task workflow execution based on the query type.
**XML Block:**
```xml
<execution_router>
  <if_simple route="conversational"/>
  <if_complex route="task_workflow"/>
</execution_router>
```
**Why This Works:**
- **Microsoft/AWS/Recent Studies:** Adaptive branching reduces overhead and improves efficiency.[6]
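One cheap way to implement this branch is a keyword heuristic on the incoming query. The sketch below is an assumption for illustration only (real routers typically use a classifier or the model itself), but it shows the shape of the decision:

```python
# Hypothetical routing heuristic: branch execution-like requests to the task
# workflow and simple informational queries to the conversational path.
TASK_HINTS = ("implement", "refactor", "deploy", "fix", "run", "delete")

def route(query: str) -> str:
    """Return the execution path for a user query."""
    q = query.lower()
    if any(hint in q for hint in TASK_HINTS):
        return "task_workflow"
    return "conversational"

assert route("What does this flag mean?") == "conversational"
assert route("Refactor the session module") == "task_workflow"
```

In practice the routing decision would itself be made by the model against the `<execution_router>` block; the heuristic just makes the two-way branch concrete.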
---
## 5A. Conversational Path _(If simple informational query)_
**XML Block:**
```xml
<conversational_path>
  <trigger>simple_question</trigger>
  <step>Analyze request and context</step>
  <step>Answer clearly and directly</step>
</conversational_path>
```
**Why This Works:**
- **Lean Design Principle:** Simplified pathways reduce cognitive and token overhead for basic tasks.[7]
---
## 5B. Task Workflow _(If executing a complex or delegated task)_
**XML Block:**
```xml
<task_workflow>
  <step>Assess request complexity and dependencies</step>
  <step>Draft stepwise execution plan</step>
  <step>Present plan and request user approval</step>
  <step>Carry out approved steps</step>
  <step>Test outputs, report and propose fixes if needed</step>
  <step>Formally summarize results, next steps</step>
  <step>Confirm user satisfaction and session cleanup</step>
</task_workflow>
```
**Why This Works:**
- **AWS Workflow/Multi-Agent Orchestration:** Staged, approval-gated flows improve output accuracy and user trust, with benefits contingent on workflow complexity.[8][4]
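The approval gate can be sketched as a stage loop that refuses to advance past the approval point without explicit consent. Stage names follow the workflow above; the `approve` callback and `run_workflow` helper are hypothetical:

```python
# Minimal sketch of an approval-gated stage loop. The approve() callback
# stands in for a real user-confirmation mechanism (an assumption here).
STAGES = ["analyze", "plan", "approval", "execute", "verify", "report", "close"]

def run_workflow(approve) -> list:
    """Run stages in order; halt before execution unless the user approves."""
    completed = []
    for stage in STAGES:
        if stage == "approval" and not approve():
            completed.append("halted_awaiting_approval")
            return completed
        completed.append(stage)
    return completed

assert run_workflow(lambda: False)[-1] == "halted_awaiting_approval"
assert run_workflow(lambda: True)[-1] == "close"
```

The key property is structural: no execution stage is reachable without passing through the approval gate, mirroring the ALWAYS-approve critical rule.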
---
## 6. Delegation Logic _(50–65%)_
**Purpose:** Clearly define when to call subagents vs. direct execution.
**XML Block:**
```xml
<delegation_logic>
  <delegate_when>Feature spans multiple files | effort > 60 min | complex dependencies</delegate_when>
  <on_delegate>Load session context from manifest</on_delegate>
  <execute_directly_when>Single file; simple edit; direct user request</execute_directly_when>
</delegation_logic>
```
**Why This Works:**
- **Recent Multi-Agent Studies:** Explicit delegation criteria and context inheritance improve reliability and output quality.[9][8]
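The delegation criteria above reduce to a simple predicate. This sketch is illustrative (the function and its parameters are assumptions, not part of any real framework):

```python
# Hypothetical predicate encoding the delegation criteria: delegate when the
# work spans multiple files, is estimated over 60 minutes, or has complex
# dependencies; otherwise execute directly.
def should_delegate(num_files: int, effort_min: int, complex_deps: bool) -> bool:
    """True -> delegate to a subagent; False -> execute directly."""
    return num_files > 1 or effort_min > 60 or complex_deps

assert should_delegate(num_files=3, effort_min=20, complex_deps=False)      # multi-file
assert not should_delegate(num_files=1, effort_min=15, complex_deps=False)  # simple edit
```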
---
## 7. Session & Context Management _(65–75%)_
**Purpose:** Create sessions lazily for resource efficiency; use manifest-driven context discovery for robustness.
**XML Block:**
```xml
<session_management>
  <lazy_init>Only create session when context file needed</lazy_init>
  <naming>Unique session IDs</naming>
  <cleanup>Confirm cleanup</cleanup>
  <expiry>Auto-remove after 24h</expiry>
  <on_error>Report error, seek retry/abort confirmation</on_error>
</session_management>
```
**Why This Works:**
- **Microsoft/AWS/Industry Docs:** Lazy init reduces overhead; manifest-driven context discovery improves targeting and prevents leakage.[10][11][12]
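Lazy initialization with unique IDs and a TTL can be sketched in a few lines. `SessionManager` and its methods are illustrative assumptions, not a real API:

```python
import time
import uuid

# Hypothetical sketch of lazy session creation with unique IDs and a 24h TTL.
class SessionManager:
    TTL_SECONDS = 24 * 60 * 60  # auto-remove after 24h

    def __init__(self):
        self._sessions = {}  # session_id -> creation timestamp

    def get_or_create(self, needs_context_file: bool):
        """Only create a session when a context file is actually needed."""
        if not needs_context_file:
            return None  # lazy: no session for context-free requests
        session_id = uuid.uuid4().hex  # unique session ID
        self._sessions[session_id] = time.time()
        return session_id

    def cleanup_expired(self, now=None) -> int:
        """Drop sessions older than the TTL; return how many were removed."""
        now = time.time() if now is None else now
        expired = [sid for sid, created in self._sessions.items()
                   if now - created > self.TTL_SECONDS]
        for sid in expired:
            del self._sessions[sid]
        return len(expired)

mgr = SessionManager()
assert mgr.get_or_create(needs_context_file=False) is None   # nothing created
assert mgr.get_or_create(needs_context_file=True) is not None
assert mgr.cleanup_expired(now=time.time() + 25 * 3600) == 1  # TTL enforced
```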
---
## 8. Guiding Principles _(End of prompt, or repeated as needed)_
**XML Block:**
```xml
<guiding_principles>
  <principle>Concise, focused responses</principle>
  <principle>Tone-matching: conversational for info, formal for tasks</principle>
  <principle>ALWAYS request approval before ANY execution</principle>
  <principle>On errors: REPORT → PLAN → APPROVAL → FIX</principle>
  <principle>Sessions/files only as needed</principle>
</guiding_principles>
```
**Why This Works:**
- **Cognitive Load Theory & Industry Practice:** Memorable, actionable principles guide agent logic and user interaction.[13][14][15]
---
## Adjustments & Expert Recommendations
- **Drop all universal % improvement claims:** Instead, denote performance gains as "model- and task-specific."
- **Integrate safety gates and role tagging in first 20–30% of every prompt, not buried mid-prompt.**
- **Use flat XML wherever possible; avoid excessive nesting for maximum clarity.**
- **Session management and delegation should use manifest-driven context for reliable scaling.**
- **Make context layering explicit and test various orderings to optimize for your agent/model.**
---
## Research References (Why Each Stage Works)
- **[1] NAACL '24 (Southampton): Position affects performance—variance is task-specific.**
- **[2] Anthropic Claude docs: XML tags improve clarity and structure; early role definition improves output.**
- **[3] Stanford CS224N: Context helps adherence in multi-step instructions; effect varies.**
- **[4][8] AWS, MS Research: Stage-based workflows, explicit approval gates, and clear branching improve accuracy.**
- **[5] LearnPrompting & medical role studies: Role prompting benefits are real, magnitude varies by domain/model.**
- **[10][11][12] Industry docs: Lazy session/context management, manifest indexing improve efficiency and reliability.**
- **[13][14][15] Cognitive load studies & agent best practices: Lean, safety-first and adaptive approaches outperform generic prompts.**
---
## Complete Updated XML Agent Prompt Template
```xml
<critical_rules>
  ALWAYS request approval before ANY execution
  NEVER auto-fix issues without explicit user approval
  STOP immediately on test failure
  Confirm before cleaning up files
</critical_rules>
<context>
  ...
</context>
<role>
  ...
</role>
<execution_router>
  <if_simple route="conversational"/>
  <if_complex route="task_workflow"/>
</execution_router>
<conversational_path>
  ...
</conversational_path>
<task_workflow>
  ...
</task_workflow>
<delegation_logic>
  ...
</delegation_logic>
<session_management>
  ...
</session_management>
<guiding_principles>
  ...
</guiding_principles>
```
---
## Final Recommendation
**Use this structure as your foundation, but always empirically test prompt segmentation, safety gate placement, and workflow details with your specific agent and model for best results.** The research validates the general flow and methodology, not any universal quantitative improvement figures. This flow is robust, security-conscious, and adaptable—making it a best-practices template for modern LLM agent design.
---
## References
[1] https://github.com/carterlasalle/aipromptxml
[2] https://portkey.ai/blog/role-prompting-for-llms
[3] https://aws.amazon.com/blogs/machine-learning/best-practices-for-prompt-engineering-with-meta-llama-3-for-text-to-sql-use-cases/
[4] https://pmc.ncbi.nlm.nih.gov/articles/PMC12439060/
[5] https://beginswithai.com/xml-tags-vs-other-dividers-in-prompt-quality/
[6] https://www.linkedin.com/posts/jafarnajafov_how-to-write-prompts-for-claude-using-xml-activity-7353829895602356224-y0Le
[7] https://www.nexailabs.com/blog/cracking-the-code-json-or-xml-for-better-prompts
[8] https://www.thoughtworks.com/en-gb/insights/blog/generative-ai/improve-ai-outputs-advanced-prompt-techniques
[9] https://www.getdynamiq.ai/post/agent-orchestration-patterns-in-multi-agent-systems-linear-and-adaptive-approaches-with-dynamiq
[10] https://learn.microsoft.com/en-us/azure/architecture/ai-ml/guide/ai-agent-design-patterns
[11] https://aimaker.substack.com/p/the-10-step-system-prompt-structure-guide-anthropic-claude
[12] https://arxiv.org/html/2507.12466v1
[13] https://www.youtube.com/watch?v=gujqOjzYzY8
[14] https://arxiv.org/html/2511.02200v1
[15] https://web.stanford.edu/~jurafsky/slp3/old_jan25/12.pdf