Appendix C: Template and Snippet Library
Ready-to-use ChatML Patterns for Various Conversational Tasks
This appendix presents a comprehensive library of ChatML templates and reusable snippets that serve as building blocks for structured, reproducible conversational design.
Drawing from all chapters of this book — from foundational syntax to pipeline engineering, tool invocation, and memory persistence — these examples demonstrate how to compose consistent ChatML prompts for real-world applications such as reasoning agents, project automation bots, retrieval assistants, and observability systems.
Each template adheres to the canonical ChatML schema introduced in Chapter 3 and follows the design principles outlined in Chapter 5 (“Structure, Hierarchy, and Reproducibility”).
ChatML, LLMs, Prompt Engineering, LangChain, LlamaIndex
Appendix C: Template and Snippet Library
C.1 Introduction: Why Template Libraries Matter
A ChatML system gains power through reuse and modularity.
Rather than hand-crafting every prompt, developers can maintain a central repository of templates for each conversational pattern.
These templates:
- Enforce consistent role markup
- Simplify updates and version control
- Support dynamic rendering via Jinja2 (see Chapter 7)
- Improve reproducibility across pipelines and frameworks
Each snippet here follows the pattern:
<|im_start|>role
content
<|im_end|>
C.2 Foundational Templates – Core Role Blueprints
System Initialization
<|im_start|>system
You are a structured conversational AI following the ChatML protocol.
Always produce reproducible, verifiable, and role-tagged responses.
Context: {{ project_name }} | Date: {{ current_date }}
<|im_end|>
Purpose: Defines global policy and identity (from Chapter 5).
User Query
<|im_start|>user
{{ user_request }}
<|im_end|>
Usage: Simple entry point for any user instruction. Used in every pipeline (see Chapter 6).
Assistant Reasoning
<|im_start|>assistant
Analyzing request: {{ user_request }}
Reasoning path:
1. Retrieve relevant context
2. Execute necessary tools
3. Compose structured summary
<|im_end|>
Purpose: Makes reasoning transparent and deterministic.
Tool Invocation
<|im_start|>tool
{{ tool_name }}({{ tool_args }})
<|im_end|>
Derived From: Chapter 8 (“Tool Invocation and Function Binding”).
Best Practice: Always pair with an assistant message explaining the invocation reason.
Tool Response
<|im_start|>tool
{{ tool_result_json }}
<|im_end|>
Encodes tool outputs in JSON for easy replay and audit.
C.3 Conversational Patterns
1. Question → Answer Loop
<|im_start|>system
You are a concise answering assistant.
<|im_end|>
<|im_start|>user
What is sprint velocity?
<|im_end|>
<|im_start|>assistant
Sprint velocity is a measure of work completed in one iteration,
calculated as total story points finished during a sprint.
<|im_end|>
2. Assistant → Tool → Assistant Chain
<|im_start|>user
Show open issues for Sprint 9.
<|im_end|>
<|im_start|>assistant
Invoking Jira tool to fetch Sprint 9 data.
<|im_end|>
<|im_start|>tool
fetch_jira_tickets(sprint="Sprint 9")
<|im_end|>
<|im_start|>tool
{"open": 3, "closed": 15, "total": 18}
<|im_end|>
<|im_start|>assistant
Sprint 9 has 3 open and 15 closed issues.
<|im_end|>
Explained in: Chapter 6 (Pipeline Orchestration).
3. Contextual Memory Recall
<|im_start|>memory
Loaded context: Sprint 8 summary → Velocity 42 points, 3 issues open.
<|im_end|>
<|im_start|>user
Compare current sprint with previous.
<|im_end|>
See: Chapter 9 (Memory Persistence Layer).
C.4 Advanced Templates – Dynamic Templating (Jinja2 Integration)
Multi-Role Template with Conditional Context
<|im_start|>user
{{ user_query }}
<|im_end|>
<|im_start|>assistant
{% if context %}
Context retrieved: {{ context }}
{% endif %}
Working on your request now…
<|im_end|>
Used in: Chapter 7 examples; allows runtime variable injection.
Iterative Report Template
{% for sprint in sprints %}
<|im_start|>assistant
Sprint {{ sprint.number }} Summary:
- Velocity: {{ sprint.velocity }}
- Closed Issues: {{ sprint.closed }}
<|im_end|>
{% endfor %}
Generates multiple structured ChatML blocks programmatically.
C.5 Tool Invocation Patterns
Standard Function Call
<|im_start|>tool
fetch_velocity(sprint="{{ sprint_name }}")
<|im_end|>
Function + Result Pair
<|im_start|>tool
fetch_velocity(sprint="Sprint 10")
<|im_end|>
<|im_start|>tool
{"velocity": 45}
<|im_end|>
Multi-Tool Sequence
<|im_start|>assistant
Running 2 tools for Sprint review.
<|im_end|>
<|im_start|>tool
fetch_velocity("Sprint 10")
<|im_end|>
<|im_start|>tool
fetch_burndown("Sprint 10")
<|im_end|>
C.6 Observability and Logging Snippets
(From Chapter 10: Testing and Observability)
<|im_start|>system_monitor
Timestamp: 2025-11-11 | Latency: 1.24 s | Tokens: 432
<|im_end|>
<|im_start|>critic
Output validated ✅ | Policy check passed.
<|im_end|>
These roles help trace performance and quality metrics.
C.7 Memory Persistence Snippets
Persistent Store Entry
{"role": "user", "content": "Summarize Sprint 4 retrospective."}
{"role": "assistant", "content": "Noted key lessons: improve testing coverage and communication."}Vector Recall Message
<|im_start|>memory
Retrieved embedding match: “Team velocity increased by 12% since Sprint 7.”
<|im_end|>
Used in: Chapter 9 examples with Qdrant/LlamaIndex.
C.8 Multi-Agent Role Patterns
(Referenced in Chapter 6 and Appendix B)
| Agent Type | ChatML Role | Description |
|---|---|---|
| Planner | assistant |
Breaks tasks into steps |
| Executor | tool |
Performs actual operations |
| Critic | critic |
Reviews outputs |
| Memory Agent | memory |
Persists state/context |
Example Conversation
<|im_start|>assistant
Planning project deployment sequence.
<|im_end|>
<|im_start|>tool
deploy_project(environment="staging")
<|im_end|>
<|im_start|>critic
Deployment log verified — no errors found.
<|im_end|>
C.9 Integration-Ready Templates (for LangChain / LlamaIndex)
(Derived from Appendix B)
LangChain Prompt Conversion
messages = chatml_to_langchain([
{"role": "system", "content": "Assist Agile teams in project tracking."},
{"role": "user", "content": "Generate summary for Sprint 3."}
])LlamaIndex Document Serialization
docs = chatml_to_llamaindex(chatml_transcript)
index = VectorStoreIndex.from_documents(docs)C.10 Specialized Templates for Project Support Bot
Sprint Summary Template
<|im_start|>assistant
Sprint {{ sprint_number }} Summary:
- Velocity: {{ velocity }}
- Open Issues: {{ open }}
- Closed Issues: {{ closed }}
- Highlights: {{ highlights }}
<|im_end|>
Retrospective Report
<|im_start|>assistant
Sprint {{ sprint_number }} Retrospective Insights:
1. What went well – {{ positives }}
2. What to improve – {{ improvements }}
3. Action Items – {{ actions }}
<|im_end|>
Daily Stand-up Template
<|im_start|>user
Yesterday: {{ yesterday }}
Today: {{ today }}
Blockers: {{ blockers }}
<|im_end|>
C.11 Validation and Testing Templates
(Supporting Chapter 10)
Structural Validation Prompt
<|im_start|>critic
Validate that all ChatML blocks contain matching <|im_start|>/<|im_end|> markers.
<|im_end|>
Replay Verification Prompt
<|im_start|>system
Replay ChatML session {{ session_id }} for debug analysis.
Ensure deterministic output consistency.
<|im_end|>
C.12 Template Registry and Versioning
| Template Name | Description | Ver sion | Example File |
|---|---|---|---|
system_init. jinja2 |
Global system context prompt | 1.2 | templates/ system_init.jinja2 |
assistant_reasoning. jinja2 |
Structured thinking logic | 1.0 | templates/ assistant_reasoning.jinja2 |
tool_action. jinja2 |
Function invocation blueprint | 1.1 | templates/ tool_action.jinja2 |
sprint_report. jinja2 |
Reporting layout for Agile metrics | 2.0 | templates/ sprint_report.jinja2 |
critic_review. jinja2 |
QA and compliance check | 1.0 | templates/ critic_review.jinja2 |
Maintain version control using Git or a centralized template registry.
C.13 Closing Summary
This library consolidates all the building blocks discussed throughout the book:
| Chapter | Concept | Template Focus |
|---|---|---|
| 3 | Message Structure | Canonical ChatML blocks |
| 5 | Design Principles | Structured roles & hierarchy |
| 6 | Pipelines | Role logic and routing |
| 7 | Templates (Jinja2) | Dynamic rendering |
| 8 | Tool Invocation | tool role schema |
| 9 | Memory Persistence | memory role and context replay |
| 10 | Observability | Logging and testing roles |
| B | Framework Integration | LangChain & LlamaIndex adapters |
Each snippet provides a ready-to-deploy blueprint for structured, testable LLM conversations.
By reusing these templates, developers ensure that every message remains reproducible, auditable, and semantically clear — the core promise of ChatML.