Appendix C: Template and Snippet Library

Ready-to-use ChatML Patterns for Various Conversational Tasks

Abstract

This appendix presents a comprehensive library of ChatML templates and reusable snippets that serve as building blocks for structured, reproducible conversational design.

Drawing from all chapters of this book — from foundational syntax to pipeline engineering, tool invocation, and memory persistence — these examples demonstrate how to compose consistent ChatML prompts for real-world applications such as reasoning agents, project automation bots, retrieval assistants, and observability systems.

Each template adheres to the canonical ChatML schema introduced in Chapter 3 and follows the design principles outlined in Chapter 5 (“Structure, Hierarchy, and Reproducibility”).

Keywords

ChatML, LLMs, Prompt Engineering, LangChain, LlamaIndex

Appendix C: Template and Snippet Library

C.1 Introduction: Why Template Libraries Matter

A ChatML system gains power through reuse and modularity.

Rather than hand-crafting every prompt, developers can maintain a central repository of templates for each conversational pattern.

These templates:

  • Enforce consistent role markup
  • Simplify updates and version control
  • Support dynamic rendering via Jinja2 (see Chapter 7)
  • Improve reproducibility across pipelines and frameworks

Each snippet here follows the pattern:

<|im_start|>role
content
<|im_end|>

C.2 Foundational Templates – Core Role Blueprints

System Initialization

<|im_start|>system
You are a structured conversational AI following the ChatML protocol.
Always produce reproducible, verifiable, and role-tagged responses.
Context: {{ project_name }} | Date: {{ current_date }}
<|im_end|>

Purpose: Defines global policy and identity (from Chapter 5).


User Query

<|im_start|>user
{{ user_request }}
<|im_end|>

Usage: Simple entry point for any user instruction. Used in every pipeline (see Chapter 6).


Assistant Reasoning

<|im_start|>assistant
Analyzing request: {{ user_request }}
Reasoning path:
1. Retrieve relevant context
2. Execute necessary tools
3. Compose structured summary
<|im_end|>

Purpose: Makes reasoning transparent and deterministic.


Tool Invocation

<|im_start|>tool
{{ tool_name }}({{ tool_args }})
<|im_end|>

Derived From: Chapter 8 (“Tool Invocation and Function Binding”).

Best Practice: Always pair with an assistant message explaining the invocation reason.


Tool Response

<|im_start|>tool
{{ tool_result_json }}
<|im_end|>

Encodes tool outputs in JSON for easy replay and audit.


C.3 Conversational Patterns

1. Question → Answer Loop

<|im_start|>system
You are a concise answering assistant.
<|im_end|>

<|im_start|>user
What is sprint velocity?
<|im_end|>

<|im_start|>assistant
Sprint velocity is a measure of work completed in one iteration,
calculated as total story points finished during a sprint.
<|im_end|>

2. Assistant → Tool → Assistant Chain

<|im_start|>user
Show open issues for Sprint 9.
<|im_end|>

<|im_start|>assistant
Invoking Jira tool to fetch Sprint 9 data.
<|im_end|>

<|im_start|>tool
fetch_jira_tickets(sprint="Sprint 9")
<|im_end|>

<|im_start|>tool
{"open": 3, "closed": 15, "total": 18}
<|im_end|>

<|im_start|>assistant
Sprint 9 has 3 open and 15 closed issues.
<|im_end|>

Explained in: Chapter 6 (Pipeline Orchestration).


3. Contextual Memory Recall

<|im_start|>memory
Loaded context: Sprint 8 summary → Velocity 42 points, 3 issues open.
<|im_end|>

<|im_start|>user
Compare current sprint with previous.
<|im_end|>

See: Chapter 9 (Memory Persistence Layer).


C.4 Advanced Templates – Dynamic Templating (Jinja2 Integration)

Multi-Role Template with Conditional Context

<|im_start|>user
{{ user_query }}
<|im_end|>

<|im_start|>assistant
{% if context %}
Context retrieved: {{ context }}
{% endif %}
Working on your request now…
<|im_end|>

Used in: Chapter 7 examples; allows runtime variable injection.


Iterative Report Template

{% for sprint in sprints %}
<|im_start|>assistant
Sprint {{ sprint.number }} Summary:
- Velocity: {{ sprint.velocity }}
- Closed Issues: {{ sprint.closed }}
<|im_end|>
{% endfor %}

Generates multiple structured ChatML blocks programmatically.


C.5 Tool Invocation Patterns

Standard Function Call

<|im_start|>tool
fetch_velocity(sprint="{{ sprint_name }}")
<|im_end|>

Function + Result Pair

<|im_start|>tool
fetch_velocity(sprint="Sprint 10")
<|im_end|>

<|im_start|>tool
{"velocity": 45}
<|im_end|>

Multi-Tool Sequence

<|im_start|>assistant
Running 2 tools for Sprint review.
<|im_end|>

<|im_start|>tool
fetch_velocity("Sprint 10")
<|im_end|>

<|im_start|>tool
fetch_burndown("Sprint 10")
<|im_end|>

C.6 Observability and Logging Snippets

(From Chapter 10: Testing and Observability)

<|im_start|>system_monitor
Timestamp: 2025-11-11 | Latency: 1.24 s | Tokens: 432
<|im_end|>

<|im_start|>critic
Output validated ✅ | Policy check passed.
<|im_end|>

These roles help trace performance and quality metrics.


C.7 Memory Persistence Snippets

Persistent Store Entry

{"role": "user", "content": "Summarize Sprint 4 retrospective."}
{"role": "assistant", "content": "Noted key lessons: improve testing coverage and communication."}

Vector Recall Message

<|im_start|>memory
Retrieved embedding match: “Team velocity increased by 12% since Sprint 7.”
<|im_end|>

Used in: Chapter 9 examples with Qdrant/LlamaIndex.


C.8 Multi-Agent Role Patterns

(Referenced in Chapter 6 and Appendix B)

Agent Type ChatML Role Description
Planner assistant Breaks tasks into steps
Executor tool Performs actual operations
Critic critic Reviews outputs
Memory Agent memory Persists state/context

Example Conversation

<|im_start|>assistant
Planning project deployment sequence.
<|im_end|>

<|im_start|>tool
deploy_project(environment="staging")
<|im_end|>

<|im_start|>critic
Deployment log verified — no errors found.
<|im_end|>

C.9 Integration-Ready Templates (for LangChain / LlamaIndex)

(Derived from Appendix B)

LangChain Prompt Conversion

messages = chatml_to_langchain([
    {"role": "system", "content": "Assist Agile teams in project tracking."},
    {"role": "user", "content": "Generate summary for Sprint 3."}
])

LlamaIndex Document Serialization

docs = chatml_to_llamaindex(chatml_transcript)
index = VectorStoreIndex.from_documents(docs)

C.10 Specialized Templates for Project Support Bot

Sprint Summary Template

<|im_start|>assistant
Sprint {{ sprint_number }} Summary:
- Velocity: {{ velocity }}
- Open Issues: {{ open }}
- Closed Issues: {{ closed }}
- Highlights: {{ highlights }}
<|im_end|>

Retrospective Report

<|im_start|>assistant
Sprint {{ sprint_number }} Retrospective Insights:
1. What went well – {{ positives }}
2. What to improve – {{ improvements }}
3. Action Items – {{ actions }}
<|im_end|>

Daily Stand-up Template

<|im_start|>user
Yesterday: {{ yesterday }}
Today: {{ today }}
Blockers: {{ blockers }}
<|im_end|>

C.11 Validation and Testing Templates

(Supporting Chapter 10)

Structural Validation Prompt

<|im_start|>critic
Validate that all ChatML blocks contain matching <|im_start|>/<|im_end|> markers.
<|im_end|>

Replay Verification Prompt

<|im_start|>system
Replay ChatML session {{ session_id }} for debug analysis.
Ensure deterministic output consistency.
<|im_end|>

C.12 Template Registry and Versioning

Template Name Description Ver sion Example File
system_init. jinja2 Global system context prompt 1.2 templates/ system_init.jinja2
assistant_reasoning. jinja2 Structured thinking logic 1.0 templates/ assistant_reasoning.jinja2
tool_action. jinja2 Function invocation blueprint 1.1 templates/ tool_action.jinja2
sprint_report. jinja2 Reporting layout for Agile metrics 2.0 templates/ sprint_report.jinja2
critic_review. jinja2 QA and compliance check 1.0 templates/ critic_review.jinja2

Maintain version control using Git or a centralized template registry.


C.13 Closing Summary

This library consolidates all the building blocks discussed throughout the book:

Chapter Concept Template Focus
3 Message Structure Canonical ChatML blocks
5 Design Principles Structured roles & hierarchy
6 Pipelines Role logic and routing
7 Templates (Jinja2) Dynamic rendering
8 Tool Invocation tool role schema
9 Memory Persistence memory role and context replay
10 Observability Logging and testing roles
B Framework Integration LangChain & LlamaIndex adapters

Each snippet provides a ready-to-deploy blueprint for structured, testable LLM conversations.

By reusing these templates, developers ensure that every message remains reproducible, auditable, and semantically clear — the core promise of ChatML.