The ChatML (Chat Markup Language) Handbook
A Developer’s Guide to Structured Prompting and LLM Conversations
ChatML, LLMs, Prompt Engineering, LangChain, LlamaIndex
Preface

“Prompting is no longer guesswork — it’s dialogue engineering.”
The field of conversational AI is undergoing a profound transformation. What began as intuition-driven prompt crafting has evolved into a discipline grounded in structure, semantics, and reproducibility.
Prompts are no longer ad-hoc strings — they are interfaces between human intent and machine reasoning.
This handbook captures that transformation. It turns the art of prompting into a structured framework for reliable, auditable, and scalable dialogue design — bridging the gap between language and logic.
Purpose of the Book
The goal of this book is to demystify the design philosophy and engineering discipline behind ChatML — the markup language that defines structured prompting for Large Language Models (LLMs).
ChatML introduces explicit roles, message boundaries, and context persistence, eliminating ambiguity and enabling predictable collaboration between humans and machines.
Beyond syntax, this is a book about architecting conversations — how to design systems that can reason, remember, and act.
You’ll see how ChatML connects prompt engineering, software architecture, and system orchestration, and how it powers real-world applications like Support Bot v3.4 built on FastAPI and Ollama (Qwen 2.5).
Who This Book Is For
This book is written for a wide range of readers:
- AI Developers and Engineers building production-grade conversational systems.
- Researchers exploring reasoning, orchestration, and multi-agent collaboration.
- Educators and Students seeking a conceptual framework for dialogue systems.
- Product and Platform Teams integrating LLMs with APIs, memory, and tools.
Whether you’re experimenting with prompt templates or architecting large-scale agentic systems, this handbook offers both a conceptual compass and a practical toolkit.
What’s New in This Edition
This edition unifies conceptual foundations with engineering practice, offering a clear path from structured prompting to structured systems.
Highlights
- Expanded Practical Section — featuring Support Bot v3.4, a full implementation using ChatML + FastAPI + Ollama.
- Dedicated “Support Bot Project” (Part III) — a hands-on example of structured orchestration.
- New Chapters on Tool Execution and Memory Persistence — enabling automation and long-term state.
- Unified Reference Appendix — combining templates, syntax, and framework integration.
- Improved Template Rendering Examples — showcasing dynamic, modular ChatML composition.
How to Use This Book
The chapters are designed to be modular yet interconnected. You can read them sequentially for a complete understanding, or jump to the sections most relevant to your work:
- Start with Part I – Foundations, to understand the principles of ChatML.
- Move to Part II – Engineering, to learn the practical architecture of ChatML systems.
- Explore Part III – The Support Bot Project, for a full implementation walkthrough.
- Refer to Part IV – Ecosystem & Reference, for syntax and framework details.
Use this book as both a guide and a reference — a living bridge between concept and implementation.
Acknowledgments
This work builds upon the collective innovation of the open-source AI community.
From early chatbot prototypes to advanced orchestration frameworks, the contributions of researchers and developers laid the foundation for what we now call dialogue engineering.
Special thanks to the creators of Ollama, LangChain, LlamaIndex, and OpenAI’s ChatML, whose openness and rigor made it possible to bridge theory with practice.
This edition marks the transition of ChatML from a concept to a fully realized engineering framework — one that unites persistent memory, tool invocation, and structured validation into a single conversational architecture.
© 2025 — Ranjan Kumar
All rights reserved.