If you’ve interacted with ChatGPT or built your own conversational AI, you might have wondered: how exactly does the AI know which parts of a conversation come from the user, which come from the system, and which come from the assistant?
Behind the scenes, OpenAI uses a simple but powerful markup format called ChatML (Chat Markup Language) to structure conversations. While it originated with OpenAI’s models, similar role-based message formatting is now used across the industry: Qwen, Mistral, and various open-source chat models have implemented ChatML-compatible or ChatML-inspired prompt formats, and Anthropic Claude follows the same role-separation pattern to maintain clear conversation context.
In this article, we’ll explore what ChatML is, how it works, and why it matters for building smarter AI systems.
What is ChatML?
ChatML is a lightweight, plain-text markup format designed to give large language models a clear, structured way to understand conversation history.
Instead of sending raw text, developers wrap messages with special tokens that identify the role of the speaker (system, user, assistant, or tool) and the message content.
For example:
```
<|im_start|>system
You are a helpful assistant.
<|im_end|>
<|im_start|>user
What's the capital of France?
<|im_end|>
<|im_start|>assistant
```
Here’s what’s happening:
- system → Sets rules, instructions, or context for the AI.
- user → Represents a message from the end-user.
- assistant → Represents the AI’s reply.
- <|im_start|> & <|im_end|> → Special tokens that mark message boundaries.
Why Does ChatML Exist?
In early LLM implementations, prompts were often long strings with no strict structure. This made them fragile — minor wording changes could break expected behavior.
ChatML solves this by:
- Separating roles clearly → The model knows who said what.
- Making multi-turn conversations stable → No guessing where one message ends and another begins (see the extended example after this list).
- Supporting system-level control → Developers can enforce guidelines (e.g., tone, style, or restrictions).
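To make the multi-turn point concrete, here is the earlier exchange extended by one turn. The assistant’s answer and the follow-up question are invented for illustration, but the boundary tokens leave no ambiguity about where each turn starts and ends:

```
<|im_start|>system
You are a helpful assistant.
<|im_end|>
<|im_start|>user
What's the capital of France?
<|im_end|>
<|im_start|>assistant
Paris.
<|im_end|>
<|im_start|>user
And what is its population?
<|im_end|>
<|im_start|>assistant
```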
Roles in ChatML
| Role | Purpose |
| --- | --- |
| system | Defines the AI’s personality, constraints, and instructions. |
| user | The actual human input. |
| assistant | The AI’s output in the conversation. |
| tool | For calling or simulating API/tool outputs (in some implementations). |
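For example, in implementations that support the tool role, a tool’s output appears as its own bounded message that the model can read before answering (the JSON payload below is purely illustrative):

```
<|im_start|>tool
{"city": "Paris", "temperature_c": 21}
<|im_end|>
```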
Building a ChatML Prompt in Python
Here’s a quick helper function to convert a list of messages into ChatML format:
```python
def to_chatml(messages):
    """Convert a list of {'role': ..., 'content': ...} dicts into ChatML."""
    chatml = ""
    for m in messages:
        chatml += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    chatml += "<|im_start|>assistant\n"  # Leave open for the AI's reply
    return chatml

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Tell me a joke."},
]

print(to_chatml(messages))
```
This produces a properly formatted ChatML string ready for the model.
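For the two messages above, the printed string is:

```
<|im_start|>system
You are a helpful assistant.
<|im_end|>
<|im_start|>user
Tell me a joke.
<|im_end|>
<|im_start|>assistant
```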
Advantages of Using ChatML
- Consistency – Prevents prompt breakage due to formatting errors.
- Flexibility – Works for single-turn and multi-turn conversations.
- Control – Gives developers fine-grained control over model behavior.
- Scalability – Easy to extend for new roles or system instructions.
When to Use ChatML
- Custom LLM Applications – If you’re building a chatbot with models like GPT-3.5, GPT-4, or Qwen.
- Multi-Turn Conversations – Where keeping track of roles is important.
- Prompt Engineering – For reliable, repeatable outputs.
ChatML Beyond OpenAI: How Other LLMs Use It
Although ChatML began as an OpenAI-specific format, its structure has proven so practical that many other large language models have adopted either direct compatibility or ChatML-inspired variations.
Here’s how some popular LLMs approach it:
1. Qwen (Alibaba Cloud)
Qwen models (including Qwen2 and Qwen2.5) support ChatML-style formatting directly. They use the same <|im_start|> and <|im_end|> tokens with roles like system, user, and assistant. This makes it easy for developers to swap prompts between OpenAI models and Qwen without heavy modifications.
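In practice, you rarely need to hand-build the string for Qwen: the tokenizer ships with a chat template that renders ChatML for you. A minimal sketch, assuming the Hugging Face transformers library is installed and using one published Qwen instruct checkpoint:

```python
from transformers import AutoTokenizer

# Qwen instruct models ship a ChatML chat template with their tokenizer.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the capital of France?"},
]

# Render the template as text; add_generation_prompt=True appends the
# open "<|im_start|>assistant" turn, just like our to_chatml() helper.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```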
2. Anthropic Claude
Claude doesn’t use ChatML syntax literally, but it follows the same role-based conversation pattern — separating system instructions, user messages, and assistant replies. Developers often wrap Claude prompts in ChatML-like structures for internal consistency in multi-model applications.
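A common adaptation is to keep messages in the ChatML-style role/content list and split them into the shape Anthropic’s Messages API expects, where system instructions travel as a separate parameter rather than as a message. A minimal sketch (the helper name is ours, not part of any SDK):

```python
def split_for_claude(messages):
    """Split a ChatML-style message list into (system_text, turns).

    Anthropic's Messages API takes system instructions as a separate
    parameter and the user/assistant turns as the messages list.
    """
    system_text = "\n".join(
        m["content"] for m in messages if m["role"] == "system"
    )
    turns = [m for m in messages if m["role"] in ("user", "assistant")]
    return system_text, turns

system_text, turns = split_for_claude([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Tell me a joke."},
])
# Pass system_text as the `system` parameter and turns as `messages`
# when calling the Anthropic client.
```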
3. Mistral / Mixtral
Some Mistral-based chat models on Hugging Face, particularly community fine-tunes, understand ChatML. This helps standardize multi-turn conversations without reinventing formatting rules.
4. Open-Source Fine-Tunes
Many open-source LLaMA-family fine-tunes, such as Vicuna, Alpaca, and WizardLM, adopt ChatML or similar message-separation schemes. Even where the exact tokens differ, the concept of “role + message boundary” reflects ChatML’s influence.
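For a concrete contrast, a Vicuna-style template expresses the same kind of exchange with plain role labels rather than special tokens (a rough sketch of the common v1.1-style format; exact preambles and separators vary between fine-tunes):

```
A chat between a curious user and an artificial intelligence assistant.
USER: What's the capital of France?
ASSISTANT: Paris.
USER: And what is its population?
ASSISTANT:
```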
ChatML Compatibility Across LLMs
| LLM / Model Family | ChatML Support | Notes on Usage |
| --- | --- | --- |
| OpenAI GPT-3.5 / GPT-4 | ✅ Full support | Native format; uses <\|im_start\|> / <\|im_end\|> tokens with roles (system, user, assistant). |
| Qwen / Qwen2 / Qwen2.5 | ✅ Full support | ChatML-compatible; directly understands OpenAI-style role markup. |
| Anthropic Claude | ⚠️ Partial / Adapted | Doesn’t use ChatML tokens but follows the same role/message separation; can be adapted easily. |
| Mistral / Mixtral Chat Models | ⚠️ Partial / Fine-tune dependent | Some fine-tunes understand ChatML; others require a different role-separator format. |
| LLaMA-based Fine-Tunes (Vicuna, WizardLM, etc.) | ⚠️ Partial / Inspired | Often trained with similar role-based prompts, but token formats may differ. |
| Gemini (Google) | ❌ No native support | Uses its own structured prompt format, but conceptually similar in role separation. |
| Falcon Chat Models | ⚠️ Partial / Inspired | Many fine-tunes replicate ChatML-style conversations for compatibility. |
Why This Matters for Developers
By understanding ChatML’s role-based design, you can:
- Switch between models with minimal prompt changes.
- Standardize multi-model pipelines using one consistent conversation format.
- Avoid prompt fragility when moving from prototyping to production.
In short, ChatML isn’t just an OpenAI thing anymore — it’s becoming a de facto standard for structuring chatbot conversations across the LLM ecosystem.
Summary
ChatML might look like simple markup, but it plays a huge role in making conversations with AI structured, predictable, and controllable. If you’re building an app that needs to work across multiple LLMs, it’s smart to create a prompt-formatting layer in your code: one that outputs true ChatML for models that support it and converts to a role-based equivalent for those that don’t.
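A minimal sketch of such a layer, reusing the to_chatml() helper from earlier; the model-family labels and the fallback transcript format are illustrative assumptions, not any particular library’s API:

```python
CHATML_NATIVE = {"openai-gpt", "qwen"}  # illustrative family labels

def format_prompt(messages, model_family):
    """Emit true ChatML for models that support it; otherwise fall back
    to a plain role-labelled transcript that preserves the same turns."""
    if model_family in CHATML_NATIVE:
        return to_chatml(messages)
    # Generic role-based fallback for models without ChatML support.
    lines = [f"{m['role'].upper()}: {m['content']}" for m in messages]
    lines.append("ASSISTANT:")  # leave the reply turn open
    return "\n".join(lines)

print(format_prompt(
    [{"role": "user", "content": "Tell me a joke."}], "qwen"
))
```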