πŸ” How SynthID Works

A Visual Journey Through AI Watermarking

1

The Problem: Detecting AI-Generated Text

🤖
❓
Is this text from AI or a human?

As AI models like ChatGPT and Gemini generate billions of words daily, we need a way to identify which text comes from AI systems.


The Challenge: The watermark must be completely invisible to human readers while remaining detectable by computer systems.

2

How LLMs Choose Words

When generating text, AI models predict the next word based on probability scores. Let's see an example:
"My favorite tropical fruits are mango and ___"
🍌 bananas 85%
High Probability
🥭 papaya 10%
Medium
✈️ airplanes 0.1%
Very Low
Key Insight: When several words are similarly likely, the model has flexibility in which one to choose. This is where watermarking happens!
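To make the word-choice step concrete, here is a minimal Python sketch of sampling from next-word probabilities. The numbers come from the example above; the `<other>` bucket, the function name `sample_next_word`, and the fixed seed are assumptions added for illustration, not part of any real model.

```python
import random

# Illustrative next-word probabilities from the example above; the
# "<other>" bucket is an assumption so the probabilities sum to 1.
next_word_probs = {
    "bananas": 0.85,
    "papaya": 0.10,
    "airplanes": 0.001,
    "<other>": 0.049,
}

def sample_next_word(probs, rng):
    """Draw one word at random, weighted by its probability."""
    words = list(probs)
    weights = [probs[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]

rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [sample_next_word(next_word_probs, rng) for _ in range(1000)]
print(draws.count("bananas"), draws.count("papaya"), draws.count("airplanes"))
```

Because the choice is a weighted random draw rather than "always pick the top word", there is room to nudge the draw without making the output sound strange.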
3

Step 1: The Secret Key

🔑
SECRET CRYPTOGRAPHIC KEY
Known only to Google

Google generates a secret cryptographic key that determines how the watermark is embedded.


This key is like a recipe that tells the AI model which words to favor slightly during generation.


Important: Without this key, the watermark is designed to be impractical to detect or forge.

4

Step 2: Dividing Words into Green & Red Lists

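In simplified terms, SynthID uses the secret key to split the vocabulary into two groups (the published algorithm, called tournament sampling, scores words with the key rather than keeping literal lists, but the idea is similar):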
✅
GREEN LIST
Words that get a slight boost in probability
delicious tropical sweet
❌
RED LIST
Words that get slightly suppressed in probability
tasty exotic juicy
The Magic: These lists change based on the previous words! The pattern looks random but is actually determined by the secret key.
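One way to picture a key-dependent split is a keyed hash of the previous words plus the candidate word. The sketch below uses Python's standard `hmac` module; the key, the function name `is_green`, and the one-bit rule are hypothetical choices for illustration and are not taken from SynthID's code.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-key-not-a-real-one"  # hypothetical key for illustration

def is_green(candidate, previous_words, key=SECRET_KEY):
    """Return True if `candidate` lands on the green list for this context."""
    context = " ".join(previous_words)
    message = (context + "\x00" + candidate).encode("utf-8")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    # Use one bit of the keyed hash: about half of all words come out green.
    return digest[0] % 2 == 0

vocab = ["delicious", "tropical", "sweet", "tasty", "exotic", "juicy"]
for context in (["mango", "and"], ["was", "absolutely"]):
    green = [w for w in vocab if is_green(w, context)]
    print(context, "->", green)
```

The same word can be green after one context and red after another, and anyone without the key sees a split that looks random.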
5

Step 3: Subtle Probability Adjustments

When generating each word, SynthID makes tiny adjustments to favor green list words:

👤 Normal Model

delicious: 30%
tasty: 30%
amazing: 25%
Top choices similarly likely

πŸ” With SynthID

delicious: 33% ✅
tasty: 27% ❌
amazing: 28% ✅
Green words boosted slightly
Crucially: These adjustments are small (2-3 percentage points in this example), so they don't change the meaning or quality of the text. Humans can't tell the difference!
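A minimal sketch of such an adjustment, assuming the boost is applied by adding a small constant `delta` to the log-probability of each green word and then renormalizing. The `<other>` bucket, the green list, and the value of `delta` are assumptions for illustration, so the output will be close to the figures above but not identical.

```python
import math

def boost_green(probs, green_words, delta):
    """Add `delta` to the log-probability of green words, then renormalize."""
    weights = {
        word: p * math.exp(delta) if word in green_words else p
        for word, p in probs.items()
    }
    total = sum(weights.values())
    return {word: w / total for word, w in weights.items()}

# Probabilities from the figure above; "<other>" is an assumed bucket
# holding the remaining 15% so the distribution sums to 1.
base = {"delicious": 0.30, "tasty": 0.30, "amazing": 0.25, "<other>": 0.15}
green = {"delicious", "amazing"}  # hypothetical green list for this step

adjusted = boost_green(base, green, delta=0.2)  # delta is an assumed strength
for word in base:
    print(f"{word}: {base[word]:.1%} -> {adjusted[word]:.1%}")
```

Renormalizing matters: whatever the green words gain has to come out of the red words, so the probabilities still add up to 100%.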
6

Step 4: Building the Statistical Pattern

As the model generates text, it consistently favors green list words. Over 200 or more words, this creates a detectable statistical pattern:
The weather was absolutely perfect for a tropical vacation. Mango trees swayed gently in the breeze
65%
Green list words (vs 50% expected by chance)

In unwatermarked text, you'd expect about 50% green words and 50% red words by random chance.


With SynthID, the text contains significantly more green words, creating the watermark signature!
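The gap between 50% and 65% can be related to how strong the boost is. The sketch below assumes a green-list scheme that adds a constant `delta` to green log-probabilities, and that green words hold half of the probability mass before any adjustment; both are assumptions for illustration, and the function names are hypothetical.

```python
import math

def expected_green_share(green_mass, delta):
    """Green probability mass after adding `delta` to green log-probabilities."""
    boosted = green_mass * math.exp(delta)
    return boosted / (boosted + (1.0 - green_mass))

def delta_for_share(green_mass, target_share):
    """Boost strength needed to move `green_mass` up to `target_share`."""
    odds_before = green_mass / (1.0 - green_mass)
    odds_after = target_share / (1.0 - target_share)
    return math.log(odds_after / odds_before)

delta = delta_for_share(0.5, 0.65)
print(round(delta, 3), round(expected_green_share(0.5, delta), 3))
```

Under these assumptions, a boost of roughly 0.62 in log-probability is what it takes to move the average green share from 50% to 65%.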

7

Step 5: Detection with the Secret Key

📄
Suspect Text
→
🔑
Apply Secret Key
→
🔍
Statistical Test
→
✅
Watermark Found!

The detector reconstructs the green/red lists using the secret key and counts how many green words appear in the text.


If significantly more than 50% are green words, the text is watermarked. The more words you analyze, the more confident the detection becomes.

Minimum requirement: SynthID needs at least 200 tokens (roughly words or word pieces) to reliably detect the watermark.
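The "significantly more than 50%" check can be sketched as a one-sided z-test on the green count. This is a generic statistical test, not SynthID's actual detector; the function names and the threshold of 4.0 are assumptions chosen for illustration.

```python
import math

def green_z_score(green_count, total_tokens, chance_rate=0.5):
    """Standard deviations by which the green count exceeds chance."""
    expected = chance_rate * total_tokens
    std_dev = math.sqrt(total_tokens * chance_rate * (1.0 - chance_rate))
    return (green_count - expected) / std_dev

def looks_watermarked(green_count, total_tokens, threshold=4.0):
    """Flag text whose z-score clears an assumed threshold."""
    return green_z_score(green_count, total_tokens) >= threshold

# The same 65% green share at three different lengths.
for greens, total in ((13, 20), (65, 100), (130, 200)):
    z = green_z_score(greens, total)
    print(total, round(z, 2), looks_watermarked(greens, total))
```

At 65% green, 20 tokens give a z-score of about 1.3, 100 tokens give 3.0, and 200 tokens give about 4.2. The share of green words is identical in all three cases; only the longer samples make it unlikely to be a coincidence, which is why short text is hard to check.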
8

Why Humans Can't See It

👁️
HUMAN PERCEPTION
"The weather was perfect"

vs

"The weather was ideal"
Both sound equally natural ✓
🤖
MACHINE DETECTION
perfect = GREEN ✅

ideal = RED ❌
Pattern detected! ✓

Since the probability changes are tiny (2-3%), the AI still picks natural-sounding words. Text quality stays effectively the same, so humans genuinely cannot tell the difference.


But machines with the secret key can detect the statistical bias toward green words that accumulates across the entire text.

9

Real-World Performance: Google's Test Results

20 Million
Gemini responses tested with real users

✅ What Works

  • Long-form content (essays, stories)
  • Creative open-ended prompts
  • Survives light editing
  • No quality degradation
  • Only 7% latency overhead

❌ Limitations

  • Factual queries with one right answer (e.g., "The capital of France is Paris")
  • Short text (under about 200 tokens)
  • Code generation
  • Heavy editing/translation
  • Can be removed by attackers
User Verdict: When users rated responses via thumbs up/down, they showed NO PREFERENCE between watermarked and unwatermarked text. The watermark truly is invisible!
10

The Bigger Picture: Why This Matters

SynthID represents the first large-scale deployment of AI text watermarking.

By open-sourcing it in October 2024, Google enabled researchers worldwide to:

  • Build better watermarking methods
  • Test security vulnerabilities
  • Develop detection tools
  • Understand fundamental limitations
🌍
Global Impact
EU AI Act Compliance
Combat Misinformation
Academic Integrity
The Paradox: Open-sourcing revealed that watermarking works well enough for production... but also exposed fundamental vulnerabilities that make it far from a complete solution.
✨

Summary: The SynthID Recipe

  1. 🔑 Generate secret key (known only to Google)
  2. 📋 Split vocabulary into green (boosted) and red (suppressed) lists
  3. 📈 Adjust probabilities slightly during text generation (2-3%)
  4. 📝 Generate text that naturally favors green words
  5. 🔍 Detect pattern by counting green words with secret key
  6. ✅ Verify watermark with statistical confidence
🎯 Key Takeaway
"The watermark is imperceptible to humans but detectable by machines:
a mathematical fingerprint hidden in plain sight."

💡 Want to learn more about AI watermarking?

Check out my full article: How Google’s SynthID Actually Works: A Visual Breakdown

GitHub: SynthID Text | Nature Paper (2024): Scalable watermarking for identifying large language model outputs