The Bug That Wasn't a Bug
A team ran a nightly pipeline that extracted structured data from legal documents using an LLM. It had been running reliably for two months.
Then, silently, it failed. No exception thrown, no alert fired. For six weeks the output database had been accepting records with the damage_cap field stored as a string: "$500,000". Not a number. A string.
Downstream analytics, which assumed an integer, had been silently coercing it to zero. Every risk calculation for those six weeks had a damage cap of $0.
Nobody noticed until a lawyer asked why all their high-value contracts showed zero liability exposure.
The LLM had been returning "$500,000" instead of 500000 for six weeks. Every single record. The application code had accepted it. The database had stored it as VARCHAR. The analytics had silently broken.
There was no bug in the code. There was no validation layer.
This is Part 5 of the Harness Engineering series. Part 1 introduced the seven-layer Harness Architecture. This article goes deep on Layer 5 - the Validation and Repair Layer - covering schema validators, semantic checks, repair prompt patterns, and the fail-fast vs. repair decision.
What the Validation Layer Actually Does
In Part 1 I described the Validation and Repair Layer as "the difference between a flaky demo and a production system." Let me be more specific.
The model will produce malformed output. This is not a model quality problem - it is a property of probabilistic systems operating at scale. Even a highly capable model running a well-designed prompt will produce type errors, missing fields, invalid enum values, and structurally broken JSON at some non-trivial rate. Under load, at edge cases, after prompt drift, after context window pressure - the rate increases.
The Validation Layer has two jobs:
Detection - catch every output that doesn't meet the contract your downstream systems expect. Not occasionally. Every time. Before a single malformed record reaches your database, your API, or your user.
Recovery - when validation fails, either repair the output automatically or fail in a controlled, observable, recoverable way. No silent failures. No downstream corruption. No $0 damage caps propagating for six weeks.
I structure this as a Validation Cascade: three tiers of checks applied in sequence, each catching what the previous cannot. Schema first (structure), semantic second (meaning), business rules third (domain constraints). The cascade runs cheapest-first - which is also why it works at production scale.
The model doesn't need to be perfect. It needs to be correctable. The Validation Layer is what makes correction possible.
The Three Validation Tiers
Validation is not a single check. It's a hierarchy of checks, applied in order, each tier catching failures the previous one cannot.
Tier 1: Schema Validation
The first check is structural. Does the output conform to the expected shape? Are required fields present? Are types correct? Is the JSON well-formed?
This is where Pydantic earns its place. Define your expected output as a Pydantic model and let it do the structural validation for you:
```python
from pydantic import BaseModel, field_validator, model_validator
from typing import Literal

class LiabilityClause(BaseModel):
    clause_text: str
    risk_type: str
    severity: Literal["low", "medium", "high", "critical"]
    damage_cap: int | None  # Must be integer, not string
    jurisdiction: str
    effective_date: str | None

    @field_validator("damage_cap", mode="before")
    @classmethod
    def coerce_damage_cap(cls, v):
        if v is None:
            return None
        if isinstance(v, str):
            # Strip currency symbols and commas before failing
            cleaned = v.replace("$", "").replace(",", "").strip()
            try:
                return int(float(cleaned))
            except ValueError:
                raise ValueError(f"damage_cap must be a number, got: {v!r}")
        return v

    @field_validator("severity", mode="before")
    @classmethod
    def validate_severity(cls, v):
        # mode="before" so this message, not Pydantic's Literal error, is raised
        if v not in ("low", "medium", "high", "critical"):
            raise ValueError(f"severity must be one of low/medium/high/critical, got: {v!r}")
        return v

class ExtractionResult(BaseModel):
    clauses: list[LiabilityClause]
    document_type: str
    jurisdiction_detected: str

    @model_validator(mode="after")
    def at_least_one_clause(self):
        if not self.clauses:
            raise ValueError("Extraction must return at least one clause")
        return self
```

Notice the coerce_damage_cap validator. It doesn't just reject "$500,000" - it strips the currency symbol and converts it. This is defensive coercion: attempt to recover the intent before raising an error. The model meant 500000. Get 500000. Log that a coercion happened. Move on.
Defensive coercion handles the long tail of formatting variations without triggering a repair loop for every minor difference. Reserve repair loops for failures that genuinely can't be coerced.
Tier 2: Semantic Validation
Schema validation checks structure. Semantic validation checks meaning.
A response can be structurally valid and semantically wrong. severity: "low" for a clause that says "contractor bears unlimited liability for all damages" is structurally fine and semantically broken.
```python
class SemanticValidator:
    def __init__(self, llm):
        self.llm = llm

    def validate(self, result: ExtractionResult, source_document: str) -> list[str]:
        issues = []
        for clause in result.clauses:
            issues.extend(self._check_severity_consistency(clause))
            issues.extend(self._check_damage_cap_plausibility(clause, source_document))
        return issues

    def _check_severity_consistency(self, clause: LiabilityClause) -> list[str]:
        # Rule-based semantic check: unlimited liability should never be "low"
        unlimited_indicators = [
            "unlimited", "all damages", "any and all", "without limit"
        ]
        text_lower = clause.clause_text.lower()
        if any(ind in text_lower for ind in unlimited_indicators):
            if clause.severity in ("low", "medium"):
                return [
                    f"Clause mentions unlimited liability but severity is '{clause.severity}'. "
                    f"Expected 'high' or 'critical'."
                ]
        return []

    def _check_damage_cap_plausibility(
        self, clause: LiabilityClause, source: str
    ) -> list[str]:
        # Use LLM as judge for complex semantic checks
        if clause.damage_cap is not None and clause.damage_cap > 0:
            prompt = (
                f"Source clause: {clause.clause_text}\n\n"
                f"Extracted damage_cap: ${clause.damage_cap:,}\n\n"
                "Does the extracted damage cap accurately reflect the source clause? "
                "Reply with only 'yes' or 'no'."
            )
            response = self.llm.call(prompt).strip().lower()
            if response == "no":
                return [
                    f"Damage cap ${clause.damage_cap:,} appears inconsistent "
                    f"with source clause text."
                ]
        return []
```

Semantic validation has two modes: rule-based (fast, deterministic, covers known failure patterns) and LLM-as-judge (flexible, catches novel failures, slower and more expensive). Use rule-based for common patterns you can enumerate. Use LLM-as-judge only for complex semantic consistency checks that can't be rule-encoded, and only when the cost of a false negative justifies the additional LLM call.
Tier 3: Business Rule Validation
The third tier enforces constraints that are specific to your domain and invisible to the model.
```python
class BusinessRuleValidator:
    def __init__(self, db):
        self.db = db

    def validate(self, result: ExtractionResult, document_id: str) -> list[str]:
        issues = []

        # Cross-reference: jurisdiction must be in our supported set
        supported = self.db.get_supported_jurisdictions()
        if result.jurisdiction_detected not in supported:
            issues.append(
                f"Jurisdiction '{result.jurisdiction_detected}' not in supported set. "
                f"Supported: {supported}"
            )

        # Uniqueness: no duplicate clause_text within same document
        seen_texts = set()
        for clause in result.clauses:
            if clause.clause_text in seen_texts:
                issues.append(f"Duplicate clause text detected: {clause.clause_text[:50]}...")
            seen_texts.add(clause.clause_text)

        # Referential integrity: document_type must match DB record
        doc_type = self.db.get_document_type(document_id)
        if doc_type and doc_type != result.document_type:
            issues.append(
                f"Extracted document_type '{result.document_type}' "
                f"doesn't match DB record '{doc_type}'"
            )

        return issues
```

Business rule validation is where domain knowledge lives - the constraints your model cannot know because they exist in your systems, not in the document being processed.
The Repair Loop Pattern
When validation fails and defensive coercion can't recover the output, you have a choice: fail fast or repair.
Fail fast when:
- The failure is unrecoverable (the model returned HTML instead of JSON)
- The failure has cascaded past the point where a retry would be coherent
- You've already retried and the same failure recurs
- The cost of retry exceeds the value of the result
Repair when:
- The failure is specific and correctable (wrong type on a known field)
- The schema violation can be described precisely in a prompt
- The model is likely to succeed on a targeted correction request
- The downstream cost of failure exceeds the cost of an additional LLM call
The repair loop is simple in concept and critical in implementation:
```python
import json
from pydantic import BaseModel, ValidationError

class MaxRetriesExceeded(Exception):
    pass

def validated_extraction(
    prompt: str,
    document: str,
    schema: type[BaseModel],
    llm,
    max_retries: int = 3,
) -> BaseModel:
    current_prompt = prompt
    last_error = None

    for attempt in range(max_retries):
        raw = llm.call(current_prompt + f"\n\nDocument:\n{document}")

        # Parse JSON
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError as e:
            last_error = f"Invalid JSON: {e}"
            current_prompt = build_json_repair_prompt(prompt, raw, last_error)
            continue

        # Schema validation
        try:
            return schema.model_validate(parsed)
        except ValidationError as e:
            last_error = format_validation_errors(e)
            current_prompt = build_schema_repair_prompt(prompt, raw, last_error)
            continue

    raise MaxRetriesExceeded(
        f"Validation failed after {max_retries} attempts. Last error: {last_error}"
    )

def build_json_repair_prompt(original_prompt: str, bad_output: str, error: str) -> str:
    return (
        f"{original_prompt}\n\n"
        f"Your previous response was not valid JSON ({error}).\n"
        f"Your previous response was:\n{bad_output}\n\n"
        "Return the same content as well-formed JSON only, with no surrounding text."
    )

def build_schema_repair_prompt(original_prompt: str, bad_output: str, errors: str) -> str:
    return (
        f"{original_prompt}\n\n"
        f"Your previous response contained validation errors:\n{errors}\n\n"
        f"Your previous response was:\n{bad_output}\n\n"
        "Please correct these specific errors and return valid JSON. "
        "Do not change any fields that were correct."
    )

def format_validation_errors(e: ValidationError) -> str:
    lines = []
    for error in e.errors():
        field = " -> ".join(str(loc) for loc in error["loc"])
        lines.append(f"- Field '{field}': {error['msg']} (got: {error.get('input', 'unknown')})")
    return "\n".join(lines)
```

Two implementation details that matter enormously:
"Do not change any fields that were correct." Without this instruction, the model often fixes the errored field but introduces new errors in previously correct fields. Constrain the repair to the failing fields only.
Exponential backoff between retries. If the model fails once, it's unlikely to succeed with the same prompt immediately. Add a small delay between attempts. On the third attempt, consider widening the prompt with additional examples of the correct format.
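The backoff itself is a few lines. A minimal sketch - the base delay and cap values here are illustrative choices, not prescriptions from measurement:

```python
import random

def backoff_delay(attempt: int, base: float = 0.5, cap: float = 8.0) -> float:
    """Exponential backoff with full jitter: ~0.5s, ~1s, ~2s, ... capped at 8s.

    Jitter spreads retries out so a burst of failures doesn't
    hammer the provider in lockstep.
    """
    return random.uniform(0, min(cap, base * (2 ** attempt)))

# Inside the retry loop, before re-prompting:
#     time.sleep(backoff_delay(attempt))
```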
Instructor: The Production Standard
In Part 1, I named Pydantic and Instructor as the industry-standard tools for this layer. Here's why Instructor specifically earns that designation.
Instructor wraps LLM calls with automatic Pydantic-based validation and retry, turning the repair loop above into a single function call:
```python
import instructor
from anthropic import Anthropic

client = instructor.from_anthropic(Anthropic())

result = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=2000,
    messages=[{"role": "user", "content": f"Extract liability clauses:\n{document}"}],
    response_model=ExtractionResult,  # Your Pydantic model
    max_retries=3,                    # Automatic repair loop
)

# result is already a validated ExtractionResult instance
```

Instructor handles JSON parsing, schema validation, and retry with error feedback automatically. For schema validation, it is the right default. Build on top of it for semantic and business rule validation - those remain your responsibility.
If you are writing your own JSON parsing loop without Instructor, you are solving a solved problem. Stop and use Instructor.
The Named Pattern: Validation Cascade
I call the three-tier structure the Validation Cascade: schema first, semantic second, business rules third.
The cascade runs in this order for a reason. Schema validation is cheap and fast - it runs entirely in memory with no LLM calls. Semantic validation is more expensive - it may involve an LLM-as-judge call. Business rule validation requires database access.
Run the cheapest check first. Gate the expensive checks behind the cheap ones.
A schema failure short-circuits the cascade - no point running semantic validation on output that can't even be parsed. A schema pass but semantic failure short-circuits business rule validation. Only output that passes all upstream tiers reaches the final tier.
This cascade structure means your expensive checks only run on output that has already cleared the cheap ones - dramatically reducing the cost of comprehensive validation.
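Wired together, tiers 2 and 3 reduce to a short orchestration function (tier 1 has already passed by the time you hold a validated Pydantic instance). A sketch, assuming the validator classes shown earlier - the function name and return shape are mine:

```python
def run_cascade(result, source_document, document_id,
                semantic_validator, business_validator) -> tuple[str, list[str]]:
    """Run validation tiers cheapest-first, short-circuiting on failure.

    Returns (tier_name, issues): the first tier that reported issues,
    or ("passed", []) if all tiers cleared.
    """
    # Tier 2: semantic checks (may involve an LLM-as-judge call)
    issues = semantic_validator.validate(result, source_document)
    if issues:
        return ("semantic", issues)  # skip DB-backed checks entirely

    # Tier 3: business rules (requires database access)
    issues = business_validator.validate(result, document_id)
    if issues:
        return ("business", issues)

    return ("passed", [])
```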
What Observability Looks Like for This Layer
Validation failure rate by tier - what fraction of outputs fail at schema vs. semantic vs. business rule? A high schema failure rate signals prompt drift or model degradation. A high semantic failure rate signals the model is misunderstanding the task. A high business rule failure rate signals your domain constraints are tighter than the model's outputs.
Repair success rate - of outputs that fail validation and enter the repair loop, what fraction succeed on retry? Below 70% suggests your repair prompts need improvement. Above 95% suggests you could add more aggressive validation - the model corrects reliably enough to warrant stricter checks.
Defensive coercion rate - how often does Pydantic coerce a value rather than reject it? A rising coercion rate on a specific field signals the model is developing a systematic formatting habit that differs from your schema. Address it in the prompt before it becomes a repair loop dependency.
Retry distribution - what fraction of successes required 1 retry vs. 2 vs. 3? If most successes require 3 retries, your base prompt needs work. If most succeed on first attempt, your validation is healthy.
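All four signals reduce to a handful of counters. A minimal in-process sketch - class and method names are mine, and a real deployment would emit these to a metrics backend (Prometheus, StatsD) rather than hold them in memory:

```python
from collections import Counter

class ValidationMetrics:
    """Counters for the validation-layer signals described above."""

    def __init__(self):
        self.failures = Counter()   # validation failures, keyed by tier
        self.retries = Counter()    # successes, keyed by retries needed
        self.coercions = Counter()  # defensive coercions, keyed by field
        self.exhausted = 0          # records that failed all retries

    def record_failure(self, tier: str):
        self.failures[tier] += 1

    def record_success(self, retries_used: int):
        self.retries[retries_used] += 1

    def record_coercion(self, field: str):
        self.coercions[field] += 1

    def record_exhausted(self):
        self.exhausted += 1

    def repair_success_rate(self) -> float:
        # Of records that entered the repair loop, how many recovered?
        repaired = sum(n for r, n in self.retries.items() if r > 0)
        entered = repaired + self.exhausted
        return repaired / entered if entered else 1.0
```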
What to Build First
First: Pydantic models for all structured outputs. If you're calling json.loads() without Pydantic validation on the result, you have unvalidated LLM output in production. Add Pydantic models today.
Second: Install Instructor. Replace manual retry loops with Instructor's automatic validation and retry. Immediately reduces boilerplate and improves repair consistency.
Third: Defensive coercion for common type mismatches. Add field validators for the types the model consistently gets wrong: currency strings to integers, date strings to date objects, percentage strings to floats.
Fourth: Schema repair prompts. Write targeted repair prompts for your most common validation failures. Don't rely on Instructor's default error messages - customize them to explain the correct format explicitly.
Fifth: Rule-based semantic validators. Add checks for the semantic failures you can enumerate from your domain knowledge. Unlimited liability with low severity. Negative damage caps. Contradictory date ranges.
Sixth: LLM-as-judge for complex semantic validation. Add LLM-as-judge checks only for semantic failures you cannot encode as rules. Be selective - each LLM-as-judge call doubles your inference cost for that record.
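The coercion step (third in the list above) can be kept as plain helper functions and wired into models via mode="before" field validators. A sketch - the helper names and the exact formats handled are illustrative assumptions, not an exhaustive list:

```python
from datetime import date, datetime

def coerce_date(v):
    """Accept '2024-03-15' or '03/15/2024' strings; pass date objects through."""
    if v is None or isinstance(v, date):
        return v
    for fmt in ("%Y-%m-%d", "%m/%d/%Y"):
        try:
            return datetime.strptime(v.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {v!r}")

def coerce_percentage(v):
    """Accept '12.5%' strings or bare numbers; return a float fraction."""
    if isinstance(v, str) and v.strip().endswith("%"):
        return float(v.strip().rstrip("%")) / 100
    return float(v)
```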
The Principle
The model is a probabilistic system. Probabilistic systems produce incorrect outputs at some non-zero rate. That rate is not zero even for the best models on the best prompts.
Your validation layer is what transforms a probabilistic system into a reliable one.
Not by making the model perfect - that's not the goal and not achievable. By catching every deviation from the contract your downstream systems depend on, and either correcting it automatically or failing in a way that is observable, recoverable, and auditable.
The $0 damage caps ran for six weeks because there was no validation layer. There was no mechanism to catch the deviation, no alert when it started, no visibility into how long it had been happening.
Silent failures are the most expensive kind. The Validation Cascade makes LLM failures loud, caught, and correctable - before they reach your database, your users, or your lawyers.
The Rest of the Series
- Part 1: Harness Engineering - The Missing Layer - The full seven-layer Harness Architecture overview
- Part 2: Normalization and Input Defense - Prompt injection, input sanitization, and multi-surface consistency
- Part 3: Context Engineering - Memory architectures, retrieval strategies, and context compression
- Part 4: Gated Execution - Policy engines, human-in-the-loop design, and dry-run patterns
- Part 6: Retry, Fallback, and Circuit Breaking - Building resilient LLM infrastructure that survives model outages and latency spikes
- Part 7: State Management for Agentic Systems - Checkpoint-resume strategies, cross-session memory, and durable state for long-running agents
- Part 8: Deterministic Constraint Systems - Building tool registries and action manifests that prevent hallucinated actions in agentic systems
References
- Liu, J. (2023). Instructor: Structured outputs for LLMs. https://github.com/instructor-ai/instructor
- Pydantic. (2024). Pydantic V2 Documentation. https://docs.pydantic.dev/latest/
- Ouyang, L., Wu, J., Jiang, X., et al. (2022). Training language models to follow instructions with human feedback. arXiv:2203.02155. https://arxiv.org/abs/2203.02155
- Shinn, N., Cassano, F., Labash, B., et al. (2023). Reflexion: Language Agents with Verbal Reinforcement Learning. arXiv:2303.11366. https://arxiv.org/abs/2303.11366
- Zheng, L., Chiang, W. L., Sheng, Y., et al. (2023). Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena. NeurIPS 2023. https://arxiv.org/abs/2306.05685
- Chase, H. (2022). LangChain: Building applications with LLMs through composability. https://github.com/langchain-ai/langchain
- Anthropic. (2024). Tool use with Claude. https://docs.anthropic.com/en/docs/build-with-claude/tool-use
Related Articles
- Normalization and Input Defense: Hardening the Entry Point of Your LLM System
- Harness Engineering: The Missing Layer Between LLMs and Production Systems
- Retry, Fallback, and Circuit Breaking: Building LLM Infrastructure That Survives Outages
- Context Engineering: What the Model Sees Is What the Model Does