Skip to content

Structured Output

structured() requests a JSON object or array from the model, parses it, and (optionally) validates it against a custom predicate. On parse or validation failure, it sends a repair prompt that includes the previous response and the error, then retries — up to max_retries times.

When to use / when not to use

Use it when… Avoid it when…
You need a JSON dict or list back from the model. You only need free-form text.
You can write a validator (or trust well-formed JSON). Your provider already supports native JSON-mode and gives you better guarantees.
Occasional repair retries are acceptable. Strict latency budget — repair adds an extra LLM call per failed attempt.
Your task fits in a single prompt-response cycle. The task needs tools — use ReAct instead.

Call flow

sequenceDiagram
    participant App
    participant structured
    participant Provider
    participant Validator
    App->>structured: structured(provider, prompt, validator=v, max_retries=3)
    structured->>Provider: complete(json_prompt(prompt), temperature=0.0)
    Provider-->>structured: response.content
    loop up to max_retries + 1
        structured->>structured: extract_json(text)
        alt parse failed
            structured->>structured: last_error = "JSON parse failed: ..."
        else validator returns error
            structured->>Validator: validator(value)
            Validator-->>structured: error string / False
            structured->>structured: last_error = error
        else accepted
            structured-->>App: PatternResult(value, ...)
        end
        structured->>Provider: complete(repair_prompt with previous + error)
        Provider-->>structured: corrected text
    end
    structured-->>App: raise ValueError(last_error)

Minimal example

import asyncio
import os
from executionkit import Provider, structured

def is_valid_user(value: object) -> str | None:
    """Return None for valid, error string for invalid."""
    if not isinstance(value, dict):
        return "Top-level value must be a JSON object."
    missing = {"name", "email", "age"} - value.keys()
    if missing:
        return f"Missing required fields: {sorted(missing)}"
    if not isinstance(value["age"], int) or value["age"] < 0:
        return "'age' must be a non-negative integer."
    return None  # ✓

async def main() -> None:
    async with Provider(
        base_url="https://api.openai.com/v1",
        api_key=os.environ["OPENAI_API_KEY"],
        model="gpt-4o-mini",
    ) as provider:
        result = await structured(
            provider,
            "Extract the user record from: 'Alice, alice@example.com, 32 years old'.",
            validator=is_valid_user,
            max_retries=2,
        )

        print(result.value)                              # {'name': 'Alice', 'email': '...', 'age': 32}
        print(result.metadata["parse_attempts"])         # 1 = parsed cleanly first try
        print(result.metadata["repair_attempts"])        # 0 = no repair needed
        print(result.metadata["validated"])              # True

asyncio.run(main())

Configuration knobs

Parameter Default Description
validator None (value) -> None / True / "" → accept; False / "..." → repair. None accepts any parsed JSON.
max_retries 3 Number of repair attempts after the initial parse. 0 = parse-once-no-repair.
temperature 0.0 Lower = more deterministic JSON.
max_tokens 4096 Per-completion token cap. Must be >= 1.
max_cost None Optional TokenUsage budget across the initial call + all repairs.
retry DEFAULT_RETRY Per-call retry config for transient transport errors (separate from repair retries).

Validator contract

A validator is Callable[[StructuredValue], str | None | bool] where StructuredValue = dict | list:

Return value Treated as
None ✓ Accept
True ✓ Accept
"" (empty string) ✓ Accept
False ✗ Reject with generic message
any other string ✗ Reject; the string becomes the repair prompt's error description

The validator runs only when JSON parses successfully. Parse errors trigger repair regardless.

Metadata keys

Key Type Meaning
parse_attempts int Total parse attempts made (initial + repairs).
repair_attempts int Number of repair calls issued. parse_attempts - 1 when no early success.
validated bool True if the validator accepted, or True when no validator was supplied.

Cost characteristics

  • Best case: 1 LLM call. Initial response parses and validates on the first try.
  • Worst case: 1 + max_retries LLM calls. Each failed attempt costs one repair call.
  • Sequential. Repair depends on the prior response; no parallelism.
  • temperature=0.0 by default keeps repairs predictable.

Errors

Exception Cause
ValueError max_retries < 0 or max_tokens < 1.
ValueError (after exhaustion) All max_retries + 1 attempts failed parse or validation. The message carries the last error.
BudgetExhaustedError max_cost exceeded mid-loop.
RateLimitError / ProviderError Bubbled from Provider.complete after retry exhaustion.

Tips

  • Use a strict validator. Type-check, bound-check, and field-presence-check at the validator level — the model is fast at obeying clear error messages.
  • Schema-as-validator. Pair structured with a Pydantic model:
from pydantic import BaseModel, ValidationError

class User(BaseModel):
    name: str
    email: str
    age: int

def validate(value):
    try:
        User.model_validate(value)
        return None
    except ValidationError as exc:
        return str(exc)

result = await structured(provider, prompt, validator=validate)
  • Keep max_retries low (23). If the model can't produce valid JSON in three tries, your prompt is the problem, not the retry budget.

Source

executionkit/patterns/structured.py