AI Validator
The AI validator lets you write acceptance criteria in natural language and have an LLM evaluate the submission against them. It's the right tool when a check is genuinely subjective — "does this README explain installation, configuration, and licensing?" — or when the format is loose enough that a formal schema would be brittle.
The AI validator is not a replacement for schemas, CEL, or simulations. It's complementary: use it for things humans would naturally evaluate by reading, not for anything you can encode deterministically.
What you'll need
- A Validibot account with permission to author workflows.
- A workflow whose allowed file types include the format you want to check (typically PDF, text, Markdown, or JSON).
- Your acceptance criteria, written as plain-English sentences.
- An LLM provider configured for your deployment. See Self-Hosted Editions for which providers Validibot supports and how to wire credentials in. On Validibot Cloud, an LLM is already configured for you.
Setting up an AI step
- Open the workflow editor and click Add step.
- Pick AI from the validator library.
- Give the step a name like "README quality gate" and a short description.
- Write your acceptance criteria in the Rules field, one rule per line or as a numbered list. The clearer and more specific each rule is, the more reliably it can be evaluated.
- (Optional) Pick a model tier — most deployments expose a fast tier for quick checks and a stronger tier for deeper reasoning.
- Click Save step.
Writing good AI criteria
The validator's reliability depends almost entirely on how you phrase the rules. A few patterns that work:
- Be specific. "The document covers installation" is vague. "The document explains how to install the package with `pip install` and lists supported Python versions" is testable.
- One concept per rule. Compound rules ("Has installation AND testing AND licensing") produce muddier findings than three separate rules.
- Say what must be present, not what should not be. "The document mentions a license" works better than "the document does not omit licensing information."
- Anchor on observable behaviour. "Includes a runnable example" is testable. "Is engaging" is not.
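The patterns above can be approximated as a rough pre-flight check before you paste rules into the Rules field. The sketch below is not a Validibot feature: `lint_rule`, the word list, and the thresholds are all hypothetical, and the matching is deliberately crude keyword matching.

```python
import re

# Hypothetical heuristics for illustration only; not part of Validibot.
VAGUE_WORDS = {"engaging", "good", "clear", "appropriate", "nice"}
NEGATIONS = re.compile(r"\b(not|never|no|without|omit)\b", re.IGNORECASE)
CONJUNCTION = re.compile(r"\band\b", re.IGNORECASE)


def lint_rule(rule: str) -> list[str]:
    """Return style warnings for one natural-language rule."""
    warnings = []
    words = set(re.findall(r"[a-z]+", rule.lower()))
    if words & VAGUE_WORDS:
        warnings.append("subjective wording")
    if NEGATIONS.search(rule):
        warnings.append("phrased as a negative")
    # Two or more "and"s is treated as a hint that the rule bundles concepts.
    if len(CONJUNCTION.findall(rule)) >= 2:
        warnings.append("possibly compound")
    return warnings


for rule in [
    "The document mentions a license",
    "The document does not omit licensing information",
    "Has installation AND testing AND licensing",
    "Is engaging",
]:
    print(rule, "->", lint_rule(rule))
```

A check like this only catches surface patterns. It is a prompt to reread the rule, not a judgment on whether the rule is testable.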
What the validator reports
For each rule, the AI validator returns a pass or fail verdict plus a short explanation. Each finding names the rule that failed and gives the LLM's reasoning, so submitters can act on it.
The validator is conservative: an ambiguous case is reported as a warning rather than silently passing, and a rule the model cannot evaluate reliably is surfaced as a finding rather than answered with a fabricated verdict.
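To make the reporting behaviour concrete, here is a minimal sketch of how a consumer might model findings. The `Finding` and `Status` types and their field names are assumptions made for illustration; the actual shape of Validibot's findings is not described in this page.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PASS = "pass"
    FAIL = "fail"
    WARNING = "warning"  # ambiguous, or the model could not evaluate the rule


@dataclass(frozen=True)
class Finding:
    # Hypothetical fields: the rule text, the verdict, and the model's reasoning.
    rule: str
    status: Status
    explanation: str


def needs_attention(findings: list[Finding]) -> list[Finding]:
    """Keep everything that is not a clean pass, so warnings are never dropped."""
    return [f for f in findings if f.status is not Status.PASS]


findings = [
    Finding("The document mentions a license", Status.PASS, "MIT license named."),
    Finding("Includes a runnable example", Status.WARNING, "Snippet may be partial."),
    Finding("Lists supported Python versions", Status.FAIL, "No versions found."),
]
for f in needs_attention(findings):
    print(f.status.value, "-", f.rule)
```

Treating warnings the same as failures in the filter mirrors the conservative stance described above: nothing ambiguous is silently counted as a pass.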
A note on determinism and trust
LLM evaluations are not bit-for-bit deterministic the way a JSON Schema check is. Two runs of the same submission against the same rules can produce slightly different findings — usually agreeing on the headline result, occasionally differing on edge phrasing.
That makes the AI validator a good gate ("does this look roughly right?") but a weaker audit trail ("here is the canonical reason this passed"). When a check has compliance implications, prefer a deterministic validator (schema, SHACL, FMU, EnergyPlus) and use the AI validator alongside it, not as a replacement.
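One way to gauge how stable a rule is, assuming you can collect the verdict from several runs of the same submission, is to measure how often the runs agree. The helper below is a hypothetical sketch and the verdicts are made-up sample data, not Validibot output.

```python
from collections import Counter


def agreement(verdicts: list[str]) -> tuple[str, float]:
    """Return the most common verdict and the fraction of runs that share it."""
    if not verdicts:
        raise ValueError("need at least one verdict")
    top, count = Counter(verdicts).most_common(1)[0]
    return top, count / len(verdicts)


# Hypothetical verdicts from five runs of one rule against one submission.
runs = ["pass", "pass", "fail", "pass", "pass"]
verdict, share = agreement(runs)
print(verdict, share)  # pass 0.8
```

A rule whose agreement is well below 1.0 on a fixed sample is a candidate for rewording, or for replacement with a deterministic check.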
Cost and rate
Each AI step calls an LLM, which has cost and latency. For high-volume workflows:
- Put cheap, deterministic checks (schemas, CEL) first — the AI step only runs when its predecessors pass.
- Pick the smallest model that handles your rules well.
- On Validibot Cloud, see your plan's metered usage page for current AI-validation pricing.
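The ordering advice above amounts to short-circuit evaluation: the metered step is never reached when a cheap check fails. The sketch below illustrates the idea in plain Python; `run_steps`, `schema_check`, and `ai_check` are hypothetical stand-ins, not Validibot's workflow engine.

```python
from typing import Callable

Step = tuple[str, Callable[[str], bool]]


def run_steps(submission: str, steps: list[Step]) -> list[str]:
    """Run steps in order, stop at the first failure, and return the names that ran."""
    ran = []
    for name, check in steps:
        ran.append(name)
        if not check(submission):
            break
    return ran


def schema_check(text: str) -> bool:
    # Stand-in for a cheap deterministic check.
    return text.strip().startswith("{")


def ai_check(text: str) -> bool:
    # Stand-in for the metered LLM call; a real step would call the provider here.
    return True


steps = [("schema", schema_check), ("ai", ai_check)]
print(run_steps("not json", steps))  # ['schema']
print(run_steps('{"a": 1}', steps))  # ['schema', 'ai']
```

In the first call the AI stand-in is never invoked, which is where the cost saving comes from in a high-volume workflow.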
File types
The AI validator can read any file Validibot's text extraction stack supports — typically PDF, plain text, Markdown, HTML, and JSON. Image and audio support depend on the configured provider; check the provider page in your deployment's admin for what's available.
Tips
- Treat the AI as a first reader, not a final arbiter. Use it to filter out obvious misses; reserve human review for the borderline cases.
- Iterate on your rules with a known-good and a known-bad sample. Run both through the step, compare the findings, and refine the wording until each rule fires when it should and stays quiet when it shouldn't.
- Don't ask the LLM to compute. "Does the total equal the sum of line items?" should be a CEL rule, not an AI rule.
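The last tip is easiest to see with the arithmetic written out. In Validibot this check would be a CEL rule; the Python sketch below only illustrates why it belongs in deterministic code. The field names `total`, `line_items`, and `amount` are assumptions about the submission's shape.

```python
from decimal import Decimal


def total_matches(invoice: dict) -> bool:
    """Check that the stated total equals the sum of line-item amounts exactly."""
    line_sum = sum(
        (Decimal(item["amount"]) for item in invoice["line_items"]),
        Decimal("0"),
    )
    return line_sum == Decimal(invoice["total"])


# Hypothetical submission; amounts are strings so Decimal parses them exactly.
invoice = {
    "total": "30.30",
    "line_items": [{"amount": "10.10"}, {"amount": "20.20"}],
}
print(total_matches(invoice))  # True
```

The result is the same on every run and costs nothing per call, neither of which an LLM evaluation can promise.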
Where to next
- Validators Overview — when to choose AI vs. a deterministic validator.
- Basic Validator — the deterministic option for cross-field rules.
- Self-Hosted Editions — configuring an LLM provider for self-hosted deployments.