Evaluate with promptfoo
yori export promptfoo turns a composed artifact and its cases into a ready-to-run promptfoo config.
yori is the source of truth for your prompts; for model-graded evaluation it hands off to promptfoo rather than reinventing it.
yori export promptfoo <name> resolves an artifact's composition (includes,
slots) but leaves variables as {{ placeholders }}, and emits a ready-to-run
config:
yori export promptfoo review > promptfooconfig.yaml
promptfoo eval

Test cases live next to the artifact
Cases go in <name>.cases.yaml (or cases.yaml in a skill bundle) — a plain
list of promptfoo test objects, so authoring yori cases is authoring promptfoo
tests:
# review.cases.yaml
- description: flags a real bug
  vars: { lang: go, input: "func f() int { }" }
  assert:
    - { type: contains, value: "return" }
    - { type: llm-rubric, value: "explains the root cause" }

The provider comes from --provider or the artifact's model: hint.
yori export promptfoo review --provider openai:gpt-4o-mini
yori export promptfoo pr-bot --type agent

yori manages and ships your prompts; promptfoo grades them.
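
For orientation, the exported config for the review example might look roughly like the sketch below. The top-level prompts, providers, and tests keys are promptfoo's standard config fields. The prompt wording is invented for illustration, and the exact layout yori emits is an assumption and may differ.

```yaml
# promptfooconfig.yaml (illustrative sketch; the file yori actually emits may differ)
prompts:
  # Composed prompt: includes and slots resolved, variables left as placeholders.
  # The prompt text here is hypothetical.
  - |
    Review the following {{ lang }} code and explain any bugs:
    {{ input }}
providers:
  - openai:gpt-4o-mini   # taken from --provider or the artifact's model: hint
tests:
  # Carried over from review.cases.yaml
  - description: flags a real bug
    vars: { lang: go, input: "func f() int { }" }
    assert:
      - { type: contains, value: "return" }
      - { type: llm-rubric, value: "explains the root cause" }
```

Because the variables stay as placeholders, promptfoo fills them in from each test's vars when it runs the eval, so one exported prompt covers every case.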