Terminology
A glossary of key terms used throughout Bolt Foundry's AI evaluation and grading
systems.
Grading & Evaluation
Reference Grading {#reference-grading}
Human validation of AI outputs to establish quality standards and ground truth.
Reference grading provides the baseline against which automatic graders are
calibrated.
Usage: "Submit these samples to the reference grading workflow"
Related: Replaces "manual grading" terminology
Automatic Grading {#automatic-grading}
AI-powered evaluation of outputs using trained graders. Automatic grading
handles routine evaluations at scale, with reference grading providing oversight
and calibration.
Usage: "The automatic grading system processed 1,000 samples"
Related: Reference Grading
Ground Truth {#ground-truth}
Note: This term is under review for potential replacement
Reference standards or expected outputs used to validate AI system performance.
Established through reference grading workflows.
Usage: "These samples represent our ground truth for sports relevance"
Related: Reference Grading
Grader {#grader}
An AI system trained to evaluate specific aspects of outputs (e.g., accuracy,
helpfulness, relevance). Graders are calibrated against reference grading to
ensure consistent evaluation.
Usage: "The helpfulness grader scored this response as +2"
Related: Automatic Grading
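To make the idea of calibrating a grader against reference grading concrete, here is a minimal sketch. It assumes signed integer scores (suggested only by the "+2" in the usage line above); the names `GradedSample` and `calibration_report`, and the two summary metrics, are hypothetical illustrations and not part of any Bolt Foundry API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GradedSample:
    """One sample carrying both a reference grade and an automatic grade."""
    sample_id: str
    reference_score: int  # assigned by a human through reference grading
    automatic_score: int  # assigned by the grader being calibrated


def calibration_report(graded: list[GradedSample], tolerance: int = 0) -> dict[str, float]:
    """Summarize how closely a grader tracks the reference grades.

    A sample counts as agreeing when the two scores differ by at most
    `tolerance` points (an assumed, illustrative criterion).
    """
    if not graded:
        raise ValueError("need at least one graded sample")
    diffs = [abs(g.automatic_score - g.reference_score) for g in graded]
    agreeing = sum(1 for d in diffs if d <= tolerance)
    return {
        "agreement_rate": agreeing / len(graded),
        "mean_absolute_error": sum(diffs) / len(graded),
    }


# Hypothetical data: four samples graded by both a human and a grader.
graded = [
    GradedSample("s1", reference_score=2, automatic_score=2),
    GradedSample("s2", reference_score=-1, automatic_score=1),
    GradedSample("s3", reference_score=3, automatic_score=2),
    GradedSample("s4", reference_score=0, automatic_score=0),
]
report = calibration_report(graded)
print(report)  # {'agreement_rate': 0.5, 'mean_absolute_error': 0.75}
```

A low agreement rate in a report like this would be one signal that a grader needs more samples or revised specs before it handles evaluations at scale.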
Sample {#sample}
An instance of AI behavior, input/output pair, or evaluation target. Samples are
used for training, testing, and validating both AI systems and graders.
Usage: "Add more samples to improve grader accuracy"
Note: We use "sample" instead of "example" throughout our systems
AI Systems
Deck {#deck}
A structured framework containing cards, specs, leads, and context that defines
how an AI system should behave. Decks are the foundation of reliable AI
prompting.
Usage: "Deploy the customer service deck"
Components: Cards, Specs, Leads, Context
Card {#card}
A component within a deck that defines personas or behaviors. Cards can specify
core characteristics (persona cards) or actionable protocols (behavior cards).
Usage: "The coding behavior card defines best practices"
Types: Persona cards, Behavior cards
Specs {#specs}
Rules and guidelines within cards that define specific behaviors or
requirements. Specs can be grouped and may include samples for demonstration.
Usage: "The validation specs contain three rules"
Related: Sample
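The relationship between decks, cards, specs, and samples can be pictured as a nested data structure. The sketch below is only an illustration of the definitions above, not Bolt Foundry's actual schema: every class and field name is hypothetical, specs are nested inside cards as the Specs entry describes, and `leads` and `context` are left as loosely typed placeholders because this glossary does not define them.

```python
from dataclasses import dataclass, field


@dataclass
class Sample:
    """An input/output pair used to demonstrate a spec (assumed shape)."""
    input_text: str
    output_text: str


@dataclass
class Spec:
    """A single rule or guideline, optionally illustrated by samples."""
    rule: str
    samples: list[Sample] = field(default_factory=list)


@dataclass
class Card:
    """A persona or behavior definition that groups related specs."""
    name: str
    kind: str  # "persona" or "behavior", per the Card entry
    specs: list[Spec] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.kind not in ("persona", "behavior"):
            raise ValueError(f"unknown card kind: {self.kind!r}")


@dataclass
class Deck:
    """Top-level container describing how an AI system should behave."""
    name: str
    cards: list[Card] = field(default_factory=list)
    leads: list[str] = field(default_factory=list)         # placeholder type (assumption)
    context: dict[str, str] = field(default_factory=dict)  # placeholder type (assumption)

    def spec_count(self) -> int:
        """Total number of specs across all cards in the deck."""
        return sum(len(card.specs) for card in self.cards)


# Hypothetical deck mirroring the usage lines: validation specs with three rules.
validation = Card(
    name="validation",
    kind="behavior",
    specs=[
        Spec("Reject empty input"),
        Spec(
            "Ask for the order number before answering",
            samples=[Sample("Where is my order?", "Could you share your order number?")],
        ),
        Spec("Never reveal internal ticket IDs"),
    ],
)
deck = Deck(name="customer service", cards=[validation])
print(deck.spec_count())  # 3
```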
Product & Process
PromptGrade {#promptgrade}
Bolt Foundry's human-in-the-loop AI evaluation system. Enables automatic grading
with human oversight through reference grading workflows.
Usage: "PromptGrade Phase 6 focuses on automatic grading integration"
Related: Reference Grading, Automatic Grading
RLHF {#rlhf}
Reinforcement Learning from Human Feedback. A training approach that uses human
evaluations as a feedback signal to improve AI system performance.
Usage: "RLHF workflows collect reference grading data"
Related: Reference Grading
Inbox Workflow {#inbox-workflow}
A user interface pattern where items requiring human attention appear in a
centralized inbox for processing. Used extensively in reference grading
workflows.
Usage: "Samples needing reference grading appear in the inbox"
Related: Reference Grading
Legacy Terms
The following terms are deprecated or under review:
- Manual Grading: Use Reference Grading instead
- Human Grading: Use Reference Grading instead
- Ground Truth: Term under review; may be replaced with Reference Grading in some contexts
Last updated: August 28, 2025