
Gambit: an open-source agent harness for reliable agents, assistants, and workflows
Gambit is our open-source agent harness for people who need production-friendly planning, grading, and test agents without rebuilding orchestration from scratch.

Context engineering is the way
Context engineering is the new term for what we've been working on at Bolt Foundry: systematically optimizing LLM performance through structured samples, graders, and proper information hierarchy.

Evals from scratch: Building LLM evals with aibff from Markdown and TOML
We built a reliable eval system from Markdown, TOML, and a command-line tool. The system adapts when you change prompts, and we demonstrate it by creating graders for an AI-powered sports newsletter.

From inconsistent outputs to perfect reliability in under an hour
How Velvet increased their citation XML output reliability to 100% in under an hour using LLM attention management principles.

5 things about LLM prompts we think everyone should know
Most teams are building LLM prompts wrong. Here are 5 essential concepts for building reliable LLM applications.