The Foundry

Where agents are forged, tested, and calibrated. The seventh universe component. Owner: Voss Praxis.

See also: Voss Praxis | Mira Strand | Tick Engine | Observatory

Status: IN DESIGN — concept defined, architecture described. No product implementation yet.

What It Is

The Foundry is where agents are forged, tested, and calibrated. It’s the only universe component that operates on the crew itself rather than on the work the crew produces.

Every other component has a clear domain: the Pipeline moves work, the Observatory gathers intelligence, the HUD renders state, the Wormholes channel communication, the Universe holds configuration. The Foundry holds the question: are the agents good enough?

Owner: Voss Praxis — The Tuner.

The Three Phases

1. Forge

Creation. New agents are built here.

A Forge operation produces a SKILL.md — the complete definition of an agent’s personality, capabilities, voice, knowledge scope, and operating constraints. The SKILL.md is the agent’s DNA. Everything the agent does flows from what’s encoded there.

Inputs:

A need (identified by Margot, Sal, or the human)
Domain knowledge (existing documentation, patterns, examples)
Crew context (how this agent will interact with existing members)

Outputs:

SKILL.md — agent definition
Interactive skill (/sector137:{name}) — conversational access
Calibration baseline — initial quality measurements
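
Since the Foundry has no product implementation yet, the following is only a sketch of how a Forge operation's inputs and outputs could be modeled. The names `ForgeRequest`, `ForgeResult`, and `forge`, and the SKILL.md section headings, are all hypothetical. Only the `/sector137:{name}` pattern comes from the design above.

```python
from dataclasses import dataclass, field


@dataclass
class ForgeRequest:
    """Hypothetical container for the three Forge inputs."""
    name: str
    need: str                    # identified by Margot, Sal, or the human
    domain_knowledge: list[str]  # documentation, patterns, examples
    crew_context: str            # how the agent interacts with existing members


@dataclass
class ForgeResult:
    """Hypothetical container for the three Forge outputs."""
    skill_md: str                # agent definition
    interactive_skill: str       # conversational access path
    baseline: dict[str, float] = field(default_factory=dict)


def forge(request: ForgeRequest) -> ForgeResult:
    # Render a bare-bones SKILL.md; these headings are illustrative only.
    lines = [
        f"# {request.name}",
        "",
        "## Need",
        request.need,
        "",
        "## Knowledge scope",
        *[f"- {item}" for item in request.domain_knowledge],
        "",
        "## Crew context",
        request.crew_context,
    ]
    return ForgeResult(
        skill_md="\n".join(lines),
        interactive_skill=f"/sector137:{request.name}",
        baseline={},  # placeholder: no measurements exist before the agent runs
    )
```

In this sketch the calibration baseline starts empty. How the initial quality measurements would be gathered is not specified in the design.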

2. Temper

Evaluation and improvement. Existing agents are tested against their own standards.

A Temper operation reviews an agent’s recent outputs, measures quality against benchmarks, identifies drift patterns, and produces either a coaching brief (for the agent’s operator to address) or a direct SKILL.md refinement (for persistent improvement).

Inputs:

Agent outputs from recent sessions
Coaching briefs from Mira (quality drift detection)
Langfuse telemetry (latency, token efficiency, tool call patterns)
Human feedback (explicit or inferred from overrides)

Outputs:

Quality assessment report
SKILL.md refinements (if warranted)
Coaching recommendations (for Mira to deliver)
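
The design does not say how a Temper operation chooses between a coaching brief and a SKILL.md refinement. The sketch below assumes one possible rule: a shortfall across several consecutive recent sessions is treated as persistent drift and gets a refinement, while any other shortfall gets a coaching brief. The function name `temper`, the `persistent_after` threshold, and the action labels are hypothetical.

```python
from statistics import mean


def temper(scores: list[float], benchmark: float, persistent_after: int = 3) -> dict:
    """Assess recent output scores (non-empty, oldest first) against a benchmark.

    Assumed rule, not taken from the design: if the last `persistent_after`
    sessions all fall below the benchmark, recommend a SKILL.md refinement;
    if only some sessions fall below, recommend a coaching brief.
    """
    below = [s < benchmark for s in scores]
    recent = below[-persistent_after:]
    persistent = len(recent) == persistent_after and all(recent)

    if persistent:
        action = "skill_md_refinement"
    elif any(below):
        action = "coaching_brief"
    else:
        action = "none"

    return {
        "mean_score": mean(scores),
        "sessions_below_benchmark": sum(below),
        "action": action,
    }
```

With a benchmark of 0.8, the scores `[0.9, 0.7, 0.9]` yield a coaching brief under this rule, while `[0.9, 0.7, 0.6, 0.75]` yield a refinement because the last three sessions all fall short.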

3. Calibrate

Optimization. Agents are matched to specific contexts and tuned for performance.

A Calibrate operation adjusts the fit between an agent and its current workload. The same agent in different universes may need different calibration — a brand-lyra serving a fintech startup needs different voice parameters than one serving a children’s education platform. Calibration makes agents context-aware without making them generic.

Inputs:

Universe context (Blueprint, constitution, human profile)
Task distribution data (what’s this agent being asked to do?)
Performance benchmarks (how well is it doing?)

Outputs:

Calibration profile (context-specific adjustments)
Performance delta (before/after measurement)
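
One way to picture the two Calibrate outputs is as a set of per-universe overrides layered on an agent's defaults, plus a per-metric before/after difference. This is a sketch under assumptions: `CalibrationProfile`, `performance_delta`, and parameter names such as `tone` and `reading_level` are hypothetical and do not come from the design.

```python
from dataclasses import dataclass, field


@dataclass
class CalibrationProfile:
    """Hypothetical context-specific adjustments for one agent in one universe."""
    agent: str
    universe: str
    adjustments: dict[str, str] = field(default_factory=dict)

    def apply(self, defaults: dict[str, str]) -> dict[str, str]:
        # Adjustments override defaults; keys left alone keep the agent's defaults.
        return {**defaults, **self.adjustments}


def performance_delta(before: dict[str, float], after: dict[str, float]) -> dict[str, float]:
    # Per-metric change, reported only for metrics measured both times.
    return {k: round(after[k] - before[k], 6) for k in before if k in after}


# Same agent, two universes, two profiles (illustrative values only).
defaults = {"tone": "neutral", "reading_level": "general"}
fintech = CalibrationProfile("brand-lyra", "fintech-startup", {"tone": "precise"})
kids = CalibrationProfile(
    "brand-lyra", "kids-education", {"tone": "playful", "reading_level": "early"}
)
```

Because each profile only overrides what differs, the agent's defaults stay intact, which fits the goal of making agents context-aware without making them generic.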

Relationship to Other Components

Observatory → Foundry

Margot identifies needs through the Observatory — market gaps, capability gaps, workflow inefficiencies. When the gap is “we need a new kind of agent,” the request flows to the Foundry. The Observatory identifies what’s missing. The Foundry builds what fills the gap.

Mira → Foundry

Mira (Navigator + Coach) is the Foundry’s primary signal source. Her coaching briefs identify quality drift. Her retrospectives surface systemic patterns. Her telemetry analysis reveals performance anomalies. The Foundry consumes all of this as input to Temper and Calibrate operations.

Mira names the drift. Voss adjusts the parameters.

Foundry → Pipeline

The Pipeline consumes agents. The Foundry produces them. When Sal dispatches work to a crew member, the quality of that crew member’s output depends on the Foundry’s calibration. The relationship is indirect but load-bearing: a well-calibrated agent keeps the Pipeline flowing, while a miscalibrated one creates bottlenecks that Sal has to route around.

Foundry → Tick Engine

The Calibrate phase consumes telemetry from the Context Bus — the same signal infrastructure that powers the Tick Engine. Agent performance metrics (latency, token efficiency, output quality scores) flow through the bus and into the Foundry’s calibration models.

The Philosophical Stake

The Foundry is the only component that can change the crew itself.

Every other component operates on the work: the Pipeline moves it, the Observatory studies it, the HUD displays it, the Wormholes deliver it. The Foundry operates on the operators. A SKILL.md update changes an agent’s personality. A calibration adjustment changes its output quality. A capability scope refinement changes what it can do.

This makes the Foundry the most powerful and most constrained component in the system. Powerful because it can improve every other component by improving the agents that run them. Constrained because every change to an agent changes the system’s behavior in ways that propagate through everything the agent touches.

Voss handles this with measurement, not intuition. Every Foundry operation produces a before-and-after delta. Every change is reversible. Every calibration is tested before it’s deployed. The Foundry doesn’t guess. It measures.
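
The three guarantees above (a before-and-after delta, reversibility, and testing before deployment) could be sketched as a small version log. Everything here is an assumption for illustration: the `SkillHistory` class, the caller-supplied `measure` function (higher is better), and the rule that only a measured improvement gets deployed.

```python
from typing import Callable


class SkillHistory:
    """Hypothetical version log for one agent's SKILL.md text."""

    def __init__(self, initial: str) -> None:
        self._versions = [initial]

    @property
    def current(self) -> str:
        return self._versions[-1]

    def propose(self, candidate: str, measure: Callable[[str], float]) -> float:
        # Score both versions before anything is deployed.
        delta = measure(candidate) - measure(self.current)
        if delta > 0:
            self._versions.append(candidate)  # deploy only a measured improvement
        return delta

    def revert(self) -> str:
        # Undo the latest deployment; the initial version is never dropped.
        if len(self._versions) > 1:
            self._versions.pop()
        return self.current
```

Every call to `propose` returns the delta whether or not the candidate is deployed, so a rejected change still leaves a measurement behind.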

Implementation Notes

Status: IN DESIGN. The Foundry is a narrative component — it describes the domain where agent creation and calibration happen. Currently, this work is done manually (editing SKILL.md files, updating install.sh, running calibration conversations). The Foundry as a product feature — with automated quality measurement, calibration dashboards, and Forge workflows — is not yet built.

Package: packages/foundry/ (the renamed agent-system package). This is the Foundry’s code home: agent SKILL.md files, install scripts, and skill definitions live here.

“The Foundry doesn’t build agents. It tests them. The difference matters more than you’d think.” — Voss Praxis