16 changes: 8 additions & 8 deletions README.md
@@ -240,18 +240,18 @@ on a never-touched holdout slice.
`runDelegatedLoop` is one entrypoint a worker agent or a scheduled routine calls to run a disciplined loop in a chosen mode, over the hardened engines below. It fails loud on an unwired mode; a thrown engine is captured as `{ ok: false }`, so unattended runs record rather than crash.

```ts
-import { runDelegatedLoop, coderLoopRunner, researchLoopRunner, type DelegatedLoopRegistry } from '@tangle-network/agent-runtime'
+import { runDelegatedLoop, worktreeLoopRunner, researchLoopRunner, type DelegatedLoopRegistry } from '@tangle-network/agent-runtime'

const registry: DelegatedLoopRegistry = {
-code: coderLoopRunner({ sandboxClient, args: { goal: 'fix the flaky retry test', repoRoot: '/repo' }, reviewer, winnerSelection: 'smallest-diff' }),
+code: worktreeLoopRunner({ repoRoot: '/repo', taskPrompt: 'fix the flaky retry test', harnesses, budget }),
research: researchLoopRunner({ research, gate: { selfArtifactKinds: ['spec'] }, maxRounds: 3 }),
}
const result = await runDelegatedLoop('code', registry)
```
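
The contract described for `runDelegatedLoop` (throw on an unwired mode, capture a thrown engine as `{ ok: false }`) can be illustrated with a minimal sketch. This is not the library's implementation: `runLoopSketch`, `LoopRegistrySketch`, and every result field other than `ok` are hypothetical names chosen for illustration.

```ts
// Hypothetical types; only the mode names and the `{ ok: false }` shape come from the README text.
type LoopMode = 'code' | 'review' | 'research' | 'audit' | 'self-improve' | 'dynamic'
type LoopResultSketch = { ok: true; value: unknown } | { ok: false; error: string }
type LoopRegistrySketch = Partial<Record<LoopMode, () => Promise<unknown>>>

async function runLoopSketch(mode: LoopMode, registry: LoopRegistrySketch): Promise<LoopResultSketch> {
  const runner = registry[mode]
  // Fail loud: an unwired mode is a configuration error, not a recorded failure.
  if (!runner) throw new Error(`mode "${mode}" is not wired in the registry`)
  try {
    return { ok: true, value: await runner() }
  } catch (err) {
    // A thrown engine is recorded instead of crashing the unattended run.
    return { ok: false, error: err instanceof Error ? err.message : String(err) }
  }
}
```

Under these assumptions, a caller can distinguish "the engine failed" (a resolved `{ ok: false }`) from "the registry is misconfigured" (a rejected promise).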

Modes: `code`, `review`, `research`, `audit`, `self-improve`, `dynamic`. The `agent-runtime-loop` bin runs the registry from a cron or routine and exits 0 (ok), 1 (recorded failure), or 2 (usage or config error).
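
One way a cron wrapper might map run outcomes onto the three exit codes listed above is sketched here. The `BinOutcome` type and `exitCodeFor` function are assumptions for illustration; only the meanings of codes 0, 1, and 2 come from the text.

```ts
// Hypothetical outcome type; the bin's real internal representation is not shown in the README.
type BinOutcome =
  | { kind: 'ok' }
  | { kind: 'recorded-failure'; error: string }
  | { kind: 'usage-error'; message: string }

function exitCodeFor(outcome: BinOutcome): 0 | 1 | 2 {
  switch (outcome.kind) {
    case 'ok':
      return 0 // the loop ran and succeeded
    case 'recorded-failure':
      return 1 // the loop ran, failed, and the failure was recorded
    case 'usage-error':
      return 2 // bad arguments or configuration; the loop did not run
  }
}
```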

-The coder delegate (`createDefaultCoderDelegate`, `/mcp`) has default-on safety gates: no-op rejection (an empty patch cannot pass trivially), an always-on secret-path floor (`.env`, keys, wallets), an optional `reviewer` gate, and a `winnerSelection` policy (`highest-score`, `smallest-diff`, `highest-readiness`, `first-approved`).
+`worktreeLoopRunner` (`code` mode, the generic recursive path) authors one `AgentProfile` per harness and runs them as a `worktreeFanout` (each leaf `gateOnDeliverable`), winner by the shared valid-only selector. The sandbox-session counterpart is `detachedSessionDelegate` (`/mcp`): it drives the in-box harness over a `SandboxClient` to a mechanically-validated patch, with default-on safety gates: no-op rejection, an always-on secret-path floor (`.env`, keys, wallets), an optional `reviewer` gate, and a `winnerSelection` policy. Its worker profile is a parameter the caller authors (`workerProfile`); omit it for a minimal model-only default.
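
The two mechanical gates named above (no-op rejection and the secret-path floor) could look roughly like the following. This is a sketch under stated assumptions, not the delegate's code: `CandidatePatch`, `checkPatchSketch`, and the concrete path patterns are hypothetical; the README only names `.env`, keys, and wallets as examples of protected paths.

```ts
// Hypothetical patch shape for illustration.
interface CandidatePatch {
  diff: string
  touchedPaths: string[]
}

type GateVerdict = { pass: true } | { pass: false; reason: string }

// Illustrative patterns only; the real floor may cover more or different paths.
const SECRET_PATH_PATTERNS: RegExp[] = [
  /(^|\/)\.env(\..+)?$/, // .env, .env.local, config/.env
  /\.(pem|key)$/i, // private key files
  /(^|\/)(keys|wallets?)(\/|$)/i, // keys/ and wallet(s)/ directories
]

function checkPatchSketch(patch: CandidatePatch): GateVerdict {
  // No-op rejection: an empty patch must not pass trivially.
  if (patch.diff.trim() === '') return { pass: false, reason: 'no-op patch' }
  // Secret-path floor: reject any patch that touches a protected path.
  const hit = patch.touchedPaths.find((p) => SECRET_PATH_PATTERNS.some((re) => re.test(p)))
  if (hit !== undefined) return { pass: false, reason: `touches secret path: ${hit}` }
  return { pass: true }
}
```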

The knowledge-base gate (`createKbGate`, `/mcp`) is fail-closed: a fact's `verbatimPassage` must appear in its `sourceText`, the asserted value must be in the passage, and citations cannot point at self-generated artifacts. `researchLoopRunner` wraps it with a correct-on-veto loop that re-researches the vetoed gaps up to `maxRounds`, then returns the unverified ones rather than dropping them.
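
The three fail-closed checks described above can be sketched as a pure function. The field names `verbatimPassage` and `sourceText` come from the text and `selfArtifactKinds` from the README example; `FactSketch`, `value`, `citedArtifactKind`, and `checkFactSketch` are hypothetical and do not describe the real `createKbGate` API.

```ts
// Hypothetical fact shape for illustration.
interface FactSketch {
  value: string
  verbatimPassage: string
  sourceText: string
  citedArtifactKind?: string
}

function checkFactSketch(
  fact: FactSketch,
  selfArtifactKinds: readonly string[],
): { ok: boolean; reasons: string[] } {
  const reasons: string[] = []
  const passage = fact.verbatimPassage.trim()
  const value = fact.value.trim()
  // Fail closed: an empty passage or value never passes, even though ''.includes('') is true.
  if (passage === '' || !fact.sourceText.includes(passage)) {
    reasons.push('passage not found verbatim in source')
  }
  if (value === '' || !passage.includes(value)) {
    reasons.push('asserted value not in passage')
  }
  if (fact.citedArtifactKind !== undefined && selfArtifactKinds.includes(fact.citedArtifactKind)) {
    reasons.push('citation points at a self-generated artifact')
  }
  return { ok: reasons.length === 0, reasons }
}
```

A fact passes only when every check succeeds; any vetoed fact keeps its reasons, which is the kind of information a correct-on-veto loop would need to re-research the gap.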

@@ -271,15 +271,15 @@ The shape: `loop` to `loop.round` (move plus rationale) to `loop.iteration` (age

## MCP delegation server

-Expose the delegation tools (`delegate_code`, `delegate_research`, `delegate_feedback`, `delegation_status`, `delegation_history`) to a sandbox coding agent. Mount the canonical server instead of forking delegation logic.
+Expose the delegation tools to a sandbox coding agent: the generic `delegate` verb (one intent → a supervisor that authors + drives its own worker, returns the delivered output with its real spend) plus the queue-bound `delegate_feedback`, `delegation_status`, `delegation_history` (and `delegate_ui_audit` when a UI-audit runner is wired). Mount the canonical server instead of forking delegation logic.

```ts
-import { createMcpServer, createDefaultCoderDelegate } from '@tangle-network/agent-runtime/mcp'
+import { createMcpServer } from '@tangle-network/agent-runtime/mcp'

-const server = createMcpServer({ coderDelegate: createDefaultCoderDelegate({ sandboxClient }), researcherDelegate })
+const server = createMcpServer({ delegateSupervisor: { router, backend, deliverable } })
```

-Or mount the `agent-runtime-mcp` stdio bin on a production `AgentProfile.mcp`.
+Or mount the `agent-runtime-mcp` stdio bin on a production `AgentProfile.mcp` with `MCP_ENABLE_DELEGATE=1`.

Delegation state is in-memory by default — a server restart drops pending delegations and history. Set `AGENT_RUNTIME_DELEGATION_STATE_FILE=/path/state.json` on the bin (or construct via `DelegationTaskQueue.restore({ store: new FileDelegationStore({ filePath }) })`) to persist records across restarts: `delegation_status`/`delegation_history` keep answering for prior runs, idempotency keys dedupe resubmissions, and in-flight records either resume through the `resumeDelegate` seam (when submitted with a `detachedSessionRef`) or settle as failed with an explicit driver-restart error. A corrupt state file refuses to load (`DelegationStateCorruptError`); `AGENT_RUNTIME_DELEGATION_STATE_RECOVER=1` archives it and starts empty. `AGENT_RUNTIME_DELEGATION_RETAIN_TERMINAL=<n>` caps retained terminal records.
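
The load and retention behaviour described above can be approximated with two pure functions. Everything here is a sketch under assumptions: `loadStateSketch`, `capTerminalSketch`, `StateCorruptSketchError`, and the record fields are hypothetical, the state file is assumed to be a JSON array, and keeping the most recently finished terminal records is a guess at what the cap retains.

```ts
// Hypothetical record shape; the real persisted format is not shown in the README.
interface DelegationRecordSketch {
  id: string
  status: 'pending' | 'running' | 'succeeded' | 'failed'
  finishedAt?: number
}

class StateCorruptSketchError extends Error {
  constructor(message: string) {
    super(message)
    this.name = 'StateCorruptSketchError'
  }
}

// `raw` is the file content (undefined when no file exists); `recover` mirrors the recover flag.
function loadStateSketch(
  raw: string | undefined,
  recover: boolean,
): { records: DelegationRecordSketch[]; archived?: string } {
  if (raw === undefined) return { records: [] }
  try {
    const parsed: unknown = JSON.parse(raw)
    if (!Array.isArray(parsed)) throw new Error('expected an array of records')
    return { records: parsed as DelegationRecordSketch[] }
  } catch (err) {
    // Refuse to load corrupt state unless recovery is explicitly requested.
    if (!recover) throw new StateCorruptSketchError(`corrupt state: ${String(err)}`)
    // Recovery: hand the bad content back for archiving and start empty.
    return { records: [], archived: raw }
  }
}

// Keep every in-flight record, plus at most `retain` terminal ones (newest first).
function capTerminalSketch(records: DelegationRecordSketch[], retain: number): DelegationRecordSketch[] {
  const isTerminal = (r: DelegationRecordSketch) => r.status === 'succeeded' || r.status === 'failed'
  const newestTerminal = records
    .filter(isTerminal)
    .sort((a, b) => (b.finishedAt ?? 0) - (a.finishedAt ?? 0))
    .slice(0, retain)
  const kept = new Set(newestTerminal.map((r) => r.id))
  return records.filter((r) => !isTerminal(r) || kept.has(r.id))
}
```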

@@ -342,7 +342,7 @@ Six subpaths — the public surface:
| `@tangle-network/agent-runtime` | chat turns, delegated loop-runner, OTEL export, errors, model resolution |
| `.../agent` | `defineAgent` plus surface and outcome adapters |
| `.../loops` | **the optimization suite** (`Environment`, `defineStrategy`, `runBenchmark`, `runStrategyEvolution`, `authorStrategy`, `promotionGate`) + the recursive atom (`Supervisor`/`Scope`, `createExecutor`), the `runLoop` kernel, the `Driver` type, `loopDispatch` |
-| `.../profiles` | `coderProfile`, `researcherProfile`, the `uiAuditorProfile` presets + the UI-audit workspace I/O helpers |
+| `.../profiles` | `coderTaskToPrompt` (the coder task formatter), the `uiAuditorProfile` presets + the UI-audit workspace I/O helpers |
| `.../intelligence` | `withTangleIntelligence`, `createIntelligenceClient` — Observe + the provable-OFF billing boundary |
| `.../mcp` | `createMcpServer`, `createDefaultCoderDelegate`, `createKbGate`, the `agent-runtime-mcp` bin |

40 changes: 15 additions & 25 deletions docs/api/agent.md
@@ -1103,7 +1103,7 @@ readonly `string`[]

### CreateSandboxActOptions

-Defined in: [agent/sandbox-act.ts:29](https://github.com/tangle-network/agent-runtime/blob/main/src/agent/sandbox-act.ts#L29)
+Defined in: [agent/sandbox-act.ts:47](https://github.com/tangle-network/agent-runtime/blob/main/src/agent/sandbox-act.ts#L47)

#### Type Parameters

@@ -1121,23 +1121,23 @@ Defined in: [agent/sandbox-act.ts:29](https://github.com/tangle-network/agent-ru

> **baseProfile**: `AgentProfile`

-Defined in: [agent/sandbox-act.ts:31](https://github.com/tangle-network/agent-runtime/blob/main/src/agent/sandbox-act.ts#L31)
+Defined in: [agent/sandbox-act.ts:49](https://github.com/tangle-network/agent-runtime/blob/main/src/agent/sandbox-act.ts#L49)

-Canonical agent profile — the same one the prod chat turn composes from.
+Canonical agent profile — the same one the prod chat turn uses.

##### sandboxClient

> **sandboxClient**: [`SandboxClient`](runtime.md#sandboxclient-1)

-Defined in: [agent/sandbox-act.ts:33](https://github.com/tangle-network/agent-runtime/blob/main/src/agent/sandbox-act.ts#L33)
+Defined in: [agent/sandbox-act.ts:51](https://github.com/tangle-network/agent-runtime/blob/main/src/agent/sandbox-act.ts#L51)

Sandbox client used to boot the per-run sandbox.

##### buildPrompt

> **buildPrompt**: (`persona`) => `string`

-Defined in: [agent/sandbox-act.ts:35](https://github.com/tangle-network/agent-runtime/blob/main/src/agent/sandbox-act.ts#L35)
+Defined in: [agent/sandbox-act.ts:53](https://github.com/tangle-network/agent-runtime/blob/main/src/agent/sandbox-act.ts#L53)

Persona → prompt. Pure; the eval cell's input.

@@ -1155,20 +1155,18 @@ Persona → prompt. Pure; the eval cell's input.

> **output**: [`OutputAdapter`](runtime.md#outputadapter)\<`TRunOutput`\>

-Defined in: [agent/sandbox-act.ts:37](https://github.com/tangle-network/agent-runtime/blob/main/src/agent/sandbox-act.ts#L37)
+Defined in: [agent/sandbox-act.ts:55](https://github.com/tangle-network/agent-runtime/blob/main/src/agent/sandbox-act.ts#L55)

Sandbox event stream → typed output the rubric scores.

##### compose?

-> `optional` **compose?**: (`persona`) => [`ComposeProductionAgentProfileOptions`](mcp.md#composeproductionagentprofileoptions)
+> `optional` **compose?**: (`persona`) => `SandboxActComposeOverrides`

-Defined in: [agent/sandbox-act.ts:44](https://github.com/tangle-network/agent-runtime/blob/main/src/agent/sandbox-act.ts#L44)
+Defined in: [agent/sandbox-act.ts:60](https://github.com/tangle-network/agent-runtime/blob/main/src/agent/sandbox-act.ts#L60)

-Per-persona composition overrides (workspace-augmented system prompt,
-extra file mounts, sandbox key). Merged into
-[composeProductionAgentProfile](mcp.md#composeproductionagentprofile); `env` here is overridden by the
-top-level `env` option when both are set.
+Per-persona profile overrides (workspace-augmented system prompt, extra
+file mounts, tool flags, MCP connections). Overlaid onto `baseProfile`.

###### Parameters

@@ -1178,13 +1176,13 @@ top-level `env` option when both are set.

###### Returns

-[`ComposeProductionAgentProfileOptions`](mcp.md#composeproductionagentprofileoptions)
+`SandboxActComposeOverrides`

##### sandboxOverrides?

> `optional` **sandboxOverrides?**: `Partial`\<`Omit`\<`CreateSandboxOptions`, `"backend"`\>\> & `object`

-Defined in: [agent/sandbox-act.ts:46](https://github.com/tangle-network/agent-runtime/blob/main/src/agent/sandbox-act.ts#L46)
+Defined in: [agent/sandbox-act.ts:62](https://github.com/tangle-network/agent-runtime/blob/main/src/agent/sandbox-act.ts#L62)

Sandbox-SDK overrides forwarded to `createSandboxForSpec`.

@@ -1198,15 +1196,15 @@ Sandbox-SDK overrides forwarded to `createSandboxForSpec`.

> `optional` **name?**: `string`

-Defined in: [agent/sandbox-act.ts:48](https://github.com/tangle-network/agent-runtime/blob/main/src/agent/sandbox-act.ts#L48)
+Defined in: [agent/sandbox-act.ts:64](https://github.com/tangle-network/agent-runtime/blob/main/src/agent/sandbox-act.ts#L64)

Stable run name surfaced in mapped `llm_call` events.

##### mapEvent?

> `optional` **mapEvent?**: (`event`, `opts`) => [`RuntimeStreamEvent`](index.md#runtimestreamevent) \| `undefined`

-Defined in: [agent/sandbox-act.ts:50](https://github.com/tangle-network/agent-runtime/blob/main/src/agent/sandbox-act.ts#L50)
+Defined in: [agent/sandbox-act.ts:66](https://github.com/tangle-network/agent-runtime/blob/main/src/agent/sandbox-act.ts#L66)

Override the `SandboxEvent → RuntimeStreamEvent` mapper.

@@ -1226,14 +1224,6 @@ Override the `SandboxEvent → RuntimeStreamEvent` mapper.

[`RuntimeStreamEvent`](index.md#runtimestreamevent) \| `undefined`

-##### env?
-
-> `optional` **env?**: `Record`\<`string`, `string` \| `undefined`\>
-
-Defined in: [agent/sandbox-act.ts:55](https://github.com/tangle-network/agent-runtime/blob/main/src/agent/sandbox-act.ts#L55)
-
-Environment source for delegation-MCP composition. Defaults to `process.env`.
-
***

### AgentSurfaces
@@ -1600,7 +1590,7 @@ optional on the type; missing means no measurement was wired).

> **createSandboxAct**\<`TPersona`, `TRunOutput`\>(`options`): (`persona`, `ctx`) => [`AgentRunInvocation`](#agentruninvocation)\<`TRunOutput`\>

-Defined in: [agent/sandbox-act.ts:64](https://github.com/tangle-network/agent-runtime/blob/main/src/agent/sandbox-act.ts#L64)
+Defined in: [agent/sandbox-act.ts:78](https://github.com/tangle-network/agent-runtime/blob/main/src/agent/sandbox-act.ts#L78)

Build an `AgentRuntime.act` implementation backed by a single prod-profile
sandbox run. The returned function honours the `act` contract: it returns