Trust & Taint

The Problem

An autonomous agent that fetches web content, processes webhooks, or calls external APIs will inevitably encounter attacker-controlled data. The question isn’t whether — it’s how that data flows through the system without contaminating trusted operations.

Taint Levels

Every arc carries a taint_level:

| Level | Meaning | Restrictions |
| --- | --- | --- |
| clean | Default. No exposure to untrusted data. | Cannot access untrusted data tools. |
| tainted | Has been exposed to untrusted data. | Restricted from modifying trusted state. |
| review | Bridge zone for evaluating tainted output. | Read tools + submit verdict only. |

Taint does not propagate upward: a clean parent can orchestrate tainted children without becoming tainted itself. Clean arcs that attempt to use untrusted data tools receive HTTP 403.
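The restrictions in the table can be sketched as a single access check. This is a minimal illustration, not the platform's implementation: the `TaintLevel` enum, the `check_tool_access` function, and every tool name except `act/web` are hypothetical names chosen for the example.

```python
from enum import Enum


class TaintLevel(Enum):
    CLEAN = "clean"
    TAINTED = "tainted"
    REVIEW = "review"


# Hypothetical tool groupings; only "act/web" is named in this document.
UNTRUSTED_DATA_TOOLS = {"act/web"}
TRUSTED_STATE_TOOLS = {"write_state"}
REVIEW_TOOLS = {"read_output", "submit_verdict"}


def check_tool_access(taint_level: TaintLevel, tool: str) -> int:
    """Return an HTTP-style status: 200 if the call is allowed, 403 if denied."""
    if taint_level is TaintLevel.CLEAN:
        # Clean arcs may not touch untrusted data tools.
        return 403 if tool in UNTRUSTED_DATA_TOOLS else 200
    if taint_level is TaintLevel.TAINTED:
        # Tainted arcs may not modify trusted state.
        return 403 if tool in TRUSTED_STATE_TOOLS else 200
    # Review arcs: read tools and verdict submission only.
    return 200 if tool in REVIEW_TOOLS else 403
```

Under these assumptions, `check_tool_access(TaintLevel.CLEAN, "act/web")` returns 403, while the same call from a tainted arc returns 200.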

Taint Isolation at Submit

When submit_code executes code that imports untrusted tools (e.g., act/web), the raw output — which may contain attacker-controlled content — is never returned to the chat agent’s context. Instead, the agent receives structured metadata: status, output_key, output_bytes, and exit_code. The actual output is stored in arc state for retrieval by review arcs only.

This is enforced fail-closed: if the taint check itself fails, the output is withheld. The invariant is absolute: no model sees tainted data unless it is running in a designated review arc.
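A fail-closed version of this behaviour might look like the sketch below. The function name `isolate_submit_output`, the `uses_untrusted_tools` callback, the dict-based `arc_state`, and the `"completed"` status value are assumptions made for illustration; only the four metadata field names come from the description above.

```python
from typing import Callable, Dict


def isolate_submit_output(
    raw_output: bytes,
    exit_code: int,
    output_key: str,
    arc_state: Dict[str, bytes],
    uses_untrusted_tools: Callable[[], bool],
) -> dict:
    """Return only what the chat agent is allowed to see."""
    try:
        tainted = uses_untrusted_tools()
    except Exception:
        # Fail closed: if the taint check itself breaks, treat the output as tainted.
        tainted = True

    if not tainted:
        # Trusted output can be returned directly.
        return {
            "status": "completed",
            "exit_code": exit_code,
            "output": raw_output.decode("utf-8", errors="replace"),
        }

    # Tainted output stays in arc state; the agent gets metadata only.
    arc_state[output_key] = raw_output
    return {
        "status": "completed",
        "output_key": output_key,
        "output_bytes": len(raw_output),
        "exit_code": exit_code,
    }
```

The key design point is that the `except` branch and the tainted branch share one code path, so an error in the check can never result in raw output reaching the caller.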

The Two-LLM Firewall

When tainted output needs to be trusted, the system creates a review arc (taint_level=review, agent_type=REVIEWER) as a sibling of the tainted arc. Individual reviewer verdicts are advisory. A separate JUDGE arc renders the authoritative verdict.

On judge approval, the target arc’s taint_level is promoted to clean. The judge’s authority is scoped to the target arc only — parent arcs are not automatically promoted.

Review arcs are enforced at creation time: every tainted arc must have at least one reviewer and a judge.
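The promotion rule can be illustrated with a small sketch. The `Arc` dataclass and `apply_judge_verdict` function are hypothetical; the example only assumes what is stated above: reviewer verdicts are advisory, the judge's decision is binding, and promotion is scoped to the target arc.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Arc:
    arc_id: int
    taint_level: str = "clean"
    parent: Optional["Arc"] = None


def apply_judge_verdict(
    target: Arc, reviewer_verdicts: List[bool], judge_approves: bool
) -> None:
    """Promote the target arc to clean only if the judge approves."""
    if not reviewer_verdicts:
        # Every tainted arc must have at least one reviewer.
        raise ValueError("a tainted arc needs at least one reviewer verdict")
    # Reviewer verdicts inform the judge but do not decide the outcome here.
    if judge_approves:
        # Scoped promotion: the parent arc is deliberately left unchanged.
        target.taint_level = "clean"


parent = Arc(arc_id=1, taint_level="tainted")
child = Arc(arc_id=2, taint_level="tainted", parent=parent)
apply_judge_verdict(child, reviewer_verdicts=[True, False], judge_approves=True)
```

After this call the child is clean, while the parent keeps its own taint level until it is reviewed separately.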

Separation of Powers

After a coding-change arc completes, the platform auto-creates verification sibling arcs:

  • A correctness check — does it work?
  • A quality check — is it well-structured? (for platform/tool code)
  • A judge — synthesizes results into an authoritative verdict
  • A documentation arc — updates docs if needed

Each verification arc carries arc_role="verifier" and a verification_target_id pointing to the implementation arc. Self-verification is blocked at creation time: the agent that wrote the code cannot be the agent that judges it.
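A creation-time guard against self-verification could be sketched as follows. The `VerificationArc` type, the `create_verification_arc` function, and the string agent identifiers are hypothetical; only `arc_role="verifier"` and `verification_target_id` are taken from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerificationArc:
    agent_id: str
    verification_target_id: int
    arc_role: str = "verifier"


def create_verification_arc(
    target_arc_id: int, target_agent_id: str, verifier_agent_id: str
) -> VerificationArc:
    """Create a verifier arc, refusing any agent that wrote the target code."""
    if verifier_agent_id == target_agent_id:
        # Blocked at creation time, before the verifier can do any work.
        raise PermissionError("self-verification is not allowed")
    return VerificationArc(
        agent_id=verifier_agent_id,
        verification_target_id=target_arc_id,
    )
```

Rejecting the arc at creation, rather than discarding its verdict later, means a self-verifying arc never exists in the tree at all.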

This is “measure N times, cut once” applied to the full development cycle — not just code review, but correctness and quality verification by independent agents.

Encryption at Rest

Untrusted output can be Fernet-encrypted at rest. Keys are generated per reviewer-target pair and stored in review_keys. Only designated reviewers (or anyone after trust promotion) can decrypt.
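As a rough sketch of per-pair keys, the example below uses the `Fernet` class from the third-party `cryptography` package. The in-memory `review_keys` dict stands in for the real `review_keys` storage, and the function names and identifier types are assumptions; the post-promotion decryption path is not modelled.

```python
from typing import Dict, Tuple

from cryptography.fernet import Fernet

# Stand-in for the review_keys store: (reviewer_id, target_arc_id) -> key.
review_keys: Dict[Tuple[str, int], bytes] = {}


def encrypt_for_reviewer(reviewer_id: str, target_arc_id: int, output: bytes) -> bytes:
    """Encrypt untrusted output under a fresh key for one reviewer-target pair."""
    key = Fernet.generate_key()
    review_keys[(reviewer_id, target_arc_id)] = key
    return Fernet(key).encrypt(output)


def decrypt_as_reviewer(reviewer_id: str, target_arc_id: int, token: bytes) -> bytes:
    """Decrypt only if a key exists for this exact reviewer-target pair."""
    key = review_keys.get((reviewer_id, target_arc_id))
    if key is None:
        raise PermissionError("no review key for this reviewer and target")
    return Fernet(key).decrypt(token)
```

Because each pair has its own key, granting one reviewer access to one arc's output reveals nothing about any other arc.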

Trust Audit Log

A dedicated append-only table records all boundary decisions: taint assignments, access denials, review verdicts, trust promotions, and decryption grants. It is kept separate from arc history so that the trust record is always complete and tamper-evident.
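One way to approximate an append-only table is with database triggers that reject updates and deletes. The sketch below uses SQLite from the Python standard library; the table name `trust_audit_log`, its columns, and the helper functions are all hypothetical, and the platform's actual storage engine is not specified here. Triggers alone give append-only behaviour at the SQL level; stronger tamper evidence would need an additional mechanism that this example does not show.

```python
import sqlite3


def open_audit_log() -> sqlite3.Connection:
    """Create an in-memory audit table that rejects UPDATE and DELETE."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE trust_audit_log (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            arc_id INTEGER NOT NULL,
            event TEXT NOT NULL,
            detail TEXT NOT NULL,
            recorded_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        );
        CREATE TRIGGER trust_audit_no_update
        BEFORE UPDATE ON trust_audit_log
        BEGIN
            SELECT RAISE(ABORT, 'trust_audit_log is append-only');
        END;
        CREATE TRIGGER trust_audit_no_delete
        BEFORE DELETE ON trust_audit_log
        BEGIN
            SELECT RAISE(ABORT, 'trust_audit_log is append-only');
        END;
        """
    )
    return conn


def record_decision(conn: sqlite3.Connection, arc_id: int, event: str, detail: str) -> None:
    """Append one boundary decision; the context manager commits on success."""
    with conn:
        conn.execute(
            "INSERT INTO trust_audit_log (arc_id, event, detail) VALUES (?, ?, ?)",
            (arc_id, event, detail),
        )
```

With this schema, inserts succeed as usual, while any attempt to rewrite or remove a row raises a database error and leaves the log unchanged.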