Anthropic published a Zero Trust security framework for AI agents covering 12 critical domains. VaultysClaw fully covers 7, partially covers 4, and has the remaining 1 on the public roadmap.
About the framework
Published in May 2026, this framework defines the security requirements organisations should demand of any AI agent platform. It draws directly from Zero Trust principles established by NIST and extends them to the specific threat model of autonomous AI agents: prompt injection, capability abuse, lateral movement, identity spoofing, and data exfiltration.
The 12 domains range from foundational identity controls (decentralized identifiers, or DIDs, and mutual authentication) through runtime protections (tool gating, input validation, memory isolation) to governance and recovery capabilities. No other open-source agent platform covers this surface area today.
Overall coverage
Domain-by-domain
Every agent has a non-transferable cryptographic DID.
VaultysId assigns each agent a self-sovereign DID backed by an ECDSA key pair that never leaves the agent. Authentication is mutual challenge-response — no shared secrets, no API keys, no session tokens that can be stolen or replayed.
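A mutual challenge-response exchange of this kind can be sketched with Node's built-in crypto module. This is an illustrative sketch only, not the VaultysId API: the `Identity` type, the function names, and the choice of the P-256 curve are all assumptions made for the example.

```typescript
import { generateKeyPairSync, randomBytes, sign, verify, KeyObject } from "node:crypto";

// Hypothetical stand-in for an agent identity: an ECDSA key pair.
// The P-256 curve is an assumption; the source only says "ECDSA".
interface Identity {
  publicKey: KeyObject;
  privateKey: KeyObject;
}

function createIdentity(): Identity {
  return generateKeyPairSync("ec", { namedCurve: "P-256" });
}

// The prover signs the verifier's fresh nonce. The private key is only
// used locally and is never sent to the peer.
function answerChallenge(self: Identity, nonce: Buffer): Buffer {
  return sign("sha256", nonce, self.privateKey);
}

// The verifier checks the signature against the peer's known public key.
function checkAnswer(peerPublicKey: KeyObject, nonce: Buffer, signature: Buffer): boolean {
  return verify("sha256", nonce, peerPublicKey, signature);
}

// Mutual authentication: each side issues its own random nonce, so a
// signature recorded from an earlier session cannot be replayed.
function mutualAuthenticate(a: Identity, b: Identity): boolean {
  const nonceFromA = randomBytes(32);
  const nonceFromB = randomBytes(32);
  const bProves = checkAnswer(b.publicKey, nonceFromA, answerChallenge(b, nonceFromA));
  const aProves = checkAnswer(a.publicKey, nonceFromB, answerChallenge(a, nonceFromB));
  return aProves && bProves;
}
```

Because each nonce is fresh and only signatures cross the wire, there is no long-lived shared secret or bearer token for an attacker to capture.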
Capability-based least-privilege, revocable in real time.
Each agent holds an explicit signed capability grant: internet_access, file_access, api_call, code_execution, mail_send, and more. Capabilities are enforced server-side before any intent is dispatched. Revocation takes effect immediately — no restart required.
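The enforcement step can be pictured as a lookup performed before dispatch. The sketch below is a simplified assumption of how such a check might look: `CapabilityRegistry`, `Intent`, and `dispatch` are hypothetical names, and signature verification of the grant itself is omitted for brevity.

```typescript
// Capability names taken from the list above; the real set is larger.
type Capability = "internet_access" | "file_access" | "api_call" | "code_execution" | "mail_send";

// Hypothetical in-memory registry of grants, keyed by agent DID.
class CapabilityRegistry {
  private grants = new Map<string, Set<Capability>>();

  grant(agentDid: string, capability: Capability): void {
    const set = this.grants.get(agentDid) ?? new Set<Capability>();
    set.add(capability);
    this.grants.set(agentDid, set);
  }

  // Revocation mutates the live registry, so the next check sees it
  // without any restart.
  revoke(agentDid: string, capability: Capability): void {
    this.grants.get(agentDid)?.delete(capability);
  }

  isAllowed(agentDid: string, capability: Capability): boolean {
    return this.grants.get(agentDid)?.has(capability) ?? false;
  }
}

interface Intent {
  agentDid: string;
  requires: Capability;
  action: string;
}

// The check runs server-side before the intent is handed to any executor.
function dispatch(registry: CapabilityRegistry, intent: Intent): "dispatched" | "denied" {
  return registry.isAllowed(intent.agentDid, intent.requires) ? "dispatched" : "denied";
}
```

Denying by default (an agent with no entry is allowed nothing) is what makes the model least-privilege.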
Realm isolation contains lateral movement by design.
Agents are scoped to realms. A compromised agent cannot reach agents, data, or workflows in other realms. Policy signatures prevent trust escalation across boundaries, and per-agent budget caps limit resource consumption.
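As a rough illustration of realm scoping, the sketch below rejects any call whose source and target agents sit in different realms. `RealmDirectory` and its methods are hypothetical names, and policy signatures and budget caps are left out.

```typescript
interface AgentRecord {
  did: string;
  realm: string;
}

// Hypothetical directory mapping each agent DID to exactly one realm.
class RealmDirectory {
  private agents = new Map<string, AgentRecord>();

  register(agent: AgentRecord): void {
    this.agents.set(agent.did, agent);
  }

  // Cross-realm calls are refused, as are calls involving unknown agents.
  canReach(fromDid: string, toDid: string): boolean {
    const from = this.agents.get(fromDid);
    const to = this.agents.get(toDid);
    return from !== undefined && to !== undefined && from.realm === to.realm;
  }
}
```

With this shape, compromising one agent yields reachability only within that agent's own realm.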
Immutable, cryptographically attributed audit trail on every action.
All intents, results, policy changes, and delegation events are signed by their emitter and appended to an append-only log. Every entry is attributable to a specific DID — no ambiguity about who did what, even under delegation chains.
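One way to picture such a log is sketched below. Each entry is signed by its emitter, and, as an added assumption not stated above, each entry also records the hash of its predecessor so that reordering or removal becomes detectable. `AuditLog`, `LogEntry`, and the helper names are hypothetical.

```typescript
import { createHash, generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

interface LogEntry {
  emitterDid: string;
  event: string;
  prevHash: string;  // assumption: hash chaining, not described in the source
  signature: string; // base64 ECDSA signature by the emitter
}

// The bytes the emitter signs: who, what, and where it sits in the chain.
function signedPayload(emitterDid: string, event: string, prevHash: string): Buffer {
  return Buffer.from(JSON.stringify([emitterDid, event, prevHash]));
}

function entryHash(e: LogEntry): string {
  const body = JSON.stringify([e.emitterDid, e.event, e.prevHash, e.signature]);
  return createHash("sha256").update(body).digest("hex");
}

class AuditLog {
  private entries: LogEntry[] = [];

  // Append is the only write operation; entries are frozen once stored.
  append(emitterDid: string, event: string, privateKey: KeyObject): LogEntry {
    const last = this.entries[this.entries.length - 1];
    const prevHash = last ? entryHash(last) : "genesis";
    const signature = sign("sha256", signedPayload(emitterDid, event, prevHash), privateKey).toString("base64");
    const entry = Object.freeze({ emitterDid, event, prevHash, signature });
    this.entries.push(entry);
    return entry;
  }

  // Re-check every signature and chain link, resolving public keys by DID.
  verifyAll(resolveKey: (did: string) => KeyObject | undefined): boolean {
    let expectedPrev = "genesis";
    for (const e of this.entries) {
      const key = resolveKey(e.emitterDid);
      if (!key || e.prevHash !== expectedPrev) return false;
      const payload = signedPayload(e.emitterDid, e.event, e.prevHash);
      if (!verify("sha256", payload, key, Buffer.from(e.signature, "base64"))) return false;
      expectedPrev = entryHash(e);
    }
    return true;
  }
}
```

Since verification needs only the public key behind each DID, any party can audit the trail without access to the agents themselves.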
Tools are declared, schema-validated, and policy-gated per agent.
Built-in tools (file ops, shell, HTTP, code runner, remote-agent calls) are registered with Zod schemas. No implicit tool access — each tool requires an explicit capability grant. Execution is logged and bounded by the agent's policy.
Zod-enforced type-safe contracts at every system boundary.
Intent payloads are validated against strict Zod schemas at the control-plane boundary before dispatch. Type-safe ts-rest contracts on all API routes prevent malformed or injected inputs from reaching agent logic.
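A boundary check along these lines might look like the sketch below, which assumes the `zod` package is installed. The `IntentSchema` shape is hypothetical; the actual VaultysClaw schemas are not shown here.

```typescript
import { z } from "zod";

// Hypothetical intent shape, for illustration only.
const IntentSchema = z
  .object({
    agentDid: z.string().min(1),
    capability: z.enum(["internet_access", "file_access", "api_call", "code_execution", "mail_send"]),
    payload: z.record(z.string(), z.unknown()),
  })
  .strict(); // reject unexpected keys instead of silently dropping them

type ValidIntent = z.infer<typeof IntentSchema>;

type ValidationResult =
  | { ok: true; intent: ValidIntent }
  | { ok: false; issues: string[] };

// Validate untrusted input before it is dispatched anywhere.
function validateIntent(input: unknown): ValidationResult {
  const result = IntentSchema.safeParse(input);
  if (result.success) return { ok: true, intent: result.data };
  return {
    ok: false,
    issues: result.error.issues.map((issue) => `${issue.path.join(".")}: ${issue.message}`),
  };
}
```

Using `safeParse` rather than `parse` keeps rejection on the normal control path, so a malformed intent produces a structured refusal instead of an exception.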
Per-agent isolated memory store — no cross-agent access.
Each agent's semantic memory (SQLite + vector index) is fully isolated. Retrieval is scoped to the agent's own store; no agent can query another's memory. Memory summarisation runs inside the agent boundary.
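The isolation property can be illustrated with a toy in-memory model in which a store is only ever handed out for the caller's own DID. `MemoryManager` and `AgentMemory` are hypothetical names, and a plain substring match stands in for the SQLite and vector index described above.

```typescript
// Toy per-agent store; substring search stands in for vector retrieval.
class AgentMemory {
  private items: string[] = [];

  remember(text: string): void {
    this.items.push(text);
  }

  search(term: string): string[] {
    return this.items.filter((item) => item.includes(term));
  }
}

// Hypothetical manager: there is no API that takes a target agent,
// so one agent cannot name another agent's store.
class MemoryManager {
  private stores = new Map<string, AgentMemory>();

  forAgent(callerDid: string): AgentMemory {
    let store = this.stores.get(callerDid);
    if (!store) {
      store = new AgentMemory();
      this.stores.set(callerDid, store);
    }
    return store;
  }
}
```

The point of the design is that isolation comes from the shape of the interface, not from a permission check that could be misconfigured.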
Signed policy distribution and budget enforcement implemented; LLM output governance in progress.
Policy documents are cryptographically signed and distributed to agents, which verify signatures before storing. Budget caps, capability grants, and workflow-level human approval gates are enforced. Full prompt-injection detection and LLM output content governance are in active development.
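The verify-before-store step and a budget check can be sketched as follows. The `Policy` fields, the `PolicyStore` class, and the use of `JSON.stringify` as the signed serialisation are assumptions for illustration; a production system would need a canonical encoding.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

// Hypothetical policy document shape.
interface Policy {
  agentDid: string;
  capabilities: string[];
  tokenBudget: number;
}

interface SignedPolicy {
  policy: Policy;
  signature: string; // base64
}

// Assumption: JSON.stringify serves as the signed byte encoding.
function signPolicy(policy: Policy, authorityKey: KeyObject): SignedPolicy {
  const signature = sign("sha256", Buffer.from(JSON.stringify(policy)), authorityKey).toString("base64");
  return { policy, signature };
}

class PolicyStore {
  private authorityPublicKey: KeyObject;
  private policies = new Map<string, Policy>();

  constructor(authorityPublicKey: KeyObject) {
    this.authorityPublicKey = authorityPublicKey;
  }

  // A policy is stored only if its signature verifies.
  accept(signed: SignedPolicy): boolean {
    const ok = verify(
      "sha256",
      Buffer.from(JSON.stringify(signed.policy)),
      this.authorityPublicKey,
      Buffer.from(signed.signature, "base64"),
    );
    if (ok) this.policies.set(signed.policy.agentDid, signed.policy);
    return ok;
  }

  // Agents with no accepted policy have no budget at all.
  withinBudget(agentDid: string, tokensUsed: number): boolean {
    const policy = this.policies.get(agentDid);
    return policy !== undefined && tokensUsed <= policy.tokenBudget;
  }
}
```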
Private keys never leave the agent; LLM key injection via env — vault integration planned.
VaultysId private keys are generated and stored locally on each agent — they are never transmitted. LLM API keys are currently injected via environment variables at startup. Secrets vault integration (e.g. HashiCorp Vault, AWS Secrets Manager) and automated key rotation are on the near-term roadmap.
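Environment-based injection is commonly handled with a fail-fast loader at startup, along the lines of the sketch below. The variable name `LLM_API_KEY` and both helper functions are hypothetical and are not taken from the VaultysClaw configuration.

```typescript
// Read a required secret from the environment and fail fast if it is absent.
// "LLM_API_KEY" is a placeholder name chosen for this example.
function loadLlmApiKey(env: Record<string, string | undefined>, name = "LLM_API_KEY"): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing required environment variable ${name}`);
  }
  return value;
}

// Mask a secret for log output, keeping only the last four characters.
function redact(secret: string): string {
  if (secret.length <= 4) return "****";
  return "*".repeat(secret.length - 4) + secret.slice(-4);
}

// Typical use at process startup:
//   const apiKey = loadLlmApiKey(process.env);
//   console.log(`LLM key loaded: ${redact(apiKey)}`);
```

Passing the environment in as a parameter keeps the loader testable and makes a later switch to a secrets vault a change to one call site.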
Signed state and WAL recovery in place; distributed consistency is partial.
Agent certificates, policies, and delegation chains are signed and independently verifiable offline. Control-plane state is backed by SQLite WAL mode with crash recovery. Full distributed state consistency, multi-node failover, and automated recovery orchestration are partially implemented.
Token usage and intent logging implemented; anomaly detection in development.
Per-agent token consumption, task history, and intent logs are tracked and surfaced in the control-plane dashboard. Statistical anomaly detection and behavioural baseline alerting are in active development and will ship as part of the observability roadmap.
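The tracking side can be pictured as a small per-agent ledger, sketched below with the hypothetical `UsageTracker` class. Anomaly detection is deliberately not shown, since it is described above as still in development.

```typescript
interface IntentRecord {
  intent: string;
  tokens: number;
}

// Hypothetical ledger of token usage and intent history per agent DID.
class UsageTracker {
  private records = new Map<string, IntentRecord[]>();

  record(agentDid: string, intent: string, tokens: number): void {
    const list = this.records.get(agentDid) ?? [];
    list.push({ intent, tokens });
    this.records.set(agentDid, list);
  }

  totalTokens(agentDid: string): number {
    return (this.records.get(agentDid) ?? []).reduce((sum, r) => sum + r.tokens, 0);
  }

  // Returns a copy so callers cannot rewrite the recorded history.
  history(agentDid: string): IntentRecord[] {
    return (this.records.get(agentDid) ?? []).map((r) => ({ ...r }));
  }
}
```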
Planned: LLM output scanning, PII detection, exfiltration prevention.
Currently, data exposure is limited through capability-gating (agents only access data they have explicit grants for). Dedicated output filtering — automated PII detection, sensitive data redaction, and prompt-injection response scanning — is on the public roadmap.
Open Source · MIT License · Self-hosted
No security team to hire. No SPIRE cluster to maintain. Deploy in five minutes and address 11 of the 12 Anthropic framework domains on day one: 7 fully and 4 partially.