Anthropic Zero Trust AI Agents Framework · May 2026

12 domains.
11 covered today.

Anthropic published a Zero Trust security framework for AI agents covering 12 critical domains. VaultysClaw fully covers 7, partially covers 4, and has the remaining 1 on the public roadmap.

7 Active · 4 Partial · 1 Roadmap

Anthropic's Zero Trust AI Agents Framework

Published in May 2026, this framework defines the security requirements organisations should demand of any AI agent platform. It draws directly from Zero Trust principles established by NIST and extends them to the specific threat model of autonomous AI agents: prompt injection, capability abuse, lateral movement, identity spoofing, and data exfiltration.

The 12 domains range from foundational identity controls (DIDs, mutual authentication) through runtime protections (tool gating, input validation, memory isolation) to governance and recovery capabilities. No other open-source agent platform covers this surface area today.


Overall coverage

Active: 7 / 12
Partial: 4 / 12
Roadmap: 1 / 12
Evaluated against Anthropic's Zero Trust AI Agents Framework, May 2026.

How VaultysClaw maps to each domain

01 Agent Identity & Authentication
Active

Every agent has a non-transferable cryptographic DID.

VaultysId assigns each agent a self-sovereign DID backed by an ECDSA key pair whose private key never leaves the agent. Authentication is mutual challenge-response: no shared secrets, no API keys, and no session tokens that can be stolen or replayed.
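As an illustration only, mutual challenge-response over ECDSA can be sketched with Node's built-in crypto module. This is not VaultysId's actual API: the `Agent` shape, the function names, the `did:example:` identifiers, and the P-256 curve are all assumptions made for the example.

```typescript
import { generateKeyPairSync, randomBytes, sign, verify, KeyObject } from "node:crypto";

// Hypothetical agent: the private key lives only inside the closure below.
interface Agent {
  did: string;
  publicKey: KeyObject;
  respond(nonce: Buffer): Buffer;
}

function createAgent(did: string): Agent {
  // Curve choice (P-256) is an assumption; the page only says "ECDSA".
  const { publicKey, privateKey } = generateKeyPairSync("ec", { namedCurve: "prime256v1" });
  return {
    did,
    publicKey,
    respond: (nonce: Buffer) => sign("sha256", nonce, privateKey),
  };
}

// The verifier sends a fresh random nonce and checks the signature against
// the public key it already associates with the prover's DID.
function proves(prover: Agent, pinnedKey: KeyObject): boolean {
  const nonce = randomBytes(32);
  return verify("sha256", nonce, pinnedKey, prover.respond(nonce));
}

// Mutual authentication: each side challenges the other.
function mutualAuth(a: Agent, aPinnedByB: KeyObject, b: Agent, bPinnedByA: KeyObject): boolean {
  return proves(a, aPinnedByB) && proves(b, bPinnedByA);
}
```

Because every challenge uses a fresh nonce, a captured response is useless for a later session. A production protocol would also bind the nonce to the session context, which this sketch omits.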

02 Access Control & Privileges
Active

Capability-based least-privilege, revocable in real time.

Each agent holds an explicit signed capability grant: internet_access, file_access, api_call, code_execution, mail_send, and more. Capabilities are enforced server-side before any intent is dispatched. Revocation takes effect immediately — no restart required.
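A minimal sketch of a server-side capability check, assuming an in-memory registry. The capability names come from the list above; the `CapabilityRegistry` class, the `Intent` shape, and `dispatch` are hypothetical, and grant signing is left out.

```typescript
type Capability =
  | "internet_access"
  | "file_access"
  | "api_call"
  | "code_execution"
  | "mail_send";

// Hypothetical intent shape: which agent wants to do what.
interface Intent {
  agentDid: string;
  requires: Capability;
  action: string;
}

class CapabilityRegistry {
  private grants = new Map<string, Set<Capability>>();

  grant(did: string, cap: Capability): void {
    const caps = this.grants.get(did) ?? new Set<Capability>();
    caps.add(cap);
    this.grants.set(did, caps);
  }

  revoke(did: string, cap: Capability): void {
    this.grants.get(did)?.delete(cap);
  }

  allows(did: string, cap: Capability): boolean {
    return this.grants.get(did)?.has(cap) ?? false;
  }
}

// The registry is consulted on every dispatch, so a revocation applies to
// the very next intent without restarting anything.
function dispatch(registry: CapabilityRegistry, intent: Intent): "dispatched" | "denied" {
  return registry.allows(intent.agentDid, intent.requires) ? "dispatched" : "denied";
}
```

Agents with no grants are denied by default, which is the least-privilege behaviour described above.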

03 Resource Perimeter & Blast Radius
Active

Realm isolation contains lateral movement by design.

Agents are scoped to realms. A compromised agent cannot reach agents, data, or workflows in other realms. Policy signatures prevent trust escalation across boundaries, and per-agent budget caps limit resource consumption.
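The two containment rules can be sketched as plain checks. The `RealmAgent` shape and both function names are hypothetical, and the budget is in abstract units because the page does not say what the caps measure.

```typescript
// Hypothetical record for an agent scoped to one realm with a spend cap.
interface RealmAgent {
  did: string;
  realm: string;
  budgetCap: number; // abstract units (an assumption)
  spent: number;
}

// Cross-realm calls are refused outright, whatever the caller's capabilities.
function canReach(caller: RealmAgent, target: RealmAgent): boolean {
  return caller.realm === target.realm;
}

// Records the spend and returns true only if it fits under the cap.
function charge(agent: RealmAgent, cost: number): boolean {
  if (cost < 0 || agent.spent + cost > agent.budgetCap) return false;
  agent.spent += cost;
  return true;
}
```

A rejected charge leaves `spent` unchanged, so an agent that hits its cap cannot consume more by retrying.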

04 Observability & Audit
Active

Immutable, cryptographically attributed audit trail on every action.

All intents, results, policy changes, and delegation events are signed by their emitter and written to an append-only log. Every entry is attributable to a specific DID, so there is no ambiguity about who did what, even under delegation chains.
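One common way to make a signed log tamper-evident is to chain each entry to the hash of the previous one. The sketch below takes that approach; hash chaining is an assumption here, not something the page states about VaultysClaw, and `AuditLog`, `AuditEntry`, and `verifyLog` are hypothetical names.

```typescript
import { createHash, generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

interface AuditEntry {
  did: string;
  event: string;
  prevHash: string;
  signature: string; // base64 ECDSA signature by the emitter
  hash: string;
}

const GENESIS = "genesis";

function payloadOf(did: string, event: string, prevHash: string): Buffer {
  return Buffer.from(JSON.stringify({ did, event, prevHash }));
}

function hashOf(payload: Buffer, signature: string): string {
  return createHash("sha256").update(payload).update(signature).digest("hex");
}

class AuditLog {
  private entries: AuditEntry[] = [];
  private lastHash = GENESIS;

  // The only mutation offered is append; each entry commits to its predecessor.
  append(did: string, event: string, privateKey: KeyObject): void {
    const payload = payloadOf(did, event, this.lastHash);
    const signature = sign("sha256", payload, privateKey).toString("base64");
    const hash = hashOf(payload, signature);
    this.entries.push({ did, event, prevHash: this.lastHash, signature, hash });
    this.lastHash = hash;
  }

  // Returns copies so callers cannot edit the stored entries.
  snapshot(): AuditEntry[] {
    return this.entries.map((e) => ({ ...e }));
  }
}

// Re-checks every signature and every link in the chain.
function verifyLog(entries: AuditEntry[], keys: Map<string, KeyObject>): boolean {
  let prev = GENESIS;
  for (const e of entries) {
    const key = keys.get(e.did);
    const payload = payloadOf(e.did, e.event, e.prevHash);
    if (!key || e.prevHash !== prev) return false;
    if (!verify("sha256", payload, key, Buffer.from(e.signature, "base64"))) return false;
    if (hashOf(payload, e.signature) !== e.hash) return false;
    prev = e.hash;
  }
  return true;
}
```

Editing, reordering, or removing an earlier entry breaks either a signature or a `prevHash` link, so verification fails from that point on.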

05 Tool Access & Security
Active

Tools are declared, schema-validated, and policy-gated per agent.

Built-in tools (file ops, shell, HTTP, code runner, remote-agent calls) are registered with Zod schemas. No implicit tool access — each tool requires an explicit capability grant. Execution is logged and bounded by the agent's policy.
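The gate-then-validate-then-log order can be sketched without any dependencies. In this example a hand-written `validate` function stands in for a Zod schema's parse step, and the `Tool` interface, the `http_get` tool, and `invoke` are hypothetical names rather than VaultysClaw's registry API.

```typescript
// Hypothetical tool definition; `validate` stands in for a Zod schema.
interface Tool<A> {
  name: string;
  capability: string;
  validate(input: unknown): A | null;
  run(args: A): string;
}

interface HttpArgs {
  url: string;
}

const httpGet: Tool<HttpArgs> = {
  name: "http_get",
  capability: "internet_access",
  validate(input) {
    if (typeof input !== "object" || input === null) return null;
    const url = (input as Record<string, unknown>)["url"];
    return typeof url === "string" && url.startsWith("https://") ? { url } : null;
  },
  run: (args) => `GET ${args.url}`, // placeholder: no real request is made
};

type Outcome = { ok: true; result: string } | { ok: false; reason: string };

// Capability check first, argument validation second, execution last.
// Every outcome is written to the log.
function invoke<A>(tool: Tool<A>, granted: ReadonlySet<string>, input: unknown, log: string[]): Outcome {
  if (!granted.has(tool.capability)) {
    log.push(`${tool.name}: denied`);
    return { ok: false, reason: "missing capability" };
  }
  const args = tool.validate(input);
  if (args === null) {
    log.push(`${tool.name}: rejected`);
    return { ok: false, reason: "invalid arguments" };
  }
  log.push(`${tool.name}: executed`);
  return { ok: true, result: tool.run(args) };
}
```

Checking the capability before parsing means an agent without a grant learns nothing about the tool's argument schema from the error it gets back.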

06 Input Validation
Active

Zod-enforced type-safe contracts at every system boundary.

Intent payloads are validated against strict Zod schemas at the control-plane boundary before dispatch. Type-safe ts-rest contracts on all API routes prevent malformed or injected inputs from reaching agent logic.

07 Agent Memory Protection
Active

Per-agent isolated memory store — no cross-agent access.

Each agent's semantic memory (SQLite + vector index) is fully isolated. Retrieval is scoped to the agent's own store; no agent can query another's memory. Memory summarisation runs inside the agent boundary.
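The scoping rule can be sketched as a store whose only read path is keyed by the caller's own DID. In this example an in-memory map replaces SQLite and a substring match replaces vector search; the `AgentMemory` class is hypothetical.

```typescript
class AgentMemory {
  // One private store per agent DID.
  private stores = new Map<string, string[]>();

  remember(agentDid: string, note: string): void {
    const store = this.stores.get(agentDid) ?? [];
    store.push(note);
    this.stores.set(agentDid, store);
  }

  // Retrieval is keyed by the caller's DID and never touches another store.
  // Substring matching is a stand-in for vector similarity search.
  recall(callerDid: string, query: string): string[] {
    const own = this.stores.get(callerDid) ?? [];
    const needle = query.toLowerCase();
    return own.filter((note) => note.toLowerCase().includes(needle));
  }
}
```

The isolation holds only if `callerDid` comes from the authenticated identity of the requester rather than from a field the requester supplies.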

08 AI Governance Policies
Partial

Signed policy distribution and budget enforcement implemented; LLM output governance in progress.

Policy documents are cryptographically signed and distributed to agents, which verify signatures before storing. Budget caps, capability grants, and workflow-level human approval gates are enforced. Full prompt-injection detection and LLM output content governance are in active development.
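Verify-before-store can be sketched as follows, assuming the agent has the issuer's public key pinned in advance. The `SignedPolicy` shape, `signPolicy`, and `PolicyStore` are hypothetical, and the policy body is an opaque string for simplicity.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

interface SignedPolicy {
  body: string;
  signature: Buffer;
}

// Issuer side: sign the exact bytes of the policy body.
function signPolicy(body: string, issuerPrivateKey: KeyObject): SignedPolicy {
  return { body, signature: sign("sha256", Buffer.from(body), issuerPrivateKey) };
}

// Agent side: a policy is stored only after its signature verifies.
class PolicyStore {
  private current: string | null = null;
  private issuerPublicKey: KeyObject;

  constructor(issuerPublicKey: KeyObject) {
    this.issuerPublicKey = issuerPublicKey;
  }

  accept(policy: SignedPolicy): boolean {
    const valid = verify("sha256", Buffer.from(policy.body), this.issuerPublicKey, policy.signature);
    if (valid) this.current = policy.body;
    return valid;
  }

  get active(): string | null {
    return this.current;
  }
}
```

A rejected policy leaves the previously accepted one in place, so a forged update cannot even clear the agent's current policy.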

09 Credential Protection
Partial

Private keys never leave the agent; LLM API keys are injected via environment variables, with vault integration planned.

VaultysId private keys are generated and stored locally on each agent — they are never transmitted. LLM API keys are currently injected via environment variables at startup. Secrets vault integration (e.g. HashiCorp Vault, AWS Secrets Manager) and automated key rotation are on the near-term roadmap.
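A minimal sketch of reading an LLM key from the environment at startup and failing fast when it is absent. `LLM_API_KEY` is a hypothetical variable name; the page does not say which variables VaultysClaw reads.

```typescript
type Env = Record<string, string | undefined>;

// LLM_API_KEY is a hypothetical name used only for this example.
function loadLlmApiKey(env: Env): string {
  const key = env["LLM_API_KEY"]?.trim();
  if (!key) {
    // Refuse to start rather than run with a missing or blank credential.
    throw new Error("LLM_API_KEY is not set");
  }
  return key;
}
```

In a Node process this would be called once at startup as `loadLlmApiKey(process.env)`. Taking the environment as a parameter keeps the function easy to test.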

10 Integrity & Recovery
Partial

Signed state and WAL recovery in place; distributed consistency is partial.

Agent certificates, policies, and delegation chains are signed and independently verifiable offline. Control-plane state is backed by SQLite in WAL mode with crash recovery. Distributed state consistency, multi-node failover, and automated recovery orchestration are only partially implemented.

11 Behavioural Monitoring
Partial

Token usage and intent logging implemented; anomaly detection in development.

Per-agent token consumption, task history, and intent logs are tracked and surfaced in the control-plane dashboard. Statistical anomaly detection and behavioural baseline alerting are in active development and will ship as part of the observability roadmap.
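To show what a behavioural baseline check could look like, here is a simple z-score test over an agent's past token usage. It is purely illustrative: the page says anomaly detection is still in development and does not describe its method, so the function name, the three-sigma default, and the choice of z-scores are all assumptions.

```typescript
// Flags a sample that lies more than `threshold` sample standard deviations
// from the mean of the agent's history. Hypothetical helper, not a shipped API.
function isAnomalous(history: number[], sample: number, threshold = 3): boolean {
  if (history.length < 2) return false; // not enough data for a baseline
  const mean = history.reduce((sum, x) => sum + x, 0) / history.length;
  const variance =
    history.reduce((sum, x) => sum + (x - mean) ** 2, 0) / (history.length - 1);
  const std = Math.sqrt(variance);
  if (std === 0) return sample !== mean; // flat history: any change stands out
  return Math.abs(sample - mean) / std > threshold;
}
```

With a history of 100, 110, 90, 105, and 95 tokens per task, the mean is 100 and the sample standard deviation is about 7.9, so a task using 104 tokens passes while one using 500 is flagged.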

12 Output Filtering & Data Leak Prevention
Roadmap

Planned: LLM output scanning, PII detection, exfiltration prevention.

Currently, data exposure is limited through capability-gating (agents only access data they have explicit grants for). Dedicated output filtering — automated PII detection, sensitive data redaction, and prompt-injection response scanning — is on the public roadmap.

Open Source · MIT License · Self-hosted

The most complete Zero Trust coverage
for AI agents — out of the box.

No security team to hire. No SPIRE cluster to maintain. Deploy in five minutes and address 11 of 12 Anthropic framework domains on day one: 7 fully and 4 partially.
