Securing Multi-Agent Systems
Multi-agent systems feel like a natural architectural evolution. Instead of one large agent trying to do everything, you compose specialized agents that handle distinct tasks and communicate through structured interfaces. The logic is sound. The security model that teams apply to them is usually the security model for single agents, which doesn't transfer. The hidden tension is that multi-agent …
- Trust propagates implicitly through agent chains; a compromise at any point contaminates all downstream agents unless boundaries are enforced.
- Prompt injection can travel across agent boundaries through synthesized context; schema-constrained outputs are a structural defense.
- Agent-to-agent authentication requires cryptographic identity, not network position.
- Orchestrators are responsible for context minimization and subagents for scope enforcement; both are required.
- Design multi-agent systems so that subagent compromise is a contained incident, not a system-wide failure.
Trust Propagation Makes Compromised Subagents a System-Wide Problem
In a chain of agents (the orchestrator calls agent A, agent A calls agent B, agent B calls agent C), each agent inherits trust from the one above it. If you've granted the orchestrator trust to initiate a workflow, that trust flows downstream unless you explicitly terminate it at each boundary. Most architectures don't terminate it. They forward context and trust implicitly. This means a compromised agent at any point in the chain can contaminate all downstream agents. An injected instruction that s
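One way to terminate trust at a boundary is for the orchestrator to build each subagent's context from an explicit allowlist instead of forwarding everything it holds. The sketch below is illustrative only: the agent names, the context keys, and the `minimal_context` helper are all hypothetical and do not come from any particular framework.

```python
from typing import Any, Dict, FrozenSet, Mapping

# Hypothetical allowlists: the only context keys each subagent may receive.
ALLOWED_CONTEXT_KEYS: Dict[str, FrozenSet[str]] = {
    "summarizer": frozenset({"document_text"}),
    "ticket_writer": frozenset({"summary", "customer_id"}),
}


def minimal_context(agent_name: str, full_context: Mapping[str, Any]) -> Dict[str, Any]:
    """Return only the keys the named agent is allowed to see.

    Agents without an allowlist entry receive an empty context (fail closed),
    so adding a new agent never implicitly grants it the full workflow state.
    """
    allowed = ALLOWED_CONTEXT_KEYS.get(agent_name, frozenset())
    return {key: value for key, value in full_context.items() if key in allowed}


if __name__ == "__main__":
    workflow_state = {"document_text": "hello", "customer_id": 42, "api_key": "secret"}
    print(minimal_context("summarizer", workflow_state))
```

The design choice worth noting is the default: an unknown agent gets nothing, so the boundary holds even when someone forgets to configure it.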
Prompt Injection Travels Across Agent Boundaries Through Context
Cross-agent prompt injection is a specific attack pattern that doesn't exist in single-agent systems. An attacker plants an injected instruction in data that will be processed by a low-privilege agent early in a workflow. That agent's output, now containing the injected instruction in synthesized form, is passed to a higher-privilege agent downstream. The higher-privilege agent acts on the instruction because it came from a trusted upstream agent. The mechanism works because agents process cont
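A schema-constrained boundary can be sketched as a validator that sits between a low-privilege agent's output and the next agent's input. The example assumes a hypothetical triage agent whose output has exactly three fields and no free-text field; the field names, allowed values, and `validate_triage_output` are invented for illustration. The defense only holds to the extent that the schema leaves no room for arbitrary text.

```python
from typing import Any, Dict

# Hypothetical output schema for a low-privilege "triage" agent.
EXPECTED_FIELDS = {"ticket_id": int, "sentiment": str, "priority": int}
ALLOWED_SENTIMENTS = frozenset({"positive", "neutral", "negative"})


class SchemaViolation(ValueError):
    """Raised when agent output does not match the expected schema exactly."""


def validate_triage_output(raw: Any) -> Dict[str, Any]:
    """Accept the output only if it matches the schema exactly.

    Extra fields are rejected rather than ignored, so an injected
    instruction has no field in which to ride along to the next agent.
    """
    if not isinstance(raw, dict) or set(raw) != set(EXPECTED_FIELDS):
        raise SchemaViolation("unexpected or missing fields")
    for name, expected_type in EXPECTED_FIELDS.items():
        value = raw[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            raise SchemaViolation(f"field {name!r} has the wrong type")
    if raw["sentiment"] not in ALLOWED_SENTIMENTS:
        raise SchemaViolation("sentiment is not an allowed value")
    if not 1 <= raw["priority"] <= 5:
        raise SchemaViolation("priority out of range")
    # Rebuild the dict so only validated values are forwarded.
    return {name: raw[name] for name in EXPECTED_FIELDS}
```

In this sketch the orchestrator would forward only the return value of the validator, never the raw agent output.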
Agent-to-Agent Authentication Cannot Rely on Network Position
In traditional microservices, network position provides some implicit trust signal: a service inside the perimeter calling another internal service is assumed to be legitimate. Multi-agent systems can't rely on this. An agent that has been compromised is still inside the network. An injected instruction that causes an agent to make calls to unexpected endpoints doesn't change the network position of those calls. Agent-to-agent authentication requires cryptographic identity, not network position
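As a minimal sketch of cryptographic identity that is task-scoped and short-lived, the example below signs a small claims payload with HMAC-SHA256 from the Python standard library. The token layout, the claim names, and the `issue_token` / `verify_token` helpers are assumptions made for illustration, not an established format. HMAC is used only to keep the sketch dependency-free: with a shared key, any party that can verify a token can also mint one, so a real deployment would more likely use asymmetric signatures so that subagents can verify but not issue.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _sign(key: bytes, payload: bytes) -> str:
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def issue_token(key: bytes, agent_id: str, task_id: str,
                ttl_seconds: int, now: Optional[float] = None) -> str:
    """Issue a token bound to one agent and one task, expiring after ttl_seconds."""
    issued_at = time.time() if now is None else now
    claims = {"agent": agent_id, "task": task_id, "exp": issued_at + ttl_seconds}
    payload = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    return payload.decode() + "." + _sign(key, payload)


def verify_token(key: bytes, token: str, expected_task: str,
                 now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if the signature, expiry, and task scope all check out."""
    parts = token.split(".")
    if len(parts) != 2:
        return None
    payload_b64, signature = parts
    expected = _sign(key, payload_b64.encode())
    # Constant-time comparison on bytes avoids timing side channels.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    # Decode only after the signature has been verified.
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    current = time.time() if now is None else now
    if claims["exp"] < current or claims["task"] != expected_task:
        return None
    return claims
```

Verification here is a local hash computation with no network round trip, and the task check means a token lifted from one workflow is rejected in another.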
Frequently asked questions
- How do you implement agent-to-agent authentication without adding significant latency?
- The standard approach is short-lived signed tokens generated at workflow initialization rather than per-call authentication. The orchestrator issues signed session tokens to subagents at task start, scoped to the task and with a short TTL. Subagents present these tokens, and downstream agents validate the signature, which is a fast local operation…
- Can you use existing service mesh infrastructure to secure agent-to-agent communication?
- Partially. Service mesh infrastructure handles transport security well: mTLS between agent services, network policy, traffic observability. What it doesn't handle is the semantic security layer: validating that agent outputs conform to expected schemas before forwarding, enforcing that context doesn't propagate beyond appropriate boundaries, o…
- How do you handle error propagation in a multi-agent system without leaking context?
- Error messages in multi-agent systems are a surprising information leakage vector. A subagent that returns a detailed error message including context about what it was processing can leak workflow state to an upstream agent that didn't need that information. The practice is to define structured error types that convey failure category without incl…
- What does a multi-agent security review checklist look like?
- The key review questions are: Does each subagent have a defined task scope, and is there an enforcement mechanism that rejects out-of-scope requests? Does the orchestrator pass minimum necessary context to each subagent? Are agent identities cryptographically verified and task-scoped? Is there a schema validation layer between agent outputs and do…
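Two of the points above, rejecting out-of-scope requests and returning structured errors that carry a failure category but no context, can be sketched together. Everything here is hypothetical: the action names, the `ErrorCategory` values, and the stand-in `_summarize` function exist only to make the example runnable.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Union


class ErrorCategory(str, Enum):
    OUT_OF_SCOPE = "out_of_scope"
    INVALID_INPUT = "invalid_input"
    INTERNAL = "internal"


@dataclass(frozen=True)
class AgentError:
    """What goes upstream on failure: a category and a retry hint, nothing else."""
    category: ErrorCategory
    retryable: bool


@dataclass(frozen=True)
class AgentResult:
    value: str


# Hypothetical task scope for this subagent.
ALLOWED_ACTIONS = frozenset({"summarize"})


def _summarize(text: str) -> str:
    # Stand-in for the real work; raises on input it cannot handle.
    if not text.strip():
        raise ValueError("empty document")
    return text[:40]


def handle_request(action: str, payload: str) -> Union[AgentResult, AgentError]:
    """Enforce scope first, then map every failure to a context-free error."""
    if action not in ALLOWED_ACTIONS:
        return AgentError(ErrorCategory.OUT_OF_SCOPE, retryable=False)
    try:
        return AgentResult(_summarize(payload))
    except ValueError:
        # The exception message stays local; only the category is returned.
        return AgentError(ErrorCategory.INVALID_INPUT, retryable=False)
    except Exception:
        return AgentError(ErrorCategory.INTERNAL, retryable=True)
```

Detailed diagnostics would still be written to the subagent's own logs; the point of the sketch is that they never travel back up the chain in the response.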