Introduction: Start Smaller Than You Think
The first time I configured the Atlas Reasoning Engine inside Agentforce, I made the same mistake most architects make. I tried to build a complete multi-agent system before I understood what a single agent could actually do.
Three months later, after working with half a dozen enterprise teams attempting their own Agentforce implementations, I can tell you that mistake is almost universal. Everyone wants the multi-agent architecture before they’ve validated a single agent working reliably in production.
This guide covers what I’ve actually learned: how multi-agent systems inside Salesforce’s Agentforce platform work, when they make sense, and what you need to get right before you start connecting agents together.
The Foundation: What Agentforce Actually Is
Before you can reason about multi-agent architecture, you need a clear mental model of what you’re working with.
Agentforce is Salesforce’s native AI agent platform. Every agent you build runs on the Atlas Reasoning Engine, a proprietary reasoning layer that breaks incoming requests into subtasks, evaluates available actions, selects the appropriate response, executes it, and verifies the output before proceeding to the next step. This isn’t a prompt chain. It’s structured reasoning with built-in self-correction.
The core components of any Agentforce agent are:
- Topics: Bounded domains of responsibility (“Handle billing inquiries”, “Qualify inbound leads”). Each topic defines what the agent is responsible for and what it isn’t.
- Actions: The actual operations an agent can execute within a topic. These can be Salesforce Flows, Apex classes, prompt templates, API calls, or Data Cloud queries.
- Instructions: Natural language guidance that shapes how the Atlas engine prioritises and sequences actions within a topic.
- Data sources: What the agent can see and reference, including CRM records, knowledge articles, Data Cloud unified profiles, and external APIs via MuleSoft.
Understanding this structure matters because multi-agent patterns in Agentforce are essentially ways of organising Topics and Actions across multiple agent configurations, not a separate architectural layer.
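To make that mental model concrete, here is a minimal sketch of the four components as plain Python data structures. This is not an Agentforce API or export format; the class names, fields, and sample values are my own assumptions, chosen only to show how Topics, Actions, Instructions, and data sources nest inside one agent configuration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Action:
    """One operation an agent can execute (hypothetical model, not a Salesforce type)."""
    name: str
    kind: str  # e.g. "flow", "apex", "prompt_template", "api", "data_cloud_query"


@dataclass
class Topic:
    """A bounded domain of responsibility with its own instructions and actions."""
    name: str
    scope: str          # what the topic is responsible for
    out_of_scope: str   # what it explicitly is not responsible for
    instructions: list[str] = field(default_factory=list)
    actions: list[Action] = field(default_factory=list)


@dataclass
class AgentConfig:
    """An agent is a set of topics plus the data it is allowed to reference."""
    name: str
    topics: list[Topic]
    data_sources: list[str]

    def action_names(self) -> list[str]:
        # Flatten every action across every topic, in declaration order.
        return [action.name for topic in self.topics for action in topic.actions]


# Illustrative values only.
billing = Topic(
    name="Billing",
    scope="Handle billing inquiries",
    out_of_scope="Order status and returns",
    instructions=["Verify the customer's identity before discussing invoices."],
    actions=[Action("Lookup Invoice", "flow"), Action("Issue Credit", "apex")],
)
service_agent = AgentConfig(
    name="Service Agent",
    topics=[billing],
    data_sources=["CRM records", "Knowledge articles"],
)

print(service_agent.action_names())
```

Seen this way, a multi-agent design is just several `AgentConfig` objects with a decision about which topics live where.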
Multi-Agent Architecture Patterns in Agentforce
There are three primary patterns I’ve seen work in production Agentforce deployments. Each fits different scenarios.
Pattern 1: Specialised Subagents
One orchestrator agent receives the incoming request and routes it to a specialised subagent based on intent classification. The subagent handles the request within its domain and returns the outcome to the orchestrator.
In Agentforce terms, this maps to distinct Topics configured as separate agents, with the primary agent using Einstein’s built-in intent classification to determine routing. The subagents can be purpose-built service agents, sales agents, or internal support agents, each with its own topics and action set.
When it works well: when your business domains are clearly separated and there’s minimal overlap between what each subagent handles.
Where it breaks: when a single user request spans multiple domains. The routing logic becomes ambiguous and the handoff creates friction.
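The routing logic, including the failure case, can be sketched in a few lines. This is a toy illustration under loud assumptions: the keyword sets stand in for real intent classification, and `billing_agent`, `returns_agent`, and `orchestrate` are hypothetical names, not Agentforce functions.

```python
import re
from typing import Callable

# Hypothetical subagents: each is just a function from request text to a reply.
def billing_agent(request: str) -> str:
    return f"[billing] handling: {request}"

def returns_agent(request: str) -> str:
    return f"[returns] handling: {request}"

SUBAGENTS: dict[str, Callable[[str], str]] = {
    "billing": billing_agent,
    "returns": returns_agent,
}

# Keyword sets are a stand-in for intent classification (assumption for the sketch).
KEYWORDS: dict[str, set[str]] = {
    "billing": {"invoice", "charge", "payment"},
    "returns": {"return", "refund", "exchange"},
}

def classify(request: str) -> list[str]:
    """Return every domain whose keywords appear in the request."""
    words = set(re.findall(r"[a-z]+", request.lower()))
    return sorted(domain for domain, kws in KEYWORDS.items() if words & kws)

def orchestrate(request: str) -> str:
    domains = classify(request)
    if len(domains) == 1:
        return SUBAGENTS[domains[0]](request)
    if not domains:
        return "[orchestrator] no matching domain; escalating to a human"
    # The failure mode described above: one request spans several domains.
    return f"[orchestrator] ambiguous request spans {domains}; asking the user to clarify"

print(orchestrate("Why is there a second charge on my invoice?"))
print(orchestrate("I want a refund for this charge"))
```

The second request matches both domains, so the sketch asks the user to clarify instead of guessing. Whatever classifier you actually use, deciding up front what happens in the multi-match case is the design choice that matters.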
Pattern 2: Sequential Task Chains
Multiple agents operate in sequence, each completing a stage of a larger workflow and passing structured output to the next agent. No single agent has visibility across the whole chain; each processes only its own stage.
This pattern suits complex multi-step business processes where each stage requires different data access, different actions, or different decision logic. A common example: an inbound lead qualification flow where one agent enriches the lead record, a second evaluates fit against ICP criteria, and a third initiates the appropriate outreach sequence.
When it works well: linear processes with clear handoff points and structured outputs at each stage.
Where it breaks: when the process needs to branch based on mid-chain outcomes, or when upstream errors propagate silently downstream.
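Here is a minimal sketch of the lead qualification chain with one guard against silent error propagation: every stage returns a structured result with an explicit `ok` flag, and the chain stops at the first failure. The stage functions, the `StageResult` shape, the hard-coded enrichment value, and the 100-employee ICP threshold are all assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class StageResult:
    ok: bool
    data: dict
    errors: list[str] = field(default_factory=list)


def enrich(lead: dict) -> StageResult:
    if not lead.get("company"):
        return StageResult(False, lead, ["enrich: missing company"])
    # Placeholder enrichment; a real stage would call a data source.
    return StageResult(True, {**lead, "employees": 250})


def evaluate_fit(lead: dict) -> StageResult:
    # Hypothetical ICP rule: at least 100 employees.
    return StageResult(True, {**lead, "icp_fit": lead["employees"] >= 100})


def start_outreach(lead: dict) -> StageResult:
    sequence = "enterprise" if lead["icp_fit"] else "nurture"
    return StageResult(True, {**lead, "sequence": sequence})


STAGES: list[Callable[[dict], StageResult]] = [enrich, evaluate_fit, start_outreach]


def run_chain(lead: dict, stages: list[Callable[[dict], StageResult]]) -> StageResult:
    result = StageResult(True, lead)
    for stage in stages:
        result = stage(result.data)
        if not result.ok:
            # Stop here so downstream stages never treat bad data as valid.
            break
    return result


print(run_chain({"company": "Acme"}, STAGES))
print(run_chain({}, STAGES))
```

The second call halts at the enrichment stage and surfaces the error, rather than letting the fit evaluation run on a record that was never enriched.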
Pattern 3: Parallel Specialist Execution
Multiple agents work simultaneously on different aspects of a single request, with their outputs synthesised by a coordinator. This pattern is less common in current Agentforce deployments because it requires careful output normalisation, but it becomes more relevant when a single agent’s context window cannot hold everything a request needs.
When it works well: research and analysis tasks where different specialists retrieve different data sets that need to be combined into a unified response.
Where it breaks: when the synthesis step is complex and the coordinator doesn’t have enough context to reconcile conflicting agent outputs.
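A small sketch of the pattern, assuming two hypothetical specialists that each return a normalised `{"source": ..., "facts": {...}}` dictionary. The coordinator merges facts the specialists agree on and reports disagreements instead of silently picking a winner. The thread pool is a stand-in for however concurrent agent calls are actually made in your deployment.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Hypothetical specialists returning a shared, normalised output shape.
def crm_specialist(account: str) -> dict:
    return {"source": "crm", "facts": {"tier": "gold", "open_cases": 2}}

def billing_specialist(account: str) -> dict:
    return {"source": "billing", "facts": {"tier": "silver", "overdue": False}}


def synthesise(outputs: list[dict]) -> dict:
    """Merge agreed facts; report conflicting ones per source."""
    seen: dict[str, list[tuple[str, object]]] = {}
    for out in outputs:
        for key, value in out["facts"].items():
            seen.setdefault(key, []).append((out["source"], value))

    facts, conflicts = {}, {}
    for key, entries in seen.items():
        distinct = {value for _, value in entries}  # assumes hashable values
        if len(distinct) == 1:
            facts[key] = entries[0][1]
        else:
            conflicts[key] = dict(entries)  # source -> value
    return {"facts": facts, "conflicts": conflicts}


def run_parallel(account: str, specialists: list[Callable[[str], dict]]) -> dict:
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(spec, account) for spec in specialists]
        outputs = [f.result() for f in futures]  # keeps specialist order
    return synthesise(outputs)


print(run_parallel("ACME-001", [crm_specialist, billing_specialist]))
```

Here the two specialists disagree on `tier`, so it lands in `conflicts` rather than in the merged facts. What the coordinator does next with a conflict is exactly the context problem described above.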
The Practical Architecture Decision Framework
| Pattern | Agentforce Implementation | Best Use Case | Primary Constraint |
|---|---|---|---|
| Specialised subagents | Distinct Topics (Billing, Returns) | Distinct business domains (Sales, Service) | Topic definition overlap |
| Skills (single agent) | Actions (Flows, Apex) inside a Topic | Related tasks (Lookup Order, Update Order) | Context window limits |
| Sequential task chains | Agent handoffs via structured outputs | Linear multi-step processes | Error propagation across stages |
| Parallel specialist execution | Concurrent agent calls + synthesis | Research tasks with multiple data sources | Output normalisation complexity |
What I’ve Learned About What Actually Goes Wrong
The failure modes in multi-agent Agentforce deployments are more predictable than you’d expect. Here are the ones I see repeatedly.
Topic Boundary Ambiguity
The most common failure: a user request that sits at the boundary between two topics. The Atlas engine has to make a routing decision, and if your topic definitions overlap or leave gaps, it will make the wrong call. I’ve seen this most often in service deployments where “orders” and “billing” share conceptual territory.
The fix is almost always to rewrite the topic definitions with more explicit scope language, and to add test cases that specifically probe boundary conditions.
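A boundary test suite can be very small. The sketch below assumes you can wrap your routing in some `classify(utterance)` callable that returns a topic name; how you do that against a real agent is up to your test setup. The utterances, expected topics, and the deliberately naive classifier are illustrative only.

```python
from typing import Callable, Optional

# Utterances chosen to sit on the orders/billing boundary (illustrative).
BOUNDARY_CASES: list[tuple[str, str]] = [
    ("Why was I charged twice for order 1042?", "billing"),
    ("Where is my order?", "orders"),
    ("Can I change the card on an order that hasn't shipped?", "billing"),
]


def run_boundary_suite(
    classify: Callable[[str], Optional[str]],
    cases: list[tuple[str, str]],
) -> list[tuple[str, str, Optional[str]]]:
    """Return (utterance, expected, actual) for every misrouted case."""
    failures = []
    for utterance, expected in cases:
        actual = classify(utterance)
        if actual != expected:
            failures.append((utterance, expected, actual))
    return failures


def naive_classifier(utterance: str) -> Optional[str]:
    """Stand-in router with overlapping scope: anything mentioning an order wins."""
    text = utterance.lower()
    if "order" in text:
        return "orders"
    if "charged" in text or "card" in text:
        return "billing"
    return None


for utterance, expected, actual in run_boundary_suite(naive_classifier, BOUNDARY_CASES):
    print(f"MISROUTED: {utterance!r} expected={expected} actual={actual}")
```

Two of the three cases misroute because the word "order" appears in a billing question. That is the kind of overlap tighter scope language is meant to remove, and the suite gives you a way to confirm the rewrite actually fixed it.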
Context Loss at Handoffs
When one agent passes control to another in a sequential chain, only the structured output travels forward. The reasoning that produced that output doesn’t. If the receiving agent needs to understand why the upstream agent made certain decisions, you need to explicitly encode that reasoning into the handoff payload.
This sounds obvious. In practice, teams consistently under-specify the handoff format until they debug a failure three weeks into a deployment and realise the downstream agent was operating on incomplete context.
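One way to force the issue is to make the rationale a required part of the handoff contract. The sketch below is an assumption about what such a payload could look like, not an Agentforce schema: the `Handoff` fields and the validation rules are mine.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Handoff:
    """Hypothetical handoff payload that carries the 'why' along with the 'what'."""
    from_agent: str
    to_agent: str
    result: dict                      # the structured output itself
    rationale: str                    # why the upstream agent decided this
    evidence: list[str] = field(default_factory=list)        # record or article references
    open_questions: list[str] = field(default_factory=list)  # what upstream could not resolve

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the payload is acceptable."""
        problems = []
        if not self.rationale.strip():
            problems.append("rationale is empty")
        if not self.evidence:
            problems.append("no evidence references")
        return problems


example = Handoff(
    from_agent="lead_enrichment",
    to_agent="icp_evaluation",
    result={"company": "Acme", "employees": 250},
    rationale="Employee count taken from the most recently updated account record.",
    evidence=["account:001-example"],
)

print(example.validate())
print(json.dumps(asdict(example), indent=2))
```

Rejecting a handoff whose `validate()` returns problems is a cheap way to find under-specified payloads during testing instead of three weeks into a deployment.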
Action Overloading Inside Topics
The temptation when building a topic is to give the agent every action that might conceivably be relevant. This creates its own problem: the Atlas engine has to evaluate more options at each decision step, which increases latency and introduces inconsistency. I’ve found that topics with more than six to eight actions start showing unpredictable behaviour on edge-case inputs.
The right answer is to keep topics focused and split broader domains into subagents rather than bloating a single topic’s action set.
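A simple lint over your topic definitions can flag this before it shows up as flaky behaviour. The sketch assumes you can get a topic-to-actions mapping out of your configuration in some form. The limit of 8 is the upper end of my own rule of thumb, not a platform limit, and the topic and action names are made up.

```python
# Heuristic threshold from experience, not an Agentforce limit (assumption).
MAX_ACTIONS_PER_TOPIC = 8


def overloaded_topics(
    topics: dict[str, list[str]],
    limit: int = MAX_ACTIONS_PER_TOPIC,
) -> dict[str, int]:
    """Return {topic name: action count} for every topic above the limit."""
    return {
        name: len(actions)
        for name, actions in topics.items()
        if len(actions) > limit
    }


# Illustrative configuration.
topics = {
    "Order Management": [f"Order Action {i}" for i in range(1, 11)],  # 10 actions
    "Billing": ["Lookup Invoice", "Issue Credit", "Update Payment Method"],
}

print(overloaded_topics(topics))
```

Anything this flags is a candidate for splitting into a separate subagent.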
Grounding Gaps
Agents hallucinate when their grounding is incomplete. In multi-agent systems, this compounds: if a first-stage agent returns a partially hallucinated output, the second-stage agent treats it as ground truth. I’ve seen this produce confidently wrong case resolutions that looked entirely plausible in the transcript.
Data Cloud quality and knowledge article coverage need to be validated before multi-agent systems go to production. This is not optional.
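One partial safeguard between stages is to require that every claim in an upstream output cites a source the system actually knows about. The sketch below assumes a hypothetical output shape in which each claim carries a list of source IDs. It checks that citations exist and are recognised, not that the claim is true, so it narrows the problem without solving it.

```python
# Hypothetical source identifiers the agent is allowed to ground on.
KNOWN_SOURCES = {"KA-1001", "KA-1002", "case:500123"}


def ungrounded_claims(claims: list[dict], known_sources: set[str]) -> list[str]:
    """Return the text of every claim with no citation or an unknown citation."""
    flagged = []
    for claim in claims:
        cited = set(claim.get("sources", []))
        if not cited or not cited <= known_sources:
            flagged.append(claim["text"])
    return flagged


# Illustrative first-stage output.
stage_one_output = [
    {"text": "Customer is on the annual plan.", "sources": ["case:500123"]},
    {"text": "Refund window is 90 days.", "sources": []},
    {"text": "Plan includes premium support.", "sources": ["KA-9999"]},
]

for text in ungrounded_claims(stage_one_output, KNOWN_SOURCES):
    print(f"UNGROUNDED: {text}")
```

Blocking or flagging the handoff when this list is non-empty stops the second-stage agent from treating an uncited statement as ground truth.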
When Multi-Agent Architecture Is Actually the Right Answer
I want to be direct here because I’ve watched too many teams build multi-agent complexity before they’ve earned it.
Multi-agent Agentforce architecture is the right answer when:
- A single agent’s topic scope would need to be so broad that it becomes unreliable
- Different parts of a workflow require access to fundamentally different data and actions
- You need to parallelise work that is genuinely independent
- You have demonstrated that a single agent hits context limits on real production queries
It is not the right answer simply because the architecture diagram looks more impressive, because you anticipate future complexity that hasn’t materialised yet, or because a vendor demo showed a multi-agent flow that looked compelling.
The teams I’ve seen extract the most value from Agentforce all started with a single, tightly scoped agent and only extended into multi-agent patterns when they hit concrete limitations in production.
The Deployment Sequence That Actually Works
If you’re building toward multi-agent Agentforce architecture, here’s the sequence I’d recommend based on what I’ve seen work:
- Single agent, single topic, single action set: prove that the agent handles your core use case reliably. Define evaluation criteria upfront. Run real queries from production data.
- Expand the action set carefully: add actions one at a time, testing after each addition. Monitor for topic routing degradation as the action count grows.
- Add a second topic when you hit scope limits: not before. The second topic should have a clearly different domain and its own explicit action set.
- Introduce agent handoffs only when single-agent scope is genuinely insufficient: design the handoff payload explicitly. Test boundary conditions exhaustively.
- Instrument everything before scaling: Agentforce’s built-in conversation logging is not enough for complex multi-agent systems. You need structured logging at each agent boundary.
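For the last step, here is a minimal sketch of structured logging at an agent boundary using only Python's standard library. It assumes your agent invocations can be wrapped in a function call at all. The decorator name, the log fields, and `qualify_lead` are hypothetical, and the `trace_id` is whatever correlation ID you pass across the chain.

```python
import json
import logging
import time
from functools import wraps

logger = logging.getLogger("agent_boundary")
logger.setLevel(logging.INFO)


def logged_boundary(agent_name: str):
    """Wrap an agent call so each invocation emits one JSON log line."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(payload: dict, trace_id: str):
            start = time.perf_counter()
            status = "error"  # overwritten only if the call completes
            try:
                result = fn(payload)
                status = "ok"
                return result
            finally:
                logger.info(json.dumps({
                    "trace_id": trace_id,
                    "agent": agent_name,
                    "status": status,
                    "duration_ms": round((time.perf_counter() - start) * 1000, 2),
                    "input_keys": sorted(payload),  # keys only, to avoid logging PII
                }))
        return wrapper
    return decorator


@logged_boundary("lead_qualifier")
def qualify_lead(payload: dict) -> dict:
    # Placeholder for a real agent invocation.
    return {**payload, "qualified": True}


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(message)s")
    qualify_lead({"company": "Acme"}, trace_id="trace-001")
```

Because every boundary logs the same `trace_id`, you can reconstruct the full path of a request across agents, including the stage where it failed.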
The Architecture Is Not the Hard Part
I’ve found that most technical teams can figure out the architecture. The harder challenges are almost always upstream: cleaning the CRM data the agents depend on, writing knowledge articles at the right level of specificity, defining what “good” looks like for an agent response before you build anything, and managing the organisational change when autonomous agents start handling interactions that humans previously owned.
Multi-agent Agentforce systems are technically feasible today. Whether your organisation is ready to extract value from them is a different question, and it’s the one worth pressure-testing before you start connecting agents together.
For organisations evaluating their readiness for agentic AI deployment, the practical starting point is an honest assessment of your data foundation, your use case specificity, and your change management capacity. The architecture follows from that. Not the other way around.


