Single Agent Flow
The simplest flow. A user sends a prompt. The agent receives it with bounded context (memory, active checklist, routing notes) and enters the tool loop. It reads memory, calls plugins, executes tools, and returns a result. The entire cycle stays within the token budget.
The agent pulls context on demand rather than receiving the full context wall. Tool schemas load as manifests first, expanding to full schema only when selected. This keeps token usage predictable.
What happens in a single turn:
- User prompt arrives. Agent receives active checklist + routing notes.
- Agent decides which memory tiers to query (BM25 keyword search).
- Agent selects tools from manifest shortlist, expands schemas on demand.
- Tool loop executes with approval gates for write-class operations.
- Results are distilled (fact, preference, decision, todo, constraint) and stored.
- Response returned. Heartbeat continuation queued if work remains.
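
The manifest-first loading described above can be sketched roughly as follows. This is an illustration only: `ToolManifest`, `ToolSchema`, `ToolRegistry`, and the `loadSchema` callback are hypothetical names, not part of the documented API.

```typescript
// Hypothetical shapes: a manifest is a short summary, a schema is the full definition.
interface ToolManifest {
  name: string;
  summary: string;
}

interface ToolSchema extends ToolManifest {
  parameters: Record<string, string>; // parameter name -> type description
}

class ToolRegistry {
  private manifests: ToolManifest[];
  private loadSchema: (name: string) => ToolSchema; // assumed loader, e.g. reads from disk
  private expanded = new Map<string, ToolSchema>();

  constructor(manifests: ToolManifest[], loadSchema: (name: string) => ToolSchema) {
    this.manifests = manifests;
    this.loadSchema = loadSchema;
  }

  // Cheap view that is placed in the prompt on every turn.
  shortlist(): ToolManifest[] {
    return this.manifests;
  }

  // The full schema is loaded only for a tool the agent actually selected.
  expand(name: string): ToolSchema {
    const cached = this.expanded.get(name);
    if (cached) return cached;
    if (!this.manifests.some((m) => m.name === name)) {
      throw new Error(`unknown tool: ${name}`);
    }
    const schema = this.loadSchema(name);
    this.expanded.set(name, schema);
    return schema;
  }

  expandedCount(): number {
    return this.expanded.size;
  }
}
```

Because expansion is cached, the token cost of a turn grows with the number of tools actually used rather than the number of tools installed.
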
Multi-Agent Flow
When a task is too large or spans multiple domains, the orchestrator decomposes it and assigns sub-tasks to specialized agents. Each agent has its own memory scope, tool access, and context budget. Results merge back through the orchestrator.
How coordination works:
- Orchestrator receives the user task and identifies sub-domains.
- Each agent gets a focused sub-task with its own bounded context.
- Agents share a common memory layer (6-tier, BM25) for cross-agent knowledge.
- Handoff continuity preserves state when one agent passes work to another.
- Orchestrator merges results and handles conflict resolution.
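
The shared memory layer is described as BM25-searched. For readers unfamiliar with that ranking function, here is a minimal sketch of standard Okapi BM25 over a few memory entries. It is not the product's implementation: the tokenizer, the parameter defaults (`k1 = 1.2`, `b = 0.75`), and the function names are assumptions chosen for illustration.

```typescript
// Assumed tokenizer: lowercase, split on anything that is not a letter, digit, or underscore.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9_]+/)
    .filter((t) => t.length > 0);
}

// Returns one BM25 score per document, in the same order as `docs`.
function bm25Scores(query: string, docs: string[], k1 = 1.2, b = 0.75): number[] {
  const tokenized = docs.map(tokenize);
  const n = tokenized.length;
  if (n === 0) return [];
  const avgLen = tokenized.reduce((sum, d) => sum + d.length, 0) / n;

  // Document frequency: in how many documents does each term appear?
  const df = new Map<string, number>();
  tokenized.forEach((doc) => {
    Array.from(new Set(doc)).forEach((term) => {
      df.set(term, (df.get(term) ?? 0) + 1);
    });
  });

  const queryTerms = Array.from(new Set(tokenize(query)));
  return tokenized.map((doc) => {
    let score = 0;
    for (const term of queryTerms) {
      const tf = doc.filter((t) => t === term).length;
      if (tf === 0) continue;
      const nt = df.get(term) ?? 0;
      // Non-negative IDF variant commonly used in practice.
      const idf = Math.log(1 + (n - nt + 0.5) / (nt + 0.5));
      const lengthNorm = avgLen > 0 ? doc.length / avgLen : 0;
      score += (idf * tf * (k1 + 1)) / (tf + k1 * (1 - b + b * lengthNorm));
    }
    return score;
  });
}
```

An entry that matches more query terms, or matches rarer terms, scores higher. An entry with no matching terms scores zero.
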
Plugin Swarm Flow
The most powerful pattern. A complex task gets decomposed into parallel sub-tasks, each handled by an agent loaded with a specific plugin. The agents execute concurrently, and results converge at a merge point.
Swarm behavior:
- Task decomposer breaks the request into plugin-aligned sub-tasks.
- Each swarm agent loads one plugin and focuses on its specialty.
- Agents run concurrently with independent context budgets.
- Memory isolation prevents cross-contamination between agents.
- Results merge into a single coherent output.
- Failed agents retry or degrade gracefully without blocking the swarm.
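
The last point, retry or degrade without blocking, can be illustrated with a small sketch. To keep it short and deterministic the tasks here run synchronously one after another; a real swarm would run them concurrently (for example with `Promise.allSettled`). `SwarmTask`, `SwarmResult`, `runWithRetry`, `runSwarm`, and the default of two attempts are all hypothetical.

```typescript
interface SwarmTask {
  plugin: string;
  run: () => string; // stand-in for a plugin-loaded agent doing its sub-task
}

type SwarmResult =
  | { plugin: string; status: "ok"; output: string; attempts: number }
  | { plugin: string; status: "degraded"; reason: string; attempts: number };

// Try a task up to maxAttempts times; on repeated failure, report a degraded result
// instead of throwing, so one failed agent cannot abort the whole swarm.
function runWithRetry(task: SwarmTask, maxAttempts = 2): SwarmResult {
  let lastError = "unknown error";
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return { plugin: task.plugin, status: "ok", output: task.run(), attempts: attempt };
    } catch (err) {
      lastError = err instanceof Error ? err.message : String(err);
    }
  }
  return { plugin: task.plugin, status: "degraded", reason: lastError, attempts: maxAttempts };
}

function runSwarm(tasks: SwarmTask[], maxAttempts = 2): SwarmResult[] {
  return tasks.map((task) => runWithRetry(task, maxAttempts));
}
```

The merge step can then combine the `ok` results and note which parts are missing, rather than failing the entire request.
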
Heartbeat Continuation
Long-running tasks do not stall. When a session reaches its context budget, the heartbeat system captures current progress (active checklist, prompt-state todos) and seeds the next continuation turn. The agent picks up exactly where it left off.
- Heartbeat consumes the active prompt-state summary and todo list first.
- It merges them into next-task intent hints so the next turn continues seamlessly.
- Approval-resume flow supports two-phase approval with deduplication.
- Overflow recovery follows 4 stages: compact, no_tool_result, manifest_only, graceful stop.
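
One way to picture the four overflow stages is as an escalation ladder that stops as soon as the context fits. The sketch below is an assumption-heavy illustration: the stage names come from the list above (with `graceful_stop` written as an identifier), but what each stage actually trims, the `ContextSize` shape, and the reduction ratios are invented for the example.

```typescript
type Stage = "compact" | "no_tool_result" | "manifest_only" | "graceful_stop";

// Hypothetical breakdown of where the tokens in a context go.
interface ContextSize {
  history: number;
  toolResults: number;
  toolSchemas: number;
}

const TRIM_STAGES: Stage[] = ["compact", "no_tool_result", "manifest_only"];

function totalTokens(c: ContextSize): number {
  return c.history + c.toolResults + c.toolSchemas;
}

// Assumed effect of each stage; the real behavior may differ.
function applyStage(stage: Stage, c: ContextSize): ContextSize {
  switch (stage) {
    case "compact":
      return { ...c, history: Math.ceil(c.history / 2) }; // summarize history
    case "no_tool_result":
      return { ...c, toolResults: 0 }; // drop raw tool output
    case "manifest_only":
      return { ...c, toolSchemas: Math.ceil(c.toolSchemas / 10) }; // schemas -> manifests
    default:
      return c;
  }
}

function recoverFromOverflow(
  context: ContextSize,
  budget: number,
): { applied: Stage[]; context: ContextSize; fits: boolean } {
  const applied: Stage[] = [];
  let current = context;
  for (const stage of TRIM_STAGES) {
    if (totalTokens(current) <= budget) break;
    current = applyStage(stage, current);
    applied.push(stage);
  }
  const fits = totalTokens(current) <= budget;
  if (!fits) applied.push("graceful_stop"); // nothing left to trim: stop cleanly
  return { applied, context: current, fits };
}
```

The ordering matters: cheaper, less lossy reductions are tried first, and the turn stops only when none of them is enough.
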
Approval Gates
Write-class operations require user approval before execution. This applies across all flow types.
- Tool approval required for: session spawn, session send, cron add, message send, local command execution.
- Approval persistence prevents repeated prompts for the same operation class.
- Safe-mode bypass available for allowed write roots.
- Denied approvals redirect the agent to alternative approaches.
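
The gate logic above can be sketched as a small decision function. Everything here is illustrative: the operation identifiers are hypothetical spellings of the five operations listed, and `ApprovalGate`, `check`, and `approve` are invented names. Path handling is deliberately naive (no normalization of `..` segments).

```typescript
// Hypothetical identifiers for the five write-class operations listed above.
const WRITE_CLASS_OPS = [
  "session_spawn",
  "session_send",
  "cron_add",
  "message_send",
  "local_command",
];

type Decision = "allow" | "needs_approval";

class ApprovalGate {
  private approvedOps = new Set<string>(); // persisted approvals, per operation class
  private allowedWriteRoots: string[];

  constructor(allowedWriteRoots: string[] = []) {
    this.allowedWriteRoots = allowedWriteRoots;
  }

  check(op: string, targetPath?: string): Decision {
    if (WRITE_CLASS_OPS.indexOf(op) === -1) return "allow"; // not write-class: no gate
    if (this.approvedOps.has(op)) return "allow"; // already approved for this class
    if (targetPath !== undefined && this.isUnderAllowedRoot(targetPath)) {
      return "allow"; // safe-mode bypass for allowed write roots
    }
    return "needs_approval";
  }

  // Record a user approval so the same operation class does not prompt again.
  approve(op: string): void {
    this.approvedOps.add(op);
  }

  private isUnderAllowedRoot(path: string): boolean {
    return this.allowedWriteRoots.some(
      (root) => path === root || path.startsWith(root + "/"),
    );
  }
}
```

A caller that receives `needs_approval` would pause the tool call and prompt the user. On denial, the agent is expected to look for an alternative approach instead of retrying the same call.
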
Story: "Todd's Report"
An end-to-end walkthrough of a swarm request to show how all the pieces connect.
- Todd types “Research AI trends and draft a report”
The request enters via the chat endpoint as a standard message. His session has an authenticated profile and an agent_default agentId.
- Rate limiter checks
This is his 3rd request this minute, well under the 60 req/min limit. Pass.
- ExecutionManager queues the job
The job is added to the queue. If maxConcurrent slots are full it waits, otherwise it dispatches immediately.
- Agent picks up and loads its own memory
The agent loads its scoped context via getFocusBlock(agentId). This returns only this agent’s todo.json and routing_note.json. No other agent’s data.
- Swarm plugin decomposes the task
The swarm plugin recognizes this as a multi-part task and creates a DAG: “research” (parallelizable) + “draft” (sequential, depends on research).
- SpawnGuard checks pass, subagents spawn
SpawnGuard verifies: depth (0 < 2), children (2 < 4), global count (2 < 24). Two research subagents spawn in parallel.
- Results merge with conflict detection
Both research subagents return findings. The Merge Engine combines them, checks for duplicate or conflicting sections, and feeds the merged context into the draft agent. The draft agent produces the final markdown report.
- Todd gets the final report
The completed report is returned. Telemetry records which subagent handled which part and aggregates per-agent cost for billing and audit.
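
The DAG in this story (two parallel research tasks, then a dependent draft) can be turned into execution "waves" with a simple dependency check. This is a sketch under assumptions: `DagNode`, `executionWaves`, and the node ids are hypothetical, and the real swarm plugin may schedule differently.

```typescript
interface DagNode {
  id: string;
  dependsOn: string[];
}

// Group nodes into waves: every node in a wave has all of its dependencies
// satisfied by earlier waves, so nodes within a wave can run in parallel.
function executionWaves(nodes: DagNode[]): string[][] {
  const done = new Set<string>();
  let remaining = nodes.slice();
  const waves: string[][] = [];
  while (remaining.length > 0) {
    const ready = remaining.filter((n) => n.dependsOn.every((d) => done.has(d)));
    if (ready.length === 0) {
      throw new Error("cycle or missing dependency in DAG");
    }
    waves.push(ready.map((n) => n.id));
    ready.forEach((n) => done.add(n.id));
    remaining = remaining.filter((n) => !done.has(n.id));
  }
  return waves;
}

// Todd's request, expressed as a DAG with hypothetical node ids.
const toddsReport: DagNode[] = [
  { id: "research_1", dependsOn: [] },
  { id: "research_2", dependsOn: [] },
  { id: "draft", dependsOn: ["research_1", "research_2"] },
];
```

For `toddsReport`, the first wave holds both research tasks and the second wave holds the draft, which matches the walkthrough.
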
Lane Locking and Memory Isolation
The ExecutionManager maintains per-agent lanes. Two requests for the same agentId are serialized (queued behind each other). Two requests for different agentIds run in parallel up to maxConcurrent.
- Memory is scoped per agent. Each agent’s memory files (todo.json, routing_note.json) live in agent-specific directories. getFocusBlock("agent_alpha") returns only Alpha’s context. It cannot see Beta’s data.
- No data leakage. Validated by memory.agent_isolation.test.ts. Even under concurrent execution, agents never read or write each other’s state.
- Dynamic concurrency. maxConcurrent can be changed at runtime via API. No restart required.
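
Lane locking can be pictured as a queue plus a "one running job per agentId" rule. The sketch below is a simplified, synchronous model, not the ExecutionManager itself: `LaneScheduler`, `Job`, `dispatch`, and `complete` are hypothetical names, and the queue size limit is omitted.

```typescript
interface Job {
  id: string;
  agentId: string;
}

class LaneScheduler {
  maxConcurrent: number; // may be changed at runtime
  private queue: Job[] = [];
  private running = new Map<string, Job>(); // agentId -> job currently holding the lane

  constructor(maxConcurrent = 1) {
    this.maxConcurrent = maxConcurrent;
  }

  enqueue(job: Job): void {
    this.queue.push(job);
  }

  // Start every queued job whose lane is free, in FIFO order, up to maxConcurrent.
  dispatch(): Job[] {
    const started: Job[] = [];
    const waiting: Job[] = [];
    for (const job of this.queue) {
      const laneFree = !this.running.has(job.agentId);
      if (laneFree && this.running.size < this.maxConcurrent) {
        this.running.set(job.agentId, job);
        started.push(job);
      } else {
        waiting.push(job); // same-agent jobs stay queued behind the running one
      }
    }
    this.queue = waiting;
    return started;
  }

  // Release the lane when an agent's job finishes.
  complete(agentId: string): void {
    this.running.delete(agentId);
  }
}
```

Two jobs for the same agent never overlap, while jobs for different agents fill the available slots. Raising `maxConcurrent` takes effect on the next `dispatch()` call.
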
Spawn Guard Limits
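Before any subagent spawns, SpawnGuard enforces three independent hard limits: spawn depth, children per parent, and global subagent count. The table also lists the ExecutionManager's queue size and concurrency settings. All values are clamped to safe ranges even if the API caller sends out-of-bounds values.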
| Limit | Range | Default | What it prevents |
|---|---|---|---|
| Spawn depth | 1 to 3 | 2 | Unbounded recursive nesting |
| Children per parent | 1 to 4 | 4 | Fan-out explosion at any single node |
| Global subagent count | 1 to 256 | 24 | Runaway total agent count across the swarm |
| Queue size | - | 200 | Unbounded job accumulation |
| Max concurrent | 1 to 10 | 1 | Resource exhaustion from parallel execution |
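
The clamping and the three spawn checks can be sketched as follows. The ranges and defaults are taken from the table; the names (`SPAWN_LIMITS`, `clampLimit`, `canSpawn`, `SpawnState`, `SpawnConfig`) and the exact comparison semantics are assumptions for illustration.

```typescript
interface LimitRange {
  min: number;
  max: number;
  fallback: number; // default used when no valid value is supplied
}

// Ranges and defaults from the table above.
const SPAWN_LIMITS = {
  maxDepth: { min: 1, max: 3, fallback: 2 },
  maxChildren: { min: 1, max: 4, fallback: 4 },
  maxGlobal: { min: 1, max: 256, fallback: 24 },
};

// Clamp a caller-supplied value into its safe range; fall back to the default
// when the value is missing or not a finite number.
function clampLimit(value: number | undefined, range: LimitRange): number {
  if (value === undefined || !Number.isFinite(value)) return range.fallback;
  return Math.min(range.max, Math.max(range.min, Math.floor(value)));
}

interface SpawnState {
  depth: number; // nesting depth of the would-be parent
  childrenOfParent: number; // children the parent already has
  globalCount: number; // subagents alive across the swarm
}

interface SpawnConfig {
  maxDepth?: number;
  maxChildren?: number;
  maxGlobal?: number;
}

function canSpawn(state: SpawnState, config: SpawnConfig = {}): { ok: boolean; reason: string } {
  const maxDepth = clampLimit(config.maxDepth, SPAWN_LIMITS.maxDepth);
  const maxChildren = clampLimit(config.maxChildren, SPAWN_LIMITS.maxChildren);
  const maxGlobal = clampLimit(config.maxGlobal, SPAWN_LIMITS.maxGlobal);

  if (state.depth >= maxDepth) return { ok: false, reason: "depth" };
  if (state.childrenOfParent >= maxChildren) return { ok: false, reason: "children" };
  if (state.globalCount >= maxGlobal) return { ok: false, reason: "global" };
  return { ok: true, reason: "" };
}
```

Each check is independent, so a spawn is rejected as soon as any one limit is reached. A request such as `maxDepth: 999` is clamped to 3 before the comparison runs.
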
Rate Limits
All rate limits are enforced at the server level. These values are verified against the ClawMagic server codebase.
| Endpoint | Limit |
|---|---|
| Chat API | 60 req/min per IP |
| Global API | 120 req/min per IP |
| Auth lockout | 10 failures trigger a 5 min lock |
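
To make the per-IP limits concrete, here is a minimal fixed-window counter configured for the chat limit of 60 requests per minute. The document does not say which limiting algorithm the server uses, so the fixed-window approach, the class name `FixedWindowLimiter`, and the explicit `nowMs` argument (used so the example is deterministic) are all assumptions.

```typescript
class FixedWindowLimiter {
  private limit: number;
  private windowMs: number;
  private windows = new Map<string, { start: number; count: number }>();

  constructor(limit: number, windowMs = 60000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed, false if the IP is over its limit.
  allow(ip: string, nowMs: number): boolean {
    const w = this.windows.get(ip);
    if (!w || nowMs - w.start >= this.windowMs) {
      // First request from this IP, or the previous window has expired.
      this.windows.set(ip, { start: nowMs, count: 1 });
      return true;
    }
    if (w.count >= this.limit) return false;
    w.count += 1;
    return true;
  }
}

// Chat API limit from the table: 60 requests per minute per IP.
const chatLimiter = new FixedWindowLimiter(60);
```

In the walkthrough, Todd's third request in a minute would simply increment his counter to 3 and pass.
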
Architecture Guarantees
These guarantees are validated by 78 tests that run on every CI build.
- Agents run in parallel with isolated memory. maxConcurrent controls how many agents execute simultaneously. Each operates on its own todo.json and routing_note.json.
- No data leakage between agents. Scoping is enforced at the storage layer, not just by convention.
- All limits are clamped to safe ranges. Even if an API caller sends maxDepth: 999, it gets clamped to 3. Defensive by default.
- Swarm DAG with conflict detection. The Merge Engine collects results, detects conflicts (two agents writing the same section), and produces unified output.
- Social discussion pools. Agents can discuss and debate before converging on an answer when the task benefits from multiple perspectives.
- Per-agent cost tracking. Delegation edges carry per-agent token and cost telemetry for billing and audit.
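
The conflict-detection guarantee (two agents writing the same section) can be sketched as a merge that keeps the first write per section and flags disagreements. This is not the Merge Engine's actual algorithm: `SectionWrite`, `mergeSections`, and the keep-first policy are assumptions, and the sketch assumes each agent writes a given section at most once.

```typescript
interface SectionWrite {
  agentId: string;
  section: string;
  content: string;
}

interface MergeResult {
  sections: SectionWrite[]; // one entry per section, in first-seen order
  conflicts: string[]; // names of sections where agents disagreed
}

function mergeSections(writes: SectionWrite[]): MergeResult {
  const bySection = new Map<string, SectionWrite>();
  const sections: SectionWrite[] = [];
  const conflicts: string[] = [];

  for (const write of writes) {
    const existing = bySection.get(write.section);
    if (!existing) {
      bySection.set(write.section, write);
      sections.push(write);
      continue;
    }
    if (existing.content === write.content) continue; // exact duplicate: drop silently
    // Different content for the same section: keep the first write, flag the conflict.
    if (conflicts.indexOf(write.section) === -1) conflicts.push(write.section);
  }
  return { sections, conflicts };
}
```

A caller could surface `conflicts` to the orchestrator or a reviewing agent instead of silently choosing a winner.
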