ClawMagic

Single Agent Flow

The simplest flow. A user sends a prompt. The agent receives it with bounded context (memory, active checklist, routing notes) and enters the tool loop. It reads memory, calls plugins, executes tools, and returns a result. The entire cycle stays within the agent's token budget.

The agent pulls context on demand rather than receiving the full context wall. Tool schemas load as manifests first, expanding to full schema only when selected. This keeps token usage predictable.
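The manifest-first idea can be sketched as a small registry that keeps only names and summaries in view and loads a full schema the first time a tool is selected. This is a minimal sketch, not ClawMagic's implementation: ToolManifest, ToolSchema, ToolRegistry, and the loadSchema callback are all hypothetical names introduced for illustration.

```typescript
// Hypothetical shapes: the real manifest and schema formats are not shown in this document.
interface ToolManifest {
  name: string;
  summary: string; // one-line description, cheap to keep in context
}

interface ToolSchema extends ToolManifest {
  parameters: Record<string, string>; // parameter name -> type description
}

class ToolRegistry {
  private manifests: ToolManifest[];
  private loadSchema: (name: string) => ToolSchema;
  private expanded = new Map<string, ToolSchema>();

  constructor(manifests: ToolManifest[], loadSchema: (name: string) => ToolSchema) {
    this.manifests = manifests;
    this.loadSchema = loadSchema;
  }

  // The shortlist the agent sees on every turn: names and summaries only.
  shortlist(): ToolManifest[] {
    return this.manifests;
  }

  // Expand a tool to its full schema only once the agent has selected it.
  expand(name: string): ToolSchema {
    let schema = this.expanded.get(name);
    if (schema === undefined) {
      schema = this.loadSchema(name);
      this.expanded.set(name, schema);
    }
    return schema;
  }

  // Number of tools whose full schema is currently loaded.
  expandedCount(): number {
    return this.expanded.size;
  }
}
```

Because expansion is cached, selecting the same tool twice costs one schema load, which is what keeps the per-turn token cost predictable in this sketch.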

What happens in a single turn:

User prompt arrives. Agent receives active checklist + routing notes.
Agent decides which memory tiers to query (BM25 keyword search).
Agent selects tools from manifest shortlist, expands schemas on demand.
Tool loop executes with approval gates for write-class operations.
Results are distilled (fact, preference, decision, todo, constraint) and stored.
Response returned. Heartbeat continuation queued if work remains.
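The turn steps above name BM25 keyword search for memory lookups without showing how entries are ranked. The following is a generic Okapi BM25 sketch over a small in-memory list of memory entries. The tokenizer, the k1 and b defaults, and the function names are assumptions chosen for illustration; they are not taken from ClawMagic.

```typescript
// Split text into lowercase alphanumeric terms.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Score every document against the query with Okapi BM25.
// k1 and b are commonly used defaults, not ClawMagic settings.
function bm25Scores(query: string, docs: string[], k1 = 1.2, b = 0.75): number[] {
  const tokenized = docs.map(tokenize);
  const avgLen = tokenized.reduce((sum, d) => sum + d.length, 0) / Math.max(docs.length, 1);
  const terms = tokenize(query);
  return tokenized.map((doc) => {
    let score = 0;
    for (const term of terms) {
      const tf = doc.filter((t) => t === term).length; // term frequency in this doc
      if (tf === 0) continue;
      const df = tokenized.filter((d) => d.indexOf(term) !== -1).length; // docs containing the term
      const idf = Math.log(1 + (docs.length - df + 0.5) / (df + 0.5));
      score += (idf * tf * (k1 + 1)) / (tf + k1 * (1 - b + (b * doc.length) / avgLen));
    }
    return score;
  });
}
```

A caller would sort memory entries by score and pass only the top few into the agent's bounded context.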

Multi-Agent Flow

When a task is too large or spans multiple domains, the orchestrator decomposes it and assigns sub-tasks to specialized agents. Each agent has its own memory scope, tool access, and context budget. Results merge back through the orchestrator.

How coordination works:

Orchestrator receives the user task and identifies sub-domains.
Each agent gets a focused sub-task with its own bounded context.
Agents share a common memory layer (6-tier, BM25) for cross-agent knowledge.
Handoff continuity preserves state when one agent passes work to another.
Orchestrator merges results and handles conflict resolution.

Plugin Swarm Flow

The most powerful pattern. A complex task gets decomposed into parallel sub-tasks, each handled by an agent loaded with a specific plugin. The agents execute concurrently, and results converge at a merge point.

Swarm behavior:

Task decomposer breaks the request into plugin-aligned sub-tasks.
Each swarm agent loads one plugin and focuses on its specialty.
Agents run concurrently with independent context budgets.
Memory isolation prevents cross-contamination between agents.
Results merge into a single coherent output.
Failed agents retry or degrade gracefully without blocking the swarm.
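The retry-or-degrade behavior above can be sketched with plain promises: every swarm agent runs concurrently, each one retries on its own, and a failed agent yields a marked result instead of rejecting the whole batch. SwarmResult, runWithRetry, and runSwarm are hypothetical names, and the retry count is an assumed parameter.

```typescript
interface SwarmResult {
  plugin: string;
  ok: boolean;
  output: string;
}

// Run one plugin-aligned sub-task, retrying up to maxRetries times before degrading.
async function runWithRetry(
  plugin: string,
  task: () => Promise<string>,
  maxRetries: number,
): Promise<SwarmResult> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return { plugin, ok: true, output: await task() };
    } catch {
      // Swallow the error and fall through to the next attempt.
    }
  }
  // Degrade: report the failure without throwing, so sibling agents are unaffected.
  return { plugin, ok: false, output: "" };
}

// Start every agent at once; Promise.all never rejects here because
// runWithRetry converts failures into ok: false results.
async function runSwarm(
  tasks: { plugin: string; run: () => Promise<string> }[],
  maxRetries = 1,
): Promise<SwarmResult[]> {
  return Promise.all(tasks.map((t) => runWithRetry(t.plugin, t.run, maxRetries)));
}
```

The merge step can then decide what to do with ok: false entries, for example omitting that section or flagging it in the output.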

Heartbeat Continuation

Long-running tasks do not stall. When a session reaches its context budget, the heartbeat system captures current progress (active checklist, prompt-state todos) and seeds the next continuation turn. The agent picks up exactly where it left off.

Heartbeat consumes the active prompt-state summary and todo list first.
It merges them into next-task intent hints for seamless continuation.
Approval-resume flow supports two-phase approval with deduplication.
Overflow recovery follows 4 stages: compact, no_tool_result, manifest_only, graceful stop.
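The four overflow stages can be read as an escalation ladder: apply the mildest reduction first and stop as soon as the context fits. The sketch below keeps the stage names from the list above (with graceful stop written as graceful_stop to make it an identifier), but what each stage actually removes is an assumption inferred from its name. ContextState and the shrink ratios are hypothetical.

```typescript
type OverflowStage = "compact" | "no_tool_result" | "manifest_only" | "graceful_stop";

// Hypothetical token accounting for one session.
interface ContextState {
  historyTokens: number;
  toolResultTokens: number;
  schemaTokens: number;
}

const STAGES: OverflowStage[] = ["compact", "no_tool_result", "manifest_only", "graceful_stop"];

function totalTokens(s: ContextState): number {
  return s.historyTokens + s.toolResultTokens + s.schemaTokens;
}

// Assumed effect of each stage; the real behavior is not documented here.
function applyStage(stage: OverflowStage, s: ContextState): ContextState {
  switch (stage) {
    case "compact":
      return { ...s, historyTokens: Math.ceil(s.historyTokens / 2) }; // summarize history
    case "no_tool_result":
      return { ...s, toolResultTokens: 0 }; // drop tool result payloads
    case "manifest_only":
      return { ...s, schemaTokens: Math.ceil(s.schemaTokens / 10) }; // collapse schemas to manifests
    case "graceful_stop":
      return s; // nothing left to trim; the caller ends the turn
  }
}

// Escalate through the stages until the context fits the budget.
function recover(
  state: ContextState,
  budget: number,
): { stage: OverflowStage | "none"; state: ContextState } {
  if (totalTokens(state) <= budget) return { stage: "none", state };
  let current = state;
  for (const stage of STAGES) {
    current = applyStage(stage, current);
    if (stage === "graceful_stop" || totalTokens(current) <= budget) {
      return { stage, state: current };
    }
  }
  return { stage: "graceful_stop", state: current };
}
```

Each stage builds on the previous one, so a session that reaches manifest_only has already been compacted and stripped of tool results.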

Approval Gates

Write-class operations require user approval before execution. This applies across all flow types.

Tool approval is required for: session spawn, session send, cron add, message send, local command execution.
Approval persistence prevents repeated prompts for the same operation class.
A safe-mode bypass is available for allowed write roots.
Denied approvals redirect the agent to alternative approaches.
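A gate with approval persistence might look like the sketch below: the first request for an operation class prompts the user, and an approval is remembered so the same class does not prompt again. The WriteOp identifiers mirror the list above, but the ApprovalGate class, its ask callback, and the choice to persist approvals only (not denials) are assumptions.

```typescript
// Operation classes from the list above, written as hypothetical identifiers.
type WriteOp = "session_spawn" | "session_send" | "cron_add" | "message_send" | "local_command";

type Decision = "approved" | "denied";

class ApprovalGate {
  private approved = new Map<WriteOp, true>();
  private ask: (op: WriteOp) => Decision;
  prompts = 0; // how many times the user was actually asked

  constructor(ask: (op: WriteOp) => Decision) {
    this.ask = ask;
  }

  // Returns the decision for this operation class, prompting only when needed.
  check(op: WriteOp): Decision {
    if (this.approved.get(op) === true) return "approved"; // persisted approval
    this.prompts++;
    const decision = this.ask(op);
    if (decision === "approved") this.approved.set(op, true);
    return decision;
  }
}
```

On a "denied" result the caller would skip the write and look for an alternative approach, as described above.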

Story: "Todd's Report"

An end-to-end walkthrough of a swarm request to show how all the pieces connect.

Todd types “Research AI trends and draft a report”. The request enters via the chat endpoint as a standard message. His session has an authenticated profile and an agent_default agentId.
Rate limiter checks. This is his 3rd request this minute, well under the 60 req/min limit. Pass.
ExecutionManager queues the job. The job is added to the queue. If maxConcurrent slots are full it waits, otherwise it dispatches immediately.
Agent picks up and loads its own memory. The agent loads its scoped context via getFocusBlock(agentId). This returns only this agent’s todo.json and routing_note.json. No other agent’s data.
Swarm plugin decomposes the task. The swarm plugin recognizes this as a multi-part task and creates a DAG: “research” (parallelizable) + “draft” (sequential, depends on research).
SpawnGuard checks pass, subagents spawn. SpawnGuard verifies: depth (0 < 2), children (2 < 4), global count (2 < 24). Two research subagents spawn in parallel.
Results merge with conflict detection. Both research subagents return findings. The Merge Engine combines them, checks for duplicate or conflicting sections, and feeds the merged context into the draft agent. The draft agent produces the final markdown report.
Todd gets the final report. The completed report is returned. Telemetry records which subagent handled which part and aggregates per-agent cost for billing and audit.
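The DAG in this walkthrough (two parallel research nodes feeding one draft node) can be turned into execution waves with a simple dependency check: every node whose dependencies are finished runs in the same wave. DagNode, executionWaves, and the node ids below are hypothetical; the swarm plugin's real data structures are not shown in this document.

```typescript
interface DagNode {
  id: string;
  dependsOn: string[]; // ids that must finish before this node starts
}

// Group nodes into waves. Nodes in one wave have no unmet dependencies
// and can run in parallel; waves run one after another.
function executionWaves(nodes: DagNode[]): string[][] {
  const done: string[] = [];
  const waves: string[][] = [];
  let remaining = nodes.slice();
  while (remaining.length > 0) {
    const ready = remaining.filter((n) => n.dependsOn.every((d) => done.indexOf(d) !== -1));
    if (ready.length === 0) {
      throw new Error("cycle or missing dependency in task graph");
    }
    waves.push(ready.map((n) => n.id));
    for (const n of ready) done.push(n.id);
    remaining = remaining.filter((n) => ready.indexOf(n) === -1);
  }
  return waves;
}

// Todd's request, modeled as two research nodes and one dependent draft node.
const toddWaves = executionWaves([
  { id: "research_a", dependsOn: [] },
  { id: "research_b", dependsOn: [] },
  { id: "draft", dependsOn: ["research_a", "research_b"] },
]);
// toddWaves is [["research_a", "research_b"], ["draft"]]
```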

Lane Locking and Memory Isolation

The ExecutionManager maintains per-agent lanes. Two requests for the same agentId are serialized (queued behind each other). Two requests for different agentIds run in parallel up to maxConcurrent.

Memory is scoped per agent. Each agent’s memory files (todo.json, routing_note.json) live in agent-specific directories. getFocusBlock("agent_alpha") returns only Alpha’s context. It cannot see Beta’s data.
No data leakage. Validated by memory.agent_isolation.test.ts. Even under concurrent execution, agents never read or write each other’s state.
Dynamic concurrency. maxConcurrent can be changed at runtime via API. No restart required.
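The lane rule can be expressed as a pure selection function: a queued job may start only if no job for the same agentId is running and the total stays within maxConcurrent. This is a sketch of that rule under assumed names (Job, dispatchable); it is not the ExecutionManager's actual code.

```typescript
interface Job {
  id: string;
  agentId: string;
}

// Pick the queued jobs that may start now: at most one running job per agentId
// (the lane lock) and at most maxConcurrent running jobs overall.
function dispatchable(queue: Job[], running: Job[], maxConcurrent: number): Job[] {
  const busyLanes = running.map((j) => j.agentId);
  const picked: Job[] = [];
  for (const job of queue) {
    if (running.length + picked.length >= maxConcurrent) break; // no free slots
    if (busyLanes.indexOf(job.agentId) !== -1) continue; // lane already occupied
    busyLanes.push(job.agentId);
    picked.push(job);
  }
  return picked;
}
```

Because maxConcurrent is passed in on every call, changing it at runtime takes effect on the next scheduling pass without any restart in this sketch.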

Spawn Guard Limits

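Before any subagent spawns, SpawnGuard enforces three independent hard limits: spawn depth, children per parent, and global subagent count. The last two rows of the table (queue size and max concurrent) are ExecutionManager queue settings, listed alongside for reference. All values are clamped to safe ranges even if the API caller sends out-of-bounds values.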

Limit	Range	Default	What it prevents
Spawn depth	1 to 3	2	Unbounded recursive nesting
Children per parent	1 to 4	4	Fan-out explosion at any single node
Global subagent count	1 to 256	24	Runaway total agent count across the swarm
Queue size	-	200	Unbounded job accumulation
Max concurrent	1 to 10	1	Resource exhaustion from parallel execution
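Clamping and the spawn check can be sketched as follows, using the ranges and defaults from the first three table rows. The maxDepth name appears elsewhere in this document; maxChildren, maxGlobal, clampLimits, and canSpawn are hypothetical. The strict less-than comparisons follow the form of the checks in the walkthrough (for example, depth 0 < 2).

```typescript
interface SpawnLimits {
  maxDepth: number;
  maxChildren: number;
  maxGlobal: number;
}

function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

// Apply defaults, then force every value into its documented range.
function clampLimits(requested: Partial<SpawnLimits>): SpawnLimits {
  return {
    maxDepth: clamp(requested.maxDepth ?? 2, 1, 3),
    maxChildren: clamp(requested.maxChildren ?? 4, 1, 4),
    maxGlobal: clamp(requested.maxGlobal ?? 24, 1, 256),
  };
}

// All three limits must pass independently before a subagent may spawn.
function canSpawn(
  limits: SpawnLimits,
  parentDepth: number,
  currentChildren: number,
  globalCount: number,
): boolean {
  return (
    parentDepth < limits.maxDepth &&
    currentChildren < limits.maxChildren &&
    globalCount < limits.maxGlobal
  );
}
```

With this shape, a request for maxDepth: 999 resolves to 3 before any check runs, so an out-of-range value can never loosen the guard.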

Rate Limits

All rate limits are enforced at the server level. These values are verified against the ClawMagic server codebase.

Endpoint	Limit
Chat API	60 req/min per IP
Global API	120 req/min per IP
Auth lockout	10 failures trigger a 5 min lock
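This document gives the limits but not the algorithm behind them. One common way to enforce a per-IP limit such as 60 req/min is a sliding-window log, sketched below. SlidingWindowLimiter is a hypothetical class, and the choice of a sliding window (rather than a fixed window or token bucket) is an assumption. The clock is passed in explicitly so the behavior is easy to check.

```typescript
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>(); // ip -> timestamps of accepted requests
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true and records the request if the IP is under its limit.
  allow(ip: string, nowMs: number): boolean {
    const recent = (this.hits.get(ip) ?? []).filter((t) => nowMs - t < this.windowMs);
    const allowed = recent.length < this.limit;
    if (allowed) recent.push(nowMs);
    this.hits.set(ip, recent);
    return allowed;
  }
}

// Chat API limit from the table: 60 requests per minute per IP.
const chatLimiter = new SlidingWindowLimiter(60, 60_000);
```

Rejected requests are not recorded, so a client that keeps retrying while blocked does not extend its own lockout in this sketch.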

Architecture Guarantees

These guarantees are validated by 78 tests that run on every CI build.

Agents run in parallel with isolated memory. maxConcurrent controls how many agents execute simultaneously. Each operates on its own todo.json and routing_note.json.
No data leakage between agents. Scoping is enforced at the storage layer, not just by convention.
All limits are clamped to safe ranges. Even if an API caller sends maxDepth: 999, it gets clamped to 3. Defensive by default.
Swarm DAG with conflict detection. The Merge Engine collects results, detects conflicts (two agents writing the same section), and produces unified output.
Social discussion pools. Agents can discuss and debate before converging on an answer when the task benefits from multiple perspectives.
Per-agent cost tracking. Delegation edges carry per-agent token and cost telemetry for billing and audit.
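Conflict detection of the kind described above (two agents writing the same section) can be sketched as a single pass over the collected results. SectionResult, MergeReport, and mergeSections are hypothetical names, and the resolution policy shown here (keep the first writer, report the section as conflicting, drop exact duplicates) is an assumption rather than the Merge Engine's documented behavior.

```typescript
interface SectionResult {
  agentId: string;
  section: string;
  content: string;
}

interface MergeReport {
  sections: SectionResult[]; // one entry per section, in first-seen order
  conflicts: string[]; // section names written with differing content
}

function mergeSections(results: SectionResult[]): MergeReport {
  const bySection = new Map<string, SectionResult>();
  const sections: SectionResult[] = [];
  const conflicts: string[] = [];
  for (const r of results) {
    const existing = bySection.get(r.section);
    if (existing === undefined) {
      // First writer for this section is kept.
      bySection.set(r.section, r);
      sections.push(r);
    } else if (existing.content !== r.content && conflicts.indexOf(r.section) === -1) {
      // A later writer disagrees: record the conflict once.
      conflicts.push(r.section);
    }
    // Identical content from a second agent is a duplicate and is dropped.
  }
  return { sections, conflicts };
}
```

A caller could route each conflicting section back to the orchestrator for resolution while passing the clean sections straight through.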