## 6-Tier Architecture
Memory is organized into six tiers, each serving a distinct purpose. The agent queries only the tiers relevant to the current task, keeping token usage low.
| Tier | Purpose | Contains | Priority |
|---|---|---|---|
| Active | Current session state | Active checklist, in-progress tasks, live status | Highest |
| Short-term | Recent session extracts | Facts, preferences, decisions, todos, constraints from recent sessions | High |
| Mid-term | Goals, milestones, learning indexes | Current goals, risks, conversation-to-tool mappings | Medium |
| Long-term | Accumulated knowledge | Mission, principles, constraints, user contract, learned patterns | High |
| Specialist | Domain expertise | Per-category JSONL: findings, playbooks, experiments, notes | High |
| Contract | User-agent agreements | Behavioral terms, explicit user preferences, operational boundaries | Highest |
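The tier routing described above can be sketched in TypeScript. This is an illustrative sketch only: the names `MemoryTier`, `TIER_ROUTES`, and `routeQuery` are assumptions, not the real ClawMagic API.

```typescript
// Hypothetical sketch of the six memory tiers and a query router.
type MemoryTier =
  | "active"
  | "short-term"
  | "mid-term"
  | "long-term"
  | "specialist"
  | "contract";

// Which tiers a task class consults; the agent queries only these,
// keeping retrieved context (and token usage) small.
// Task-class names here are invented for illustration.
const TIER_ROUTES: Record<string, MemoryTier[]> = {
  "resume-session": ["active", "contract"],
  "plan-goal": ["mid-term", "long-term", "contract"],
  "domain-question": ["specialist", "long-term"],
};

function routeQuery(taskClass: string): MemoryTier[] {
  // Fall back to the two highest-priority tiers for unrecognized tasks.
  return TIER_ROUTES[taskClass] ?? ["active", "contract"];
}
```

The point of the sketch is the shape of the decision: retrieval is scoped per task, never "query everything."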
## BM25 Search
ClawMagic uses BM25 (Best Matching 25) for memory retrieval, the same ranking function used by Lucene-based search engines such as Elasticsearch. It runs locally with zero API calls.
| Feature | Basic Token Overlap | BM25 |
|---|---|---|
| Term frequency | No (binary match) | Yes (repeated terms weighted) |
| Inverse document frequency | No | Yes (rare terms matter more) |
| Document length normalization | No | Yes |
| Relevance quality | Basic | Industry standard |
| API cost | Zero | Zero |
BM25 is implemented as a pure TypeScript module with zero dependencies. Recall@10 and MRR (mean reciprocal rank) both reach 1.0 on internal benchmarks.
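For readers unfamiliar with the formula, the table's three features (term frequency saturation, inverse document frequency, and length normalization) can be seen in a minimal scorer. This is a generic BM25 sketch with the standard `k1` and `b` parameters, not the ClawMagic module's actual code.

```typescript
// Minimal BM25: scores each tokenized document against a tokenized query.
function bm25Scores(
  query: string[],
  docs: string[][],
  k1 = 1.2, // term-frequency saturation
  b = 0.75, // strength of document-length normalization
): number[] {
  const N = docs.length;
  const avgLen = docs.reduce((sum, d) => sum + d.length, 0) / N;

  // df: number of documents containing each query term.
  const df = new Map<string, number>();
  for (const term of query) {
    df.set(term, docs.filter((d) => d.includes(term)).length);
  }

  return docs.map((doc) => {
    let score = 0;
    for (const term of query) {
      const n = df.get(term)!;
      if (n === 0) continue;
      // IDF: rare terms matter more.
      const idf = Math.log(1 + (N - n + 0.5) / (n + 0.5));
      // TF with saturation (k1) and length normalization (b).
      const tf = doc.filter((t) => t === term).length;
      score +=
        (idf * (tf * (k1 + 1))) /
        (tf + k1 * (1 - b + b * (doc.length / avgLen)));
    }
    return score;
  });
}
```

A document that repeats a query term outscores one that mentions it once, and documents with no overlap score exactly zero, which is what distinguishes BM25 from binary token overlap.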
## Distill Pipeline
After each session, the distill pipeline extracts structured knowledge from the conversation and stores it in the appropriate memory tier. Five distill types enable filtered retrieval later.
- Fact — objective information learned during the session.
- Preference — user preferences and style choices.
- Decision — choices made and the reasoning behind them.
- Todo — action items identified for future sessions.
- Constraint — boundaries and rules that apply going forward.
Distilled points can be promoted from short-term to long-term memory based on repeated relevance. Stale entries decay naturally.
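The five distill types and the promotion rule suggest a record shape like the following. The field names (`session`, `hits`) and the promotion threshold are assumptions made for illustration; the doc only states that promotion is based on repeated relevance.

```typescript
// Hypothetical shape for a distilled point; field names are assumptions.
type DistillType = "fact" | "preference" | "decision" | "todo" | "constraint";

interface DistilledPoint {
  type: DistillType;   // enables filtered retrieval by type
  text: string;        // the extracted knowledge itself
  session: string;     // session the point was extracted from
  hits: number;        // how often retrieval found it relevant
}

// Promotion sketch: repeatedly relevant short-term points graduate
// to long-term memory; points that never accumulate hits decay.
function shouldPromote(p: DistilledPoint, threshold = 3): boolean {
  return p.hits >= threshold;
}
```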
## Thought Chains
Thought chains are procedural memory. When the agent solves a task, it can save the approach as a chain. Next time a similar task appears, the chain fires first, skipping the planning phase entirely.
- Chains include metadata: name, tags, description for workflow matching.
- Auto-reuse means repeat tasks execute faster with fewer tokens.
- Chain-first routing skips planning for recognized patterns.
- Users can inspect, edit, and delete chains from the control panel.
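A chain record and its matching step might look like the sketch below. The interface mirrors the metadata listed above (name, tags, description); the tag-overlap matcher is a deliberately simple stand-in, since the doc does not specify the matching algorithm (BM25 over the metadata would be a natural fit).

```typescript
// Illustrative thought-chain record; `steps` is an assumed field.
interface ThoughtChain {
  name: string;
  tags: string[];
  description: string;
  steps: string[]; // saved procedure, replayed when the chain fires
}

// Chain-first routing sketch: pick the chain whose tags best overlap
// the task; if one matches, the planning phase is skipped.
function matchChain(taskTags: string[], chains: ThoughtChain[]): ThoughtChain | null {
  let best: ThoughtChain | null = null;
  let bestOverlap = 0;
  for (const chain of chains) {
    const overlap = chain.tags.filter((t) => taskTags.includes(t)).length;
    if (overlap > bestOverlap) {
      bestOverlap = overlap;
      best = chain;
    }
  }
  return best; // null means no recognized pattern: fall back to planning
}
```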
## Storage and Configuration
Memory storage follows the same zero-dependency philosophy as the rest of ClawMagic. Defaults work everywhere. Upgrades are opt-in.
- Default (JSONL + BM25): flat files on disk, zero external services, Git-backup friendly. Works on any machine with Node.js.
- SQLite toggle: enables FTS5 for BM25 at larger scale. Faster disk I/O for big document counts. Independent of embeddings.
- Embeddings toggle: adds cosine similarity search via embedding API calls. Works with or without SQLite. Opt-in only.
All toggles are configurable from the control panel. API contracts remain the same regardless of which backends are active. The LLM never generates SQL, even when SQLite is enabled.
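The independence of the two toggles can be made concrete with a config shape. The interface and field names here are assumptions; the doc only specifies the behaviors (JSONL default, SQLite/FTS5 toggle, embeddings toggle, each independent).

```typescript
// Hypothetical config shape for the storage toggles.
interface MemoryStorageConfig {
  backend: "jsonl" | "sqlite"; // SQLite enables FTS5-backed BM25 at scale
  embeddings: boolean;          // cosine-similarity search via embedding API
}

// Defaults work everywhere: flat JSONL files plus local BM25, zero services.
const defaults: MemoryStorageConfig = { backend: "jsonl", embeddings: false };

// The toggles are independent: embeddings work with either backend.
const upgraded: MemoryStorageConfig = { backend: "sqlite", embeddings: true };
```

Because the API contract is the same regardless of backend, switching `backend` or `embeddings` changes only how retrieval is executed, not what callers see.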
## Token Efficiency
The memory system is designed around a core principle: retrieve what is needed, not everything available.
- Default mode uses 1 API call per chat turn (LLM only). No embedding calls.
- Context budgets are bounded per turn. Overflow triggers a 4-stage recovery: compact, no_tool_result, manifest_only, graceful stop.
- Pull-based retrieval means the agent asks for context rather than receiving it all upfront.
- Tool schemas load as manifests first, expanding to full schema only when selected.
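The 4-stage overflow recovery can be sketched as an escalation ladder. The stage names come from the list above; the dispatch logic is an assumption.

```typescript
// Overflow recovery stages, in escalation order (names from the doc).
type RecoveryStage = "compact" | "no_tool_result" | "manifest_only" | "stop";

const STAGES: RecoveryStage[] = ["compact", "no_tool_result", "manifest_only", "stop"];

// Each successive overflow escalates to the next, more aggressive stage;
// "stop" (graceful stop) is terminal.
function nextStage(current: RecoveryStage | null): RecoveryStage {
  if (current === null) return STAGES[0];
  const i = STAGES.indexOf(current);
  return STAGES[Math.min(i + 1, STAGES.length - 1)];
}
```

Compaction is tried first because it is cheapest; dropping tool results and collapsing schemas to manifests progressively shed more context before the turn is abandoned.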