Skills & Memory

Skills

Skills are knowledge-only Markdown packages. They use three-stage progressive disclosure to minimize context window consumption:

Stage	What the agent sees	Token cost
1. Advertise	Compact index in system prompt: name + description	~100 tokens per skill
2. Load	Full `SKILL.md` via `load_skill`	Varies
3. Read	Specific resource files via `load_skill_resource`	On demand

Skills live at `skills_dir/{name}/SKILL.md` with optional `resources/` directories. Metadata is synced to a `skills` table in the database.
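The three stages can be sketched against that directory layout. This is a minimal illustration, not the platform's implementation: `load_skill` and `load_skill_resource` are named in the table above, but their signatures here are assumptions, and `build_skill_index` and `first_description_line` (including the rule for where a description comes from) are hypothetical.

```python
from pathlib import Path


def first_description_line(skill_md_text: str) -> str:
    """Hypothetical rule: the first non-empty, non-heading line is the description."""
    for line in skill_md_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            return stripped
    return ""


def build_skill_index(skills_dir: Path) -> str:
    """Stage 1 (advertise): one compact line per skill for the system prompt."""
    entries = []
    for skill_md in sorted(skills_dir.glob("*/SKILL.md")):
        text = skill_md.read_text(encoding="utf-8")
        entries.append(f"- {skill_md.parent.name}: {first_description_line(text)}")
    return "\n".join(entries)


def load_skill(skills_dir: Path, name: str) -> str:
    """Stage 2 (load): return the full SKILL.md for one skill."""
    return (skills_dir / name / "SKILL.md").read_text(encoding="utf-8")


def load_skill_resource(skills_dir: Path, name: str, resource: str) -> str:
    """Stage 3 (read): return one file from the skill's resources/ directory."""
    base = (skills_dir / name / "resources").resolve()
    target = (base / resource).resolve()
    # Refuse paths that resolve outside resources/ (e.g. "../SKILL.md").
    if base not in target.parents:
        raise ValueError(f"{resource!r} is outside {base}")
    return target.read_text(encoding="utf-8")
```

Only the index from `build_skill_index` is paid for on every turn; the other two calls cost tokens only when the agent decides it needs the content.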

Skill Creation and Modification

Agents can create or modify skills via `submit_skill`, which goes through a skill-specific review pipeline: taint check, hash check, structure validation, injection scan, and an AI reviewer.

Tainted conversations cannot modify skills without human review. This prevents a poisoning chain: web content → agent → skill modification → persistent knowledge base corruption.
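The ordering of those checks can be sketched as a short-circuiting function. Everything below is illustrative: `SkillSubmission`, `review_skill`, the result strings, and the regex patterns are hypothetical, and the hash check is assumed to be a stale-edit guard, which the text above does not confirm.

```python
import hashlib
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class SkillSubmission:
    name: str
    content: str     # proposed SKILL.md text
    base_hash: str   # SHA-256 of the version the agent started from (assumed meaning)
    tainted: bool    # True if the conversation touched untrusted content


# Illustrative patterns only; a real injection scan would be far broader.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
    re.compile(r"disregard (the )?system prompt", re.IGNORECASE),
]


def sha256(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def review_skill(
    sub: SkillSubmission,
    current_content: str,
    ai_reviewer: Callable[[str], bool],
) -> str:
    """Run the checks in order and stop at the first one that fails."""
    if sub.tainted:
        return "needs_human_review"       # taint check: never auto-approve
    if sha256(current_content) != sub.base_hash:
        return "rejected:hash"            # skill changed since the agent read it
    if not sub.content.lstrip().startswith("#"):
        return "rejected:structure"       # toy rule: must open with a heading
    if any(p.search(sub.content) for p in INJECTION_PATTERNS):
        return "rejected:injection"
    return "approved" if ai_reviewer(sub.content) else "rejected:ai_reviewer"
```

Putting the taint check first means a tainted submission is routed to a human before any automated check can approve it, which is what breaks the poisoning chain described above.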

The platform ships with built-in skills (e.g., `failure-patterns` for common failure modes, `iterative-planning` for the Ralph Loop pattern). Built-in skills are synced at startup without overwriting user modifications, so they serve as defaults that evolve through experience.
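One way to sync defaults without clobbering local edits is to remember the hash of the last shipped version that was installed. The source does not say how the platform detects user modifications, so the approach, the `sync_builtin` function, and its parameters are all assumptions made for illustration.

```python
import hashlib
from typing import Optional, Tuple


def digest(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def sync_builtin(
    installed: Optional[str],
    last_synced_hash: Optional[str],
    shipped: str,
) -> Tuple[str, Optional[str]]:
    """Return (content to keep on disk, hash to record as last synced)."""
    if installed is None:
        # First startup: install the shipped default.
        return shipped, digest(shipped)
    if digest(installed) == last_synced_hash:
        # Untouched since the last sync: safe to take the newer default.
        return shipped, digest(shipped)
    # The on-disk copy diverged from what was shipped: keep the user's version.
    return installed, last_synced_hash
```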

Memory Architecture

Conversation Boundaries

In single-stream messaging (Signal, Telegram), a 6-hour silence gap triggers a new conversation. At the boundary, a background thread generates a structured summary covering:

- Topics discussed
- Key decisions made
- User preferences noted
- Pending items

New conversations receive the prior summary rather than raw message history, preserving continuity without consuming context on stale details.
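The boundary rule itself is simple to sketch: walk the message timestamps in order and start a new conversation whenever the silence reaches six hours. `split_conversations` and `ConversationSummary` are hypothetical names, and the summary fields only mirror the four items listed above; the actual summary schema is not given in the source.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List

SILENCE_GAP = timedelta(hours=6)


@dataclass
class ConversationSummary:
    """Shape of the boundary summary (field names are assumptions)."""
    topics: List[str] = field(default_factory=list)
    decisions: List[str] = field(default_factory=list)
    preferences: List[str] = field(default_factory=list)
    pending: List[str] = field(default_factory=list)


def split_conversations(timestamps: List[datetime]) -> List[List[datetime]]:
    """Group message timestamps; a gap of SILENCE_GAP or more starts a new group."""
    conversations: List[List[datetime]] = []
    for ts in sorted(timestamps):
        if conversations and ts - conversations[-1][-1] < SILENCE_GAP:
            conversations[-1].append(ts)
        else:
            conversations.append([ts])
    return conversations
```

In a live system the same comparison would run against the last message's timestamp when a new message arrives, with the summary generated in the background at that point.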

Cross-Conversation Recall

The knowledge base provides full-text search (FTS5 with BM25 ranking and Porter stemming) across:

- `conversations/`: past conversation summaries
- `reflections/`: self-reflection outputs

Recent conversation hints (last 3 titles and dates) are appended to the system prompt automatically.
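A search of this kind can be reproduced with SQLite's FTS5 extension from Python's standard library. The table name `kb`, its columns, and the helper functions are assumptions for illustration, not the platform's schema; the example also requires a Python build whose SQLite includes FTS5.

```python
import sqlite3
from typing import List


def open_kb() -> sqlite3.Connection:
    """Create an in-memory FTS5 index with Porter stemming."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE VIRTUAL TABLE kb USING fts5(path, body, tokenize='porter unicode61')"
    )
    return conn


def add_document(conn: sqlite3.Connection, path: str, body: str) -> None:
    conn.execute("INSERT INTO kb (path, body) VALUES (?, ?)", (path, body))


def search(conn: sqlite3.Connection, query: str) -> List[str]:
    """Return matching paths, best match first (bm25() scores lower for better matches)."""
    rows = conn.execute(
        "SELECT path FROM kb WHERE kb MATCH ? ORDER BY bm25(kb)", (query,)
    ).fetchall()
    return [row[0] for row in rows]
```

With stemming enabled, a query for `backup` also matches a summary that only contains `backups`, which is what makes recall across differently worded conversations practical.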

Reflective Self-Improvement

Cadenced self-reflection runs via cron, forming a compression chain:

Cadence	Input	Output
Daily	Raw activity data: conversations, arcs, tools, errors	Daily notes
Weekly	Daily reflections + 7-day statistics	Weekly patterns
Monthly	Weekly reflections + 30-day stats + skill catalog	Monthly insights

Each reflection creates a dedicated conversation, invokes the full chat agent with tool access, saves the output to the `reflections` table, and archives the conversation. Reflections are skipped when activity falls below a threshold, which avoids API calls on quiet days.
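The compression chain and the activity threshold can be sketched together. The `reflect` function, the `min_activity` default, and the `Summarizer` callback (standing in for the agent call) are hypothetical; the source does not state how activity is measured or what the threshold is.

```python
from typing import Callable, List, Optional

# Stand-in for the agent invocation: (cadence, inputs) -> reflection text.
Summarizer = Callable[[str, List[str]], str]


def reflect(
    cadence: str,
    inputs: List[str],
    activity_count: int,
    summarize: Summarizer,
    min_activity: int = 3,  # hypothetical threshold
) -> Optional[str]:
    """Run one reflection, or return None without calling the model on quiet periods."""
    if activity_count < min_activity:
        return None
    return summarize(cadence, inputs)


def weekly_inputs(daily_notes: List[Optional[str]]) -> List[str]:
    """Feed only the days that actually produced a reflection into the weekly run."""
    return [note for note in daily_notes if note is not None]
```

Each level consumes the level below it rather than raw activity, so the monthly run reads a handful of weekly reflections instead of a month of conversations.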

Reflection Auto-Action

Reflections produce `proposed_actions`, but observations alone don’t close the loop. After a reflection completes, an auto-action process examines proposed actions and submits workable changes through the existing review pipelines.

The review level is configurable: AI review, human review, or no review. If the reflection touched tainted data, a separate `tainted_review_mode` applies (default: human). Rate limits prevent runaway self-modification.
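The mode selection and the rate limit can be sketched as follows. Only `tainted_review_mode` and its default of human review come from the text above; `ReviewMode`, `effective_review_mode`, and `ActionRateLimiter` (a sliding-window limiter, chosen here for illustration) are hypothetical.

```python
from collections import deque
from enum import Enum
from typing import Deque


class ReviewMode(Enum):
    AI = "ai"
    HUMAN = "human"
    NONE = "none"


def effective_review_mode(
    review_mode: ReviewMode,
    touched_tainted_data: bool,
    tainted_review_mode: ReviewMode = ReviewMode.HUMAN,
) -> ReviewMode:
    """Tainted reflections use tainted_review_mode regardless of the normal setting."""
    return tainted_review_mode if touched_tainted_data else review_mode


class ActionRateLimiter:
    """Allow at most max_actions auto-actions per sliding window."""

    def __init__(self, max_actions: int, window_seconds: float) -> None:
        self.max_actions = max_actions
        self.window_seconds = window_seconds
        self._times: Deque[float] = deque()

    def allow(self, now: float) -> bool:
        # Drop timestamps that have aged out of the window.
        while self._times and now - self._times[0] >= self.window_seconds:
            self._times.popleft()
        if len(self._times) >= self.max_actions:
            return False
        self._times.append(now)
        return True
```

Keeping the tainted mode separate means an operator can set `ReviewMode.NONE` for routine self-improvement without also lowering the bar for reflections that read untrusted content.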
