YantrikDB Hermes Plugin — Persistent Memory for Hermes Agents

Hermes is an open-source agent runtime. The yantrikdb-hermes-plugin gives Hermes agents persistent memory via YantrikDB — embedded mode by default, no separate server, no token-mint step, no cluster.

If you’re running a Hermes agent and want it to remember across conversations, this is the smallest possible integration: install the plugin, set three environment variables, restart the agent.

pip install yantrikdb-hermes-plugin

Pulls ~10 MB total (yantrikdb engine + plugin + bundled embedder). No torch, no transformers, no ONNX runtime.

Add to ~/.hermes/.env:

YANTRIKDB_MODE=embedded
YANTRIKDB_DB_PATH=~/.hermes/memory.db
YANTRIKDB_NAMESPACE=default

That’s it. Verify the agent picked it up:

hermes memory status
# → yantrikdb available ✓
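If the status check does not report the plugin, one thing worth ruling out is a missing or empty variable in the env file. The following is a minimal sketch of such a pre-flight check, not part of the plugin: the helper names `parse_env` and `missing_settings` are hypothetical, and the parser only assumes plain KEY=VALUE lines like the three shown above.

```python
from pathlib import Path

# The three variables named in the configuration step above.
REQUIRED = ("YANTRIKDB_MODE", "YANTRIKDB_DB_PATH", "YANTRIKDB_NAMESPACE")


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_settings(env: dict) -> list:
    """Return the required variable names that are absent or empty."""
    return [name for name in REQUIRED if not env.get(name)]


if __name__ == "__main__":
    env_file = Path.home() / ".hermes" / ".env"
    text = env_file.read_text() if env_file.exists() else ""
    print("missing:", missing_settings(parse_env(text)) or "none")
```

This only checks that the values are present; it does not validate them or confirm that Hermes has loaded the file.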

The plugin registers three tools the agent can call autonomously:

| Tool | What it does |
| --- | --- |
| yantrikdb_remember | Store a memory with importance + domain tags. ~0.08s first call (engine warmup), sub-ms after. |
| yantrikdb_recall | Semantic search across stored memories. Returns ranked results with why_retrieved explanations the agent can integrate. |
| yantrikdb_stats | Namespace stats: active memories, conflicts, decay state. Useful for the agent to introspect its own memory. |

Default ranking uses YantrikDB’s full scoring: similarity × importance × decay × graph proximity. The why_retrieved field tells the agent why a memory surfaced (e.g., ["semantically similar (0.62)", "important (decay=0.98)", "graph-connected via Alice"]) — this lets the agent give natural-language explanations of recall, rather than just dumping vector hits.
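To make the multiplicative ranking concrete, here is a toy sketch of how the four factors could combine and how why_retrieved-style strings might be assembled. It is illustrative only and not YantrikDB's implementation: the `Candidate` class, the 0.5 importance threshold, and the way each factor is normalised are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """Hypothetical container for one recall candidate."""
    text: str
    similarity: float        # semantic similarity to the query, assumed 0..1
    importance: float        # importance assigned when stored, assumed 0..1
    decay: float             # 1.0 = fresh, lower = more decayed (assumed)
    graph_proximity: float   # 1.0 = no graph adjustment (assumed)
    linked_entity: Optional[str] = None


def score(c: Candidate) -> float:
    # similarity x importance x decay x graph proximity, as described above.
    return c.similarity * c.importance * c.decay * c.graph_proximity


def why_retrieved(c: Candidate) -> List[str]:
    # Build human-readable reasons; the 0.5 threshold is an arbitrary assumption.
    reasons = [f"semantically similar ({c.similarity:.2f})"]
    if c.importance >= 0.5:
        reasons.append(f"important (decay={c.decay:.2f})")
    if c.linked_entity:
        reasons.append(f"graph-connected via {c.linked_entity}")
    return reasons


def rank(candidates: List[Candidate]) -> List[Candidate]:
    # Highest combined score first.
    return sorted(candidates, key=score, reverse=True)
```

Because the factors multiply, a memory with slightly lower similarity can still outrank a closer vector match when it is more important or less decayed, which is the practical difference from plain nearest-neighbour search.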

The plugin supports two backends:

| Mode | When to use | Latency | Setup cost |
| --- | --- | --- | --- |
| embedded (default) | Single agent, local data, no replication needed | Sub-ms recall | 3 env vars |
| http | Cluster deployment, replication, multi-agent shared memory | ~10-30 ms | YantrikDB server + token mint + URL |

For most Hermes agent setups (one agent, local memory), embedded is the right choice. For multi-agent systems or production with replication, switch to HTTP and run a YantrikDB server.
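The decision rule above is simple enough to write down. The sketch below encodes it as a small helper; `pick_mode` is a hypothetical name and not something the plugin provides, and it only returns the value to put in YANTRIKDB_MODE.

```python
def pick_mode(agent_count: int, needs_replication: bool) -> str:
    """Return the YANTRIKDB_MODE value suggested by the guidance above."""
    # Shared memory across agents or replication both require a server.
    if agent_count > 1 or needs_replication:
        return "http"
    # One agent with local data: the embedded default is enough.
    return "embedded"


print(pick_mode(agent_count=1, needs_replication=False))
print(pick_mode(agent_count=3, needs_replication=False))
```

HTTP mode also needs the server URL and a minted token, as noted in the table; those settings are not covered here.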

# Hermes agent run
python run_agent.py --base_url https://api.deepseek.com/v1 --model deepseek-chat

Without the plugin: every conversation starts blank.

With the plugin: the agent autonomously calls yantrikdb_remember on decisions, preferences, and facts during the conversation, and yantrikdb_recall at the start of subsequent turns. A real DeepSeek session was verified end-to-end on 2026-05-09 — 3 yantrikdb_remember calls + 1 yantrikdb_recall (correctly ranked) + 1 yantrikdb_stats, all sub-millisecond on the embedded backend.

The agent’s natural-language explanation of why it recalled what it recalled uses YantrikDB’s why_retrieved annotations directly — no extra prompting needed.
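For a sense of what those annotations give the agent to work with, the sketch below turns a recall result into a sentence by hand. The agent does this itself without extra code; the example only illustrates the shape of the data. The `why_retrieved` key comes from the description above, while the `text` key and the dict layout are assumptions.

```python
def explain(memory: dict) -> str:
    """Render one recalled memory and its reasons as a plain sentence."""
    reasons = memory.get("why_retrieved", [])
    if not reasons:
        return f'I recalled "{memory["text"]}".'
    if len(reasons) == 1:
        joined = reasons[0]
    else:
        # Join all but the last reason with commas, then add "and".
        joined = ", ".join(reasons[:-1]) + " and " + reasons[-1]
    return f'I recalled "{memory["text"]}" because it is {joined}.'


hit = {
    "text": "Alice prefers email over chat",  # hypothetical stored memory
    "why_retrieved": [
        "semantically similar (0.62)",
        "important (decay=0.98)",
        "graph-connected via Alice",
    ],
}
print(explain(hit))
```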

The plugin ships YantrikDB’s core memory primitives (record / recall / stats). It does not ship:

  • Skill management (/v1/skills/* endpoints) — skills are server-side in YantrikDB, and Hermes has its own filesystem-based skill catalog that stays canonical. See the Skill as Memory paper for why this separation is deliberate.
  • Schema validation for agent-written content — embedded mode has no admission control. Agent-written records are accepted as-is.
  • Raft replication — single-machine embedded backend by definition.
  • Knowledge graph operations (relate / entity_profile): these are available via the engine API, but the plugin exposes only the three core tools by default.

For these, run the YantrikDB server and set YANTRIKDB_MODE=http.