Autonomous Skills on YantrikDB — Agent-Authored Procedures That Compound Across Sessions

Hermes Agent ships with the headline promise of being the agent that grows with you — its learning loop analyzes successful task completions, identifies reusable patterns, and writes Markdown skill files capturing the workflow as a reusable procedure. Filesystem skills are durable, version-controllable, human-editable. They’re a good fit for human-curated procedures.

They’re a worse fit for agent-authored procedures, which is what the learning loop actually produces. Filesystem skills are queryable only by filename and grep. The agent has no way to ask “which skill applies here?” without loading every file. There’s no outcome ledger. There’s no contradiction tracking when a skill gets superseded. There’s no way for two agents on different stacks to share a skill catalog without each writing files into the other’s directory tree.

That’s the gap yantrikdb-hermes-plugin fills. It exposes a DB-native skill substrate as a peer surface to Hermes’ filesystem skills — semantic search, outcome ledger, contradiction tracking, shared namespace across consumers, all behind three tools the agent calls explicitly when it judges a pattern worth crystallizing.

Watch it close the loop

LLM-driven skill lifecycle

gpt-4o-mini receives the plugin’s 11 tool schemas via OpenAI’s chat-completions API. The substrate is seeded with 6 production-shaped skills (procedures, references, lessons, rules across research / deployment / incident / workflow / review domains) before session 1 begins — so the agent contributes to a lived-in catalog, not a toy.

Session 1: agent gets an incident report (a staging service hangs after an ALLOWED_KINDS deploy). It searches the substrate, finds two relevant skills, composes a diagnosis across them, then records outcomes on both with explanatory notes. Session 2 — fresh provider, same substrate. A similar-shape incident arrives. The agent searches, finds those same skills now with outcome history attached, applies them, records two more outcomes, and — because the pattern has now recurred — calls skill_define itself to crystallize a new lesson (incident.ingest.allowed_kinds_deploy_race). Final substrate: 7 skills, 4 outcomes, one new agent-authored lesson. ~80 seconds end-to-end.

Source: demo_llm.py + transcript-llm.txt on GitHub. Everything below the LLM-call layer (tool dispatch via handle_tool_call, engine, substrate, response shapes) is the same code Hermes invokes when its agent loop encounters a yantrikdb tool.
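To make the "LLM-call layer" versus "tool dispatch" split concrete, here is a minimal sketch of what sits on each side. It assumes the standard tools format that OpenAI's chat-completions API accepts; the schema fields, the `dispatch_tool_call` helper, and the in-memory `LEDGER` are hypothetical stand-ins, not the plugin's actual `handle_tool_call` or its real schemas.

```python
import json

# Hypothetical schema for one skill tool, in the "tools" format accepted by
# OpenAI's chat-completions API. The plugin's real schema may differ.
SKILL_OUTCOME_TOOL = {
    "type": "function",
    "function": {
        "name": "yantrikdb_skill_outcome",
        "description": "Record whether a skill worked after using it.",
        "parameters": {
            "type": "object",
            "properties": {
                "skill_id": {"type": "string"},
                "succeeded": {"type": "boolean"},
                "note": {"type": "string"},
            },
            "required": ["skill_id", "succeeded"],
        },
    },
}

# Stand-in for the outcome ledger; the real one lives in the substrate.
LEDGER = []


def record_outcome(skill_id, succeeded, note=""):
    """Append one outcome entry and report how many are stored."""
    LEDGER.append({"skill_id": skill_id, "succeeded": succeeded, "note": note})
    return {"ok": True, "outcomes_recorded": len(LEDGER)}


# Hypothetical routing table from tool name to handler.
HANDLERS = {"yantrikdb_skill_outcome": record_outcome}


def dispatch_tool_call(name, arguments_json):
    """Decode the model's JSON argument string and route it to a handler."""
    handler = HANDLERS.get(name)
    if handler is None:
        return json.dumps({"ok": False, "error": f"unknown tool: {name}"})
    args = json.loads(arguments_json)
    return json.dumps(handler(**args))


# The model returns tool arguments as a JSON string, so that is what we pass.
reply = dispatch_tool_call(
    "yantrikdb_skill_outcome",
    '{"skill_id": "workflow.git.commit_clean", "succeeded": true}',
)
print(reply)
```

The point of the split is that the model only ever sees the schema and the JSON reply. Swapping gpt-4o-mini for another provider leaves everything from the dispatch function down untouched.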

The three tools

# Agent observes a useful pattern, distills it into a reusable skill.
yantrikdb_skill_define(
    skill_id="workflow.git.commit_clean",
    skill_type="procedure",   # procedure | reference | lesson | pattern | rule
    applies_to=["git", "release"],
    body="Before commit: run pytest, run lint, write a clear subject + body. "
         "Never include co-authored-by unless asked.",
)

# Next session — fresh agent searches before acting.
yantrikdb_skill_search(query="how to commit cleanly", top_k=5)
# → returns ranked skills with `why_retrieved` reasons

# After using a skill, record whether it worked. Append-only.
yantrikdb_skill_outcome(
    skill_id="workflow.git.commit_clean",
    succeeded=True,
    note="caught a flake8 issue pre-push",
)

Three tools, one substrate. The plugin handles the schema validation, embedding, semantic search, outcome ledger, and namespace scoping. The agent makes all the judgment calls about when to define and when to look up.
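For readers who want to see the define, search, outcome cycle end to end without installing anything, the toy below mimics it in memory. It is a sketch under loud assumptions: `ToySubstrate` and its method names are invented for illustration, word overlap stands in for the plugin's embedding-based semantic search, and the second skill id is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    skill_id: str
    skill_type: str
    applies_to: list
    body: str
    outcomes: list = field(default_factory=list)  # append-only ledger


class ToySubstrate:
    """In-memory stand-in. Word overlap replaces embedding-based search."""

    def __init__(self):
        self.skills = {}

    def define(self, skill_id, skill_type, applies_to, body):
        self.skills[skill_id] = Skill(skill_id, skill_type, applies_to, body)

    def search(self, query, top_k=5):
        words = set(query.lower().split())
        hits = []
        for skill in self.skills.values():
            text = " ".join([skill.skill_id, skill.body, *skill.applies_to])
            # Split ids like "workflow.git.commit_clean" into separate words.
            vocab = set(text.lower().replace(".", " ").replace("_", " ").split())
            shared = words & vocab
            if shared:
                hits.append({
                    "skill_id": skill.skill_id,
                    "score": len(shared),
                    "why_retrieved": "shared terms: " + ", ".join(sorted(shared)),
                })
        hits.sort(key=lambda h: h["score"], reverse=True)
        return hits[:top_k]

    def outcome(self, skill_id, succeeded, note=""):
        # Entries are only ever appended, never edited or removed.
        self.skills[skill_id].outcomes.append((succeeded, note))


db = ToySubstrate()
db.define("workflow.git.commit_clean", "procedure", ["git", "release"],
          "Before commit: run pytest, run lint, write a clear subject + body.")
# Hypothetical id for a second skill, so the search has something to rank.
db.define("research.protocol.preregister", "procedure", ["research"],
          "Pre-registration first; declare hypothesis before any data analysis.")

hits = db.search("how to commit cleanly")
db.outcome(hits[0]["skill_id"], True, "caught a flake8 issue pre-push")
print(hits[0]["skill_id"], "|", hits[0]["why_retrieved"])
```

Note what the toy gets wrong on purpose: "cleanly" does not match "clean" here, which is exactly the kind of miss that semantic search is meant to avoid.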

Real data: 17 agent-authored skills on one production substrate

These aren’t hypothetical. The skill substrate on one production deployment currently holds 17 skills, all authored by Claude across separate sessions, none written by a human. The agent observed patterns and chose to crystallize them.

Distribution by type:

skill_type	Count
`procedure`	11
`reference`	5
`lesson`	1

Sample of categories the agent chose to crystallize:

Workflow patterns — release sequence on yantrikos repos with branch protection, shipping a dashboard update
Incident lessons — extending ALLOWED_KINDS in both polling watcher AND ingest (the deploy-ordering trap), dashboard recent-events endpoint shows older lane data fine but brain-shape kinds delayed
Cross-session norms — Pranab is sole author, no co-authored-by tag unless asked, for user-visible product changes do not lead with marketing voice
Research protocols — pre-registration first; declare hypothesis before any data analysis
Communication patterns — one-word greenlight ("ok" / "go" / "lets go") means execute, not ask clarifying questions

Reuse is real but modest. Nine of the 17 skills have access_count > 0, meaning a separate session searched for and used them. The most-reused skill (Pranab is sole author + commit conventions) has been pulled 3 times by independent inference instances looking up commit norms before shipping. That’s the autonomy loop closing across sessions.

How `skill_define` fits the actual Hermes use cases

Hermes’ user-stories page describes seven documented use cases. The autonomous-skills surface maps directly onto five of them:

Hermes use case	What skill_define captures
Personal assistant (Telegram / WhatsApp / Discord)	“How I handle a new package-delivery notification”: observed once, distilled, reused next time
Multi-platform messaging	“How to interpret a message from `whatsapp:user-A` vs `telegram:user-A`”: when owner-scoping (v0.4.10+) collapses platforms to one identity, the skills travel with the canonical owner
Developer productivity	“How I review a Hermes plugin PR”: distilled from successful reviews, surfaced next time before the agent starts
Research & content curation	“How I survey a new topic over a week”: repeat the protocol that worked, not the one that didn’t
Self-improvement / skill auto-generation	This is the use case. Hermes’ built-in learning loop writes filesystem Markdown; the plugin points the same loop at a DB substrate so the skills are searchable, outcome-tracked, and shareable with other agents on the same substrate.

Why filesystem skills aren’t enough on their own

Hermes’ built-in Markdown skills and yantrikdb’s DB substrate aren’t competitors — they’re peer surfaces with different lifecycles. The right call depends on who’s authoring and how the skill needs to evolve.

Property	Filesystem skill (`$HERMES_HOME/skills/*.md`)	yantrikdb skill substrate
Author	Best fit for human-authored, durable, version-controlled, code-review-shaped	Best fit for agent-authored, runtime-evolving, outcome-tracked
Lookup	filename, grep, manual indexing	semantic search via `yantrikdb_skill_search`, ranked by relevance + recency + outcome
Outcome tracking	Implicit (you’d see if a procedure failed during a session, but no ledger)	First-class append-only outcome ledger via `yantrikdb_skill_outcome`
Contradiction	If two skills contradict, you’d notice during use	`conflicts()` surfaces the contradiction; the agent can `resolve_conflict()` explicitly
Sharing across stacks	One agent’s filesystem; you’d have to copy files into another	Single shared `skill_substrate` namespace; another consumer (the MCP server, Lane B SDK, etc.) reads and writes from the same store
Schema validation	Free-form Markdown	Schema-validated at write time — skill_id format, type enum, body length bounds, applies_to format
Provenance	Git history if you commit them	`metadata.source=hermes` (or `mcp`, etc.) — each entry tagged with the consumer that authored it
Version compatibility	You manage it	The plugin handles it; entries from older schemas migrate forward
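The Lookup row says results are ranked by relevance + recency + outcome but does not give the formula. Purely as an illustration of how three such signals could be blended, here is a sketch in which every number is an assumption: the weights, the 30-day half-life, the smoothing, and the placeholder skill ids are invented, and none of it is taken from yantrikdb.

```python
def blended_score(relevance, age_days, successes, failures,
                  w_rel=0.6, w_rec=0.2, w_out=0.2, half_life_days=30.0):
    """Blend three signals, each in [0, 1], into one ranking score."""
    # Recency halves every `half_life_days`.
    recency = 0.5 ** (age_days / half_life_days)
    # Smoothed success rate: a skill with no outcomes yet scores a neutral 0.5.
    outcome = (successes + 1) / (successes + failures + 2)
    return w_rel * relevance + w_rec * recency + w_out * outcome


candidates = [
    # (placeholder skill_id, relevance, age_days, successes, failures)
    ("skill.a", 0.80, 10, 3, 0),
    ("skill.b", 0.80, 10, 0, 3),
    ("skill.c", 0.50, 1, 0, 0),
]
ranked = sorted(candidates, key=lambda c: blended_score(*c[1:]), reverse=True)
print([c[0] for c in ranked])
```

With equal relevance and age, the skill with three recorded successes outranks the one with three failures. That is the practical payoff of keeping an outcome ledger next to the search index.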

Filesystem skills win when a procedure is curated, reviewed, and stable. DB-native skills win when the procedure is observed-then-crystallized in the moment, and you want a different agent in a different session to find it again.
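The schema-validation row in the table above lists four checks: skill_id format, type enum, body length bounds, and applies_to format. A write-time validator along those lines might look like the sketch below. The type enum comes from the `skill_define` example earlier; the regexes and length bounds are assumptions chosen to accept the ids shown in this article, not the plugin's actual rules.

```python
import re

# Enum taken from the skill_define example above.
ALLOWED_TYPES = {"procedure", "reference", "lesson", "pattern", "rule"}

# Assumed formats: dot-separated lowercase segments for ids, simple tags.
SKILL_ID_RE = re.compile(r"[a-z][a-z0-9_]*(\.[a-z][a-z0-9_]*)+")
TAG_RE = re.compile(r"[a-z][a-z0-9_-]*")

# Assumed bounds; the real limits are not stated in this article.
MIN_BODY, MAX_BODY = 10, 4000


def validate_skill(skill_id, skill_type, applies_to, body):
    """Return a list of validation errors; an empty list means valid."""
    errors = []
    if not SKILL_ID_RE.fullmatch(skill_id):
        errors.append("skill_id must be dot-separated lowercase segments")
    if skill_type not in ALLOWED_TYPES:
        errors.append(f"skill_type must be one of {sorted(ALLOWED_TYPES)}")
    if not MIN_BODY <= len(body) <= MAX_BODY:
        errors.append(f"body length must be between {MIN_BODY} and {MAX_BODY}")
    if not applies_to or not all(
        isinstance(tag, str) and TAG_RE.fullmatch(tag) for tag in applies_to
    ):
        errors.append("applies_to must be a non-empty list of lowercase tags")
    return errors


ok = validate_skill(
    "workflow.git.commit_clean", "procedure", ["git", "release"],
    "Before commit: run pytest, run lint, write a clear subject + body.",
)
bad = validate_skill("Bad ID", "howto", [], "x")
print(ok, len(bad))
```

Rejecting malformed entries at write time matters more for agent-authored skills than for human-authored ones, since no reviewer sits between the agent and the catalog.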

Install

hermes plugins install yantrikos/yantrikdb-hermes-plugin --enable
pip install yantrikdb               # same Python env as Hermes
hermes memory setup                 # select "yantrikdb"
hermes gateway restart

Skills are enabled by setting YANTRIKDB_SKILLS_ENABLED=true in your Hermes .env. The substrate writes to the shared skill_substrate namespace; if you also run the yantrikdb-mcp server on the same backing store, that server’s tools and this plugin’s tools read and write the same catalog.

Where to look next

Hermes Plugin guide — full setup, owner-scoping for multi-platform, embedded vs HTTP backend
Hermes Dashboard guide — visual operator console for browsing the substrate (memory + skills + identity scope)
Skill-as-Memory research paper — the substrate thesis behind why agent-authored procedures want a DB surface, not a filesystem one
yantrikdb-hermes-plugin on GitHub — issues, PRs, CHANGELOG

The 17 skills above are evidence that the autonomy loop closes in practice: the agent chose which skills to crystallize, looked them up in subsequent sessions, and accumulated outcomes against them. The substrate kept score; the agent made the calls.