Skip to content

Multi-Agent Ops: Which Agent Memory Is Stale?

Five agents watched the same incident. The database figured out which memory was stale.

This is the multi-agent AI showcase — the one that matters most to anyone building agent fleets today.

A normal agent stack collapses disagreement into one averaged answer. YantrikDB preserves every claim with its source, validity window, and confidence band — then tells the coordinator exactly which beliefs are fresh, which are stale, and which contradict each other.


Black Friday 2026, 15:20 UTC. The on-call engineer at a large e-commerce platform opens the incident coordinator and asks:

“Is the checkout rollout active? Are customers impacted?”

Five sub-agents have been watching five different sources. They all believe they’re telling the truth. Two of them are working from stale data.

AgentSourceState at 15:20
agent.deployCI/CD pipeline✅ Fresh — rollout completed 15:10
agent.telemetryPrometheus metrics✅ Fresh — error spike at 15:13
agent.supportCustomer support inbox✅ Fresh — 3 tickets since 15:12
agent.configFeature-flag API snapshotStale — last polled at 14:50
agent.statusPublic status page scrapeStale — last updated 14:40

Each agent writes structured claims with extractor, valid_from, valid_to, and confidence_band. None of them gets to flatten the truth.


[1] POLARITY_CONTRADICTION
checkout_rollout --is_active--> true
(agent.deploy) CLAIMS YES [15:10 - now] conf=high
(agent.config) CLAIMS NO [14:50 - 15:10] conf=high
[2] POLARITY_CONTRADICTION
checkout_service --is_healthy--> true
(agent.status) CLAIMS YES [14:40 - 15:13] conf=medium
(agent.telemetry) CLAIMS NO [15:13 - now] conf=high

Both contradictions are real — at the moment each claim was written, each agent was correct. The validity windows are what disambiguate.

“What did we believe at 15:20?”

checkout_rollout YES [agent.deploy, 15:10–now, high]
checkout_service NO [agent.telemetry, 15:13–now, high]
customer_impact YES [agent.support, 15:12–now, medium]

“What would we have believed at 14:55?”

checkout_rollout NO [agent.config, 14:50–15:10, high]
checkout_service YES [agent.status, 14:40–15:13, medium]

Same database. Same claims. Different moment in time = different truth. The on-call engineer can query the fleet’s belief state at any point in history, because validity windows are first-class.

Current facts (chosen by source-freshness + confidence):
checkout_rollout --is_active--> true = YES [authority: agent.deploy]
checkout_service --is_healthy--> true = NO [authority: agent.telemetry]
Agents with STALE beliefs (excluded from verdict):
- agent.config
- agent.status
Recommended action for the on-call engineer:
* checkout_v8 rollout IS live (deploy completed at 15:10)
* checkout service IS degraded (telemetry confirms error spike)
* customer impact IS real (support tickets arriving)
* ROLL BACK checkout_v8 via feature flag

Why couldn’t Postgres + embeddings + a dashboard do this?

Section titled “Why couldn’t Postgres + embeddings + a dashboard do this?”

A vector database returns all 5 agent observations as “similar” and lets the LLM guess which to trust. A SQL database stores each agent’s state as flat rows — but the moment agent.config is updated, the old value is overwritten and the 14:50 snapshot is gone. A graph database can model relationships but has no notion of polarity or validity.

None of them can store agent.deploy’s YES and agent.config’s NO on the same (checkout_rollout, is_active, true) tuple as two coexisting rows with opposite polarity and non-overlapping validity windows, automatically flag them as a contradiction, and let you query “what did we believe at 14:55 vs 15:20?” — all in one round-trip.

That’s what YantrikDB does. That’s the category.


Every agent fleet today has this problem:

  • RAG/retrieval agents — different retrievers surface different documents; current stacks average them, losing the disagreement
  • Tool-using agents — the API says one thing, the cached result says another; current stacks pick a random winner
  • Monitoring/ops agents — five dashboards, five opinions; current stacks rely on the human to reconcile
  • Multi-modal agents — vision says X, audio says Y, text says Z; current stacks force early commitment

With YantrikDB, every agent’s observation becomes a claim with provenance, validity, and polarity. Contradiction detection fires automatically. The coordinator asks not “what’s the answer?” but “what beliefs are live right now, and which sources have gone stale?”

This is not agent orchestration. This is belief management under contradiction — and no other memory system ships it.


Terminal window
git clone https://github.com/yantrikos/yantrikdb-server
python yantrikdb-server/docs/showcase/multi_agent_ops_engine.py \
ydb_your_token \
http://your-cluster:7438

Requires yantrikdb-server v0.7.2+ and yantrikdb v0.6.1+.

Full script: multi_agent_ops_engine.py


Your agents disagreed. The database knew which memory was stale.