Cluster & Replication
YantrikDB Server has production-style clustering (alpha — running live on homelab clusters, not yet battle-tested at scale) with:
- CRDT-based replication — converges automatically, no merge conflicts
- Raft-lite leader election — automatic failover in <10 seconds
- Witness daemon — run safe HA with only 2 data nodes
- Read-only enforcement — followers reject writes, point clients to the leader
- Multi-database — each database replicates independently
- Cluster master token — one token works on every node
Topology
Recommended setup: 2 voters + 1 witness.
```
┌──────────────────┐    heartbeats   ┌──────────────────┐
│   data node 1    │ ◄───────────▶   │   data node 2    │
│     (voter)      │    oplog sync   │     (voter)      │
│   full storage   │                 │   full storage   │
└────────┬─────────┘                 └────────┬─────────┘
         │                                    │
         │       ┌──────────────────┐         │
         └──────▶│     witness      │◄────────┘
                 │   (vote-only)    │
                 │    ~10 MB RAM    │
                 └──────────────────┘
```

The witness is a tiny daemon (~3 MB binary, no disk storage) whose only job is to vote in elections. It breaks ties so 2 data nodes can run safe HA without needing a 3rd full node.
This is the same pattern as Azure SQL (witness instance), MongoDB (arbiter), Redis Sentinel, and MariaDB Galera (garbd).
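The arithmetic behind the witness pattern is the standard majority rule. A quick sketch (generic quorum math, not YantrikDB code):

```python
def majority(members: int) -> int:
    """Votes needed to win an election among `members` voting nodes."""
    return members // 2 + 1

# With only 2 voters, a lone survivor can never reach the 2 votes it needs.
assert majority(2) == 2
# Adding the witness makes 3 voting members; the majority is still 2, so one
# surviving voter plus the witness can elect a leader.
assert majority(3) == 2
```

The witness adds a vote without adding storage, which is why it only needs ~10 MB of RAM.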
Step-by-step setup
1. On node1, generate config
```sh
yantrikdb cluster init \
  --node-id 1 \
  --output /etc/yantrikdb.toml \
  --data-dir /var/lib/yantrikdb \
  --peers 192.168.1.2:7440 \
  --witnesses 192.168.1.3:7440
```

Output:
```
config written to /etc/yantrikdb.toml
cluster_secret: ydb_cluster_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
(use this as the auth token from any client to access the default database)
```

Save the cluster_secret. You’ll need it on every other node and as the auth token from clients.
2. On node2, generate config with the same secret
```sh
yantrikdb cluster init \
  --node-id 2 \
  --output /etc/yantrikdb.toml \
  --data-dir /var/lib/yantrikdb \
  --peers 192.168.1.1:7440 \
  --witnesses 192.168.1.3:7440 \
  --secret <PASTE_SECRET_FROM_NODE1>
```

3. Create the database on each voter
```sh
yantrikdb db --data-dir /var/lib/yantrikdb create default
```

4. On node3, start the witness
```sh
yantrikdb-witness \
  --node-id 99 \
  --port 7440 \
  --cluster-secret <PASTE_SECRET_FROM_NODE1> \
  --state-file /var/lib/yantrikdb-witness/state.json
```

The witness needs no database, no config file, no embedding model — just the secret and a state file.
5. Start the voters
Section titled “5. Start the voters”On node1 and node2:
```sh
yantrikdb serve --config /etc/yantrikdb.toml
```

After ~5 seconds, an election runs and one voter becomes leader.
6. Verify
Section titled “6. Verify”yql --host 192.168.1.1 -t <cluster_secret>yantrikdb> \cluster node #1 — Leader term: 1 leader: 1 healthy: yes | writable: yes quorum: 2
+---------+-------------------+---------+-----------+------+----------+| node_id | addr | role | reachable | term | last_seen|+---------+-------------------+---------+-----------+------+----------+| 2 | 192.168.1.2:7440 | voter | ✓ | 1 | 0.5s ago || 99 | 192.168.1.3:7440 | witness | ✓ | 1 | 0.5s ago |+---------+-------------------+---------+-----------+------+----------+Test failover
Kill the leader (Ctrl+C or systemctl stop yantrikdb).
Within 5–10 seconds:
- The other voter detects missed heartbeats
- Runs an election
- The witness grants its vote
- The follower promotes itself to leader
```sh
curl -s http://192.168.1.2:7438/v1/cluster | jq .role
# "Leader"
```

When the old leader rejoins, it sees the higher term and demotes itself to follower automatically.
Failure modes
| Failure | Behavior |
|---|---|
| Leader voter dies | Other voter + witness elect new leader in <10s |
| Follower voter dies | Leader keeps writing (still has quorum with witness) |
| Witness dies | Both voters keep going, no new elections allowed |
| Witness + follower die | Leader becomes read-only (no quorum) |
| Network partition isolates a voter | Isolated voter loses quorum, becomes read-only |
| All nodes die | Restart any node — it loads persistent state, rejoins cluster |
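Every row of the table follows from one rule: a node may lead, and accept writes, only while it can reach a majority of all voting members, itself included. A plain-Python sketch of that predicate, not YantrikDB code:

```python
def writable(reachable_voters: int, reachable_witnesses: int,
             total_members: int = 3) -> bool:
    """True if this node's reachable votes form a majority of the cluster."""
    return reachable_voters + reachable_witnesses >= total_members // 2 + 1

# The leader's view in each scenario (counts include the leader itself):
assert writable(2, 1)        # healthy: both voters + witness
assert writable(1, 1)        # follower voter dies: leader + witness still quorate
assert writable(2, 0)        # witness dies: two voters still quorate
assert not writable(1, 0)    # witness + follower die: leader alone, read-only
```

The same predicate explains the partition row: an isolated voter reaches only itself, falls below quorum, and drops to read-only until the partition heals.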
Manual failover
To force a specific node to become leader (e.g. for maintenance):
```sh
yantrikdb cluster promote --url http://192.168.1.2:7438 -t <cluster_secret>
```

This triggers an election from that node.
Cluster authentication
When clustering is enabled, the cluster_secret doubles as a master Bearer token that works on any node in the cluster:
```sh
TOKEN=ydb_cluster_xxxxxxxx...

# This works whether node1 or node2 is leader
curl http://192.168.1.1:7438/v1/stats -H "Authorization: Bearer $TOKEN"
curl http://192.168.1.2:7438/v1/stats -H "Authorization: Bearer $TOKEN"
```

Per-node tokens (created with yantrikdb token create) still work for fine-grained access.
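From application code, the same pattern is trying each node in turn with the secret as a Bearer token. A sketch: only the /v1/stats endpoint and the Authorization header come from the curl example above; the helper names and retry loop are illustrative.

```python
import json
import urllib.request

NODES = ["http://192.168.1.1:7438", "http://192.168.1.2:7438"]

def auth_header(token: str) -> dict:
    """The same header curl sends with -H 'Authorization: Bearer ...'."""
    return {"Authorization": f"Bearer {token}"}

def get_stats(token: str, nodes=NODES) -> dict:
    last_err = None
    for base in nodes:  # the master token is valid on every node
        req = urllib.request.Request(f"{base}/v1/stats",
                                     headers=auth_header(token))
        try:
            with urllib.request.urlopen(req, timeout=2.0) as resp:
                return json.load(resp)
        except OSError as err:
            last_err = err  # node unreachable; fall through to the next one
    raise RuntimeError(f"no node reachable: {last_err}")
```

Because the token is cluster-wide, the client needs no special handling when leadership moves between nodes.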
Configuration reference
Full [cluster] section:
```toml
[cluster]
node_id = 1                    # unique integer per node
role = "voter"                 # voter | read_replica | witness | single
cluster_port = 7440            # peer-to-peer port
heartbeat_interval_ms = 1000   # leader → follower heartbeat rate
election_timeout_ms = 5000     # follower → candidate transition delay
cluster_secret = "ydb_cluster_..."
replication_mode = "async"     # async (default) or sync

[[cluster.peers]]
addr = "192.168.1.2:7440"
role = "voter"

[[cluster.peers]]
addr = "192.168.1.3:7440"
role = "witness"
```

How replication works
Under the hood, every write is recorded as an oplog entry with a hybrid logical clock (HLC) timestamp. Followers continuously pull new ops from the leader and apply them locally via the same CRDT semantics that the engine already uses.
- Add-wins set for memories (UUIDv7 keys, no collisions)
- LWW for graph edges (HLC tiebreaker)
- Set-union for consolidation
- Forget always wins (tombstones are absolute)
This means the cluster converges naturally even after network partitions — there’s no manual conflict resolution needed.
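As a toy model of those merge rules: HLC as an ordered triple, a sentinel for tombstones. The names and data shapes here are illustrative, not YantrikDB's actual storage format.

```python
from dataclasses import dataclass

@dataclass(frozen=True, order=True)
class HLC:
    """Hybrid logical clock: compares field by field, giving a total order."""
    physical_ms: int
    logical: int
    node_id: int  # final tiebreaker, so two nodes never produce equal stamps

def merge_edge(local, remote):
    """LWW register for a graph edge: of two (hlc, value) pairs, higher HLC wins."""
    return local if local[0] >= remote[0] else remote

TOMBSTONE = "__forgotten__"  # stand-in marker for a forgotten memory

def merge_memories(local: dict, remote: dict) -> dict:
    """Add-wins set keyed by UUIDv7, except that forget (tombstone) is absolute."""
    merged = dict(local)
    for key, value in remote.items():
        if value == TOMBSTONE or merged.get(key) == TOMBSTONE:
            merged[key] = TOMBSTONE        # forget always wins
        else:
            merged.setdefault(key, value)  # add-wins: keep the union
    return merged
```

Because every node applies the same deterministic rules, replicas that have seen the same set of ops converge to identical state regardless of arrival order.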
For a deeper dive, see the Raft-lite design.