Language-agnostic contract for the Hyperswarm mediator — the service that carries Archon DID operations between nodes over the public hyperswarm P2P network.
The canonical implementation is services/mediators/hyperswarm/. Any implementation that honors this spec is drop-in compatible: other nodes on the hyperswarm topic will exchange operations with it without modification.
Related specs. Reads from / writes to the Gatekeeper via HTTP, resolves its own node DID through the Keymaster, and advertises an IPFS peer identity via Kubo. Read those specs first.
The hyperswarm mediator is a stateless relay. On startup it:
ARCHON_NODE_ID (must already exist in the
associated Keymaster).ARCHON_PROTOCOL.didDocumentData.node), so other nodes can peer with it directly.hyperswarm
registry queue to all connected peers, then clears the queue.ping and walks known
peer DIDs to maintain IPFS peering.sync message.batch, queue, sync, ping) from peers
and feeds them into the import queue → Gatekeeper importBatch /
processEvents.It is not responsible for:
The mediator carries no private key material. Its only trust surface is the admin API key it uses to call Gatekeeper and Keymaster on the local node.
topic = sha256(ARCHON_PROTOCOL) // 32 bytes, used as the hyperswarm discovery key
Default ARCHON_PROTOCOL is /ARCHON/v0.8-beta. All nodes that want to peer
MUST share the same protocol string.
Each mediator joins the topic as both client and server ({client:
true, server: true}) so peers can dial either direction.
Three layers of identity are in play:
| Layer | What | Lives where |
|---|---|---|
| Transport | hyperswarm peer keypair (ed25519) | generated per process, printed as shortName(key) in logs |
| Application | ARCHON_NODE_NAME free-form label |
exchanged in every message’s node field |
| Protocol | node DID (ARCHON_NODE_ID) |
resolved via Keymaster, published in ping.peers[] |
The DID is the stable identity across restarts. The hyperswarm keypair is regenerated every process lifetime (the connection is ephemeral; identity comes from the DID document contents).
On startup, the mediator:
ipfs peering rm --all).keymaster.mergeData(nodeDID, { node: { name, ipfs: { id, addresses } } }).ping messages with peers[], resolves each peer DID,
extracts didDocumentData.node.ipfs, and calls ipfs peering add.The goal is convergent IPFS peering across the hyperswarm: every node discovers every other node’s IPFS endpoint and pins peering with it.
Every message is a single UTF-8 JSON object per hyperswarm conn.write
call. There is no length framing — the receiver uses JSON.parse on the
entire data event buffer. Messages MUST fit within one write-buffer
boundary (8 MB in the reference implementation). Senders that build
larger batches split them recursively before sending.
Note. The reference implementation computes the limit as
8 * 1024 * 1014(note1014, not1024), so the actual cap is 8,118,272 bytes (~7.74 MB). This is almost certainly a typo, but documents the current behavior.
Every message shares:
{
"type": "batch" | "queue" | "sync" | "ping",
"time": "<RFC 3339>", // producer clock; informational
"node": "<free-form node name>", // e.g. "gondor" — matches ARCHON_NODE_NAME
"relays": ["<hexkey>", ...] // peer keys this message has already passed through
}
The relays field prevents infinite forwarding loops. A relayer appends
its own peer key before forwarding.
batch messageCarries an array of DID Operations pulled from the sender’s Gatekeeper
DID chains.
{
"type": "batch",
"time": "...",
"node": "...",
"relays": [],
"data": [Operation, ...] // bare operations, not GatekeeperEvents
}
On receipt: the receiver wraps each operation in a GatekeeperEvent
(registry "hyperswarm", ordinal [producerTime, index], time = now)
and submits via gatekeeper.importBatch(events). See
§4.2.
queue messageSame payload as batch, but the sender is relaying operations from
its local outbound queue (the Gatekeeper’s hyperswarm registry
queue) rather than from its DB. The receiver MUST:
importBatch the payload (as with batch).relays.relays.sync messageZero-payload request. When a peer sends sync, the receiver responds by
streaming batch messages that together cover its entire Gatekeeper DB:
for did_batch in chunks(getDIDs(), 1000):
events = gatekeeper.exportBatch(did_batch) // non-local events only
operations = events.map(e => e.operation)
send_batch(conn, operations) // splits if > 8 MB
Used at connection time to catch up a new peer with historical state.
Senders SHOULD wait until their own import queue is idle before emitting
sync to avoid echoing in-flight batches back.
ping messageLiveness + peer announcement.
{
"type": "ping",
"time": "...",
"node": "...",
"relays": [],
"peers": ["<DID>", ...] // DIDs of peers the sender knows
}
The receiver updates connectionInfo[peerKey].nodeName and walks
peers[] to add new IPFS peerings.
Sender: 8 MB per single conn.write. If a batch exceeds the limit,
recursively split in half and send each half. If a single operation
exceeds the limit, log an error and drop it.
Receiver: no explicit size cap, but garbage messages fail JSON parse and are logged + counted.
A sender’s retransmits (e.g. via queue → relay) may reach the same
receiver multiple times. The receiver MUST hash each incoming batch
(cipher.hashJSON(operations)) and drop duplicates:
batchesSeen[hash(data)] = true // in-memory, ephemeral
The mediator_duplicate_batches_total counter tracks these drops.
Interval: ARCHON_HYPR_EXPORT_INTERVAL seconds (default 2). On each tick:
batch = gatekeeper.getQueue("hyperswarm") — grab the outbound queue.queue message ({ type: "queue", data: batch, ... }).
b. gatekeeper.clearQueue("hyperswarm", batch) — remove those
operations from the server queue.
c. Relay the message to every connected peer (relays: []).
d. mergeBatch(batch) locally (re-submits to own Gatekeeper; idempotent).exportInterval.Skipping 3 when the import queue is idle prevents the export loop from running faster than the importer can drain.
mergeBatch(batch)for chunk in chunks(batch, BATCH_SIZE): # BATCH_SIZE = 100
events = wrap_as_gatekeeper_events(chunk, registry="hyperswarm",
ordinal=[producerTime, i])
gatekeeper.importBatch(events) # queued in gatekeeper
gatekeeper.processEvents() # drain the queue
mergeBatch is called after every received batch/queue message and
after every export-loop flushQueue (to re-import the sender’s own ops).
shareDb(conn)Walks every DID in gatekeeper.getDIDs() in 1000-DID chunks, calls
gatekeeper.exportBatch(chunk), extracts event.operation from each
result, and sends via batch messages (split to fit the 8 MB write
limit).
The mediator’s shareDb does no filtering of its own — it simply maps
exports.map(event => event.operation). It relies on Gatekeeper’s
exportBatch to exclude local-registry events. Implementations
SHOULD verify this assumption against the Gatekeeper spec rather than
relying on mediator-side filtering.
Three async queues, all concurrency: 1:
| Queue | Task | Gate |
|---|---|---|
syncQueue |
Send sync to a new connection |
Waits for importQueue to drain (10s polling) before sending |
importQueue |
Call mergeBatch for inbound batch / queue |
Gated on gatekeeper.isReady() |
exportQueue |
Call shareDb in response to sync |
Gated on gatekeeper.isReady() |
Implementations MAY use any concurrency primitive; the contract is that
no two mergeBatch or shareDb calls overlap, and that sync messages
don’t go out while the mediator is still draining its own inbox.
The mediator exposes a tiny HTTP surface for health/metrics only — there are no admin or client routes.
Binds to ${ARCHON_HYPR_METRICS_PORT} (default 4232).
| Method | Path | Body |
|---|---|---|
GET |
/health |
{ "ok": true } (always) |
GET |
/ready |
{ "ready": <all-deps-ready> } — true iff Gatekeeper, Keymaster, and IPFS are healthy and own node info has been initialized |
GET |
/version |
{ "version": string, "commit": string } |
GET |
/metrics |
Prometheus exposition (see §7) |
No CORS, no admin auth, no rate limiting. The port is intended to be scraped by Prometheus on a private network only.
waitUntilReady=true, 5s poll).waitUntilReady=true, 5s poll).ARCHON_NODE_ID MUST be set; resolve it via Keymaster; if it doesn’t
resolve, exit.waitUntilReady=true, 5s poll) and
resetPeeringPeers().exportLoop()).{ node: { name, ipfs: { id, addresses } } } onto the node
DID document via keymaster.mergeData(nodeDID, ...).connectionLoop()). The swarm itself is
created lazily inside connectionLoop → checkConnections →
createSwarm() on the first tick when there are no active
connections — it is not a separate startup step.Interval: 60 seconds.
createSwarm(), which destroys the
existing swarm (if any) and creates a new one. The first
connectionLoop tick is what creates the initial swarm.connectionInfo entries that haven’t produced data in > 3
minutes.ping message listing all known peer DIDs and relay it to
every connection.graceful-goodbye handler calls swarm.destroy(). No explicit
Gatekeeper / Keymaster cleanup — those are HTTP clients.
| Variable | Default | Meaning |
|---|---|---|
ARCHON_NODE_ID |
unset, required | Name of this node’s agent ID in the Keymaster wallet. |
ARCHON_NODE_NAME |
anon |
Free-form label used in message node field. |
ARCHON_PROTOCOL |
/ARCHON/v0.8-beta |
Hyperswarm topic seed. Must match peers you want to talk to. |
ARCHON_GATEKEEPER_URL |
http://localhost:4224 |
|
ARCHON_KEYMASTER_URL |
http://localhost:4226 |
|
ARCHON_IPFS_URL |
http://localhost:5001/api/v0 |
Kubo HTTP API. |
ARCHON_ADMIN_API_KEY |
empty | Used for both Gatekeeper and Keymaster admin calls. |
ARCHON_HYPR_EXPORT_INTERVAL |
2 |
Export loop interval, seconds. |
ARCHON_HYPR_METRICS_PORT |
4232 |
Metrics HTTP port. |
ARCHON_DEBUG |
false |
Reserved; currently ignored. |
GIT_COMMIT |
unknown |
Build commit (truncated to 7 chars). |
| Metric | Type | Labels |
|---|---|---|
mediator_active_connections |
gauge | (none) |
mediator_import_queue_depth |
gauge | (none) |
mediator_export_queue_depth |
gauge | (none) |
mediator_sync_queue_depth |
gauge | (none) |
mediator_known_nodes |
gauge | (none) — DIDs with resolved node.ipfs info |
mediator_known_peers |
gauge | (none) — distinct IPFS peer IDs |
mediator_messages_received_total |
counter | type (batch/queue/sync/ping) |
mediator_messages_relayed_total |
counter | (none) |
mediator_operations_imported_total |
counter | (none) |
mediator_operations_exported_total |
counter | (none) |
mediator_connections_total |
counter | event (established/closed) |
mediator_duplicate_batches_total |
counter | (none) |
mediator_errors_total |
counter | operation (sync/import/export/peer/parse) |
mediator_import_batch_duration_seconds |
histogram | buckets: [0.01, 0.05, 0.1, 0.5, 1, 2, 5, 10, 30, 60] |
mediator_export_db_duration_seconds |
histogram | buckets: [0.1, 0.5, 1, 2, 5, 10, 30, 60, 120] |
service_version_info |
gauge | version, commit |
Gauges are refreshed on every /metrics scrape via updateGauges().
Plus the standard Prometheus process metrics.
This service only has the health/version/metrics routes, all of which
have trivial paths. Implementations MAY but need not add an
http_requests_total{method,route,status} counter; the existing Grafana
dashboards don’t depend on one.
Plain stdout via console.log. Notable recurring lines:
new hyperswarm peer id: <shortId> (<nodeName>) joined topic: <shortTopic> using protocol: <protocol>received connection from: <shortPeer>--- N nodes connected, detected nodes: <names...>received <type> from: <shortPeer> (<nodeName>)* merging batch (N events) from: <shortPeer> (<nodeName>) *importBatch: <shortHash> merging N events...mergeBatch: {added, merged, rejected, pending}export loop waiting Ns for import queue to clear: N...export loop waiting Ns...New implementations SHOULD keep the shapes of these messages stable if dashboards/log aggregators are parsing them.
ghcr.io/archetech/hyperswarm-mediatorA conformant third implementation MUST:
batch / queue / sync / ping messages byte-compatible
with the reference above.ARCHON_PROTOCOL.importBatch / processEvents / getQueue /
clearQueue / exportBatch endpoints (all documented in
the Gatekeeper spec).resolveDID and mergeData (Keymaster spec §7).