The leakage record L(q)

The previous section sorted what the server can and can't observe. This section is about how we know that list is complete.

A BitcoinPIR client emits a sequence of wire rounds for every query — request bytes out, response bytes back, repeated round by round. An honest-but-curious server records the whole transcript: how many rounds happened, how big each request and response was, what items each request asked for. The verification effort defines exactly which bits of this transcript are allowed to vary across queries — that's the leakage record L(q) — and proves that everything else is fixed by the protocol.
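The observable shape of a transcript described above can be modelled as a list of per-round size profiles. The sketch below is a minimal illustration under stated assumptions: `RoundProfile`, `profile_size_diff`, and `profiles_match` are hypothetical names, not types from the BitcoinPIR codebase, and the sketch compares sizes and counts only, not payload bytes.

```rust
/// Hypothetical per-round view of what an honest-but-curious server records.
/// Field names and types are assumptions made for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RoundProfile {
    /// Size of the request in bytes.
    pub request_len: usize,
    /// Size of the response in bytes.
    pub response_len: usize,
    /// Number of items the request carried.
    pub item_count: usize,
}

/// Sum of absolute request/response size differences over paired rounds.
/// A round present in only one transcript contributes its full size.
/// Item counts are not part of this measure.
pub fn profile_size_diff(a: &[RoundProfile], b: &[RoundProfile]) -> usize {
    let rounds = a.len().max(b.len());
    (0..rounds)
        .map(|i| {
            let (req_a, resp_a) = a.get(i).map_or((0, 0), |r| (r.request_len, r.response_len));
            let (req_b, resp_b) = b.get(i).map_or((0, 0), |r| (r.request_len, r.response_len));
            req_a.abs_diff(req_b) + resp_a.abs_diff(resp_b)
        })
        .sum()
}

/// True when both transcripts have the same number of rounds and every
/// round agrees on request size, response size, and item count.
pub fn profiles_match(a: &[RoundProfile], b: &[RoundProfile]) -> bool {
    a == b
}
```

Two transcripts that the server cannot tell apart by shape would satisfy `profiles_match` and give a `profile_size_diff` of zero; any difference in round count or size shows up as a nonzero diff.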

The four fields of L

Two fields are intentional public metadata that the protocol publishes as part of its API. The other two are protocol-observable axes: one is pinned to a constant by a closure invariant, and the other is an explicit, documented leak introduced to keep the protocol practical.

  • query_db_id — which database the query is against (the mainnet snapshot, a particular delta epoch, etc.). Intentional public metadata: choosing a DB to query is a visible part of the protocol's API.
  • session_query_index — the position of this query inside the HarmonyPIR session, used to schedule hint refresh. Function of session length, already public.
  • index_max_items_per_group_per_level — pinned to 2 by the INDEX Merkle Group-Symmetry invariant. Multi-query batches are routed across all three candidate PBC groups via pbc_plan_rounds, so colliding scripthashes never accumulate items in a single group.
  • chunk_max_items_per_group_per_level — the approximate per-query CHUNK-Merkle item count (equivalently, approximate per-query UTXO count). This was formerly fixed by M-padding; the M=16 mechanism was intentionally removed in Phase 4 / WS-A to reduce chunk-layer overhead, so this field is now an admitted leak axis. Found-vs-not-found behavior is still protected via CHUNK Round-Presence Symmetry, which forces at least one CHUNK-Merkle pass in all cases.
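The four fields above can be sketched as a plain record type. This is an illustrative sketch, not the project's definition: the field names follow the list, but the struct name `LeakageRecord`, the integer widths, and the helper `index_axis_is_pinned` are assumptions introduced here.

```rust
/// Hypothetical shape of the leakage record L(q). Field names follow the
/// list above; the integer types are assumptions made for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct LeakageRecord {
    /// Which database the query targets (intentional public metadata).
    pub query_db_id: u32,
    /// Position of the query inside the session (already public).
    pub session_query_index: u64,
    /// Pinned to 2 by the INDEX Merkle Group-Symmetry invariant.
    pub index_max_items_per_group_per_level: u32,
    /// Admitted leak: approximate per-query CHUNK-Merkle item count.
    pub chunk_max_items_per_group_per_level: u32,
}

/// The constant the INDEX axis is expected to hold for every query.
pub const INDEX_MAX_ITEMS_PINNED: u32 = 2;

impl LeakageRecord {
    /// True when the pinned INDEX axis carries its constant value.
    pub fn index_axis_is_pinned(&self) -> bool {
        self.index_max_items_per_group_per_level == INDEX_MAX_ITEMS_PINNED
    }
}
```

Deriving `PartialEq` and `Eq` gives the field-by-field comparison that L(q1) = L(q2) stands for: two records are equal exactly when all four fields agree.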

The simulator-property claim

For any backend b and any two queries q1, q2 with L(q1) = L(q2), the wire transcripts of Real(b, q1) and Real(b, q2) are equal. The visualization below is what that looks like in practice — two queries with matching L side by side.
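Written symbolically, the claim reads as follows. The symbol T, for the wire transcript of a run, and the function Sim are notation assumed here for illustration; the text states the implication only in prose.

```latex
\forall b,\ \forall q_1, q_2:\quad
L(q_1) = L(q_2)
\;\Longrightarrow\;
T\bigl(\mathrm{Real}(b, q_1)\bigr) = T\bigl(\mathrm{Real}(b, q_2)\bigr)
```

Read as a statement about a transcript that is a function of (b, q), this says the transcript depends on q only through L(q). Equivalently, there is some function Sim with T(Real(b, q)) = Sim(b, L(q)), which is presumably where the name "simulator property" comes from.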

[Figure: L(q1) and L(q2) shown side by side with identical fields (db_id: 0, session_index: 7, index_max_items: 2, chunk_max_items: 1), so L(q1) = L(q2) field by field. Beneath each record, the wire transcript is split into three phases: INDEX (K = 75 PBC groups), CHUNK (variable item count), and MERKLE (per-level siblings). Aggregate byte diff across the full transcript: 0. Asserted by leakage_integration_test.rs, *_found_vs_not_found_have_byte_identical_profiles.]
Two queries with matching L(q) values produce wire transcripts that match byte-for-byte across every round.

The picture answers what the verification claims. The next page covers why we believe it: three layers of evidence, each addressing a different way the claim could be wrong.