Query padding — the privacy linchpin

Suppose Alice has 3 real scripthashes to look up in this round. Each one lands in a different PBC group out of K = 75. A naive client would send 3 queries: one per real item and none for the other 72 groups. A server watching the wire would learn that Alice's 3 items live in those 3 specific groups. Across enough rounds, this leaks which groups, and therefore which address patterns, she cares about.

BitcoinPIR forbids that. In every round, the client sends exactly K = 75 INDEX queries, exactly K_CHUNK = 80 CHUNK queries, and exactly one Merkle-sibling query per tree level being verified. Real queries go into their computed PBC groups; every empty group gets a dummy query with a random DPF key. From the server's point of view, the traffic is identical for any number of real queries from 0 up to the batch limit.

// pir-sdk-client/src/dpf.rs:199 — INDEX padding loop
for b in 0..k {
    for h in 0..INDEX_CUCKOO_NUM_HASHES {
        let alpha = if b == assigned_group {
            my_locs[h]                          // real query
        } else {
            rng.next_u64() % bins as u64        // dummy random alpha
        };
        let (k0, k1) = dpf.gen(alpha, dpf_n);
        s0_group.push(k0.to_bytes());
        s1_group.push(k1.to_bytes());
    }
}
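The excerpt above compares against a single assigned_group. As a rough sketch of how the same loop might look when several groups hold real queries (as in the 3-scripthash example), the block below fills every (group, hash) slot with either a real alpha or a random dummy. Everything in it is an assumption made for illustration: padded_alphas and SplitMix64 are hypothetical names, the PRNG is a dependency-free stand-in that is not cryptographically secure, and DPF key generation is left out, so the function returns only the alphas that would be fed to it.

```rust
use std::collections::HashMap;

/// Stand-in PRNG (SplitMix64) so the sketch needs no external crates.
/// NOT cryptographically secure: a real client would need a CSPRNG here,
/// otherwise dummy alphas could be told apart from real ones.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Hypothetical helper: returns one alpha per (group, hash) slot.
/// `real` maps a group index to that group's real bin locations and is
/// assumed to hold exactly `num_hashes` entries per group.
/// The output always has `k * num_hashes` entries, regardless of how
/// many groups carry a real query.
fn padded_alphas(
    k: usize,
    num_hashes: usize,
    bins: u64,
    real: &HashMap<usize, Vec<u64>>,
    rng: &mut SplitMix64,
) -> Vec<u64> {
    let mut alphas = Vec::with_capacity(k * num_hashes);
    for group in 0..k {
        for h in 0..num_hashes {
            let alpha = match real.get(&group) {
                Some(locs) => locs[h],         // real query for this group
                None => rng.next_u64() % bins, // dummy: random bin index
            };
            alphas.push(alpha);
        }
    }
    alphas
}
```

Calling padded_alphas with an empty map and with a map holding groups 12, 34 and 57 yields vectors of the same length, which is the property the padding rule relies on: the number of queries on the wire does not depend on how many of them are real.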
[Figure, two panels. UNPADDED (3 real queries): the server concludes "groups 12, 34, 57 are active". PADDED (75 queries, all look alike): the server concludes "something happened in every group, somewhere".]
Left: the server sees 3 real queries and can pinpoint the active groups. Right: the server sees 75 indistinguishable queries, because real and dummy queries have the same form on the wire.