Hiding the empty count

The hint-row page showed a request of the form Q = [3, 0, ⊥, ⊥] — four slots, with two real database indices and two markers for empty cells in segment s. That picture is correct as the algorithm, but it is not what actually goes on the wire. The implementation replaces every with a freshly sampled random index, sends a fixed number of distinct indices per request, and XOR-cancels the dummies client-side. This page is about why.

The leak

Without padding, the request would carry only the non-empty cells of segment s. The number of indices the server receives would equal the count of non- cells in that segment. That count varies — and it varies in a way the server can model.

Initially each segment is roughly half-full: the count of real cells is Binomial(T, ½) over the offline phase's random permutation. But every time the client runs a query, relocation moves real values into previously-empty cells in other segments (covered on the permutation page). The non-empty count of every still-usable segment drifts upward with hint usage, approaching T − 1 by the time the client is near the end of its query budget. Two things follow for the server.

  • Hint ageing. Fitting the per-group count trajectory tells the server roughly how many queries have been run against each group since the last offline rehint, and when a fresh offline pass is imminent.
  • Real-query identification. In a single round where the client touches K PBC groups (one real query plus K − 1 padding queries — section 08 covers that), the slot whose count is drifting fastest is the real-query slot. The others stay closer to the fresh distribution.

The fix

Every HarmonyPIR per-group request emits exactly T − 1 sorted distinct u32 indices drawn from [0, N), regardless of segment state, query count, or round.

  1. Read the T − 1 cells of segment s that exclude position r. Collect their values; call the non- ones real.
  2. Sample (T − 1) − |real| dummy indices uniformly from [0, N) ∖ real, using the client's secure RNG. Track which sorted-merged slots are dummies in a per-call is_dummy bit-vector.
  3. Merge real + dummy indices, sort ascending, serialize as (T − 1) × 4 bytes little-endian. That is the wire.

The server cannot tell reals from dummies — both look like uniform integers in [0, N), and the per-request count is constant. On response, the server returns T − 1 entries in sorted-request order. The client XOR-cancels the dummies:

answer = H[s] ⊕ XOR(entries[i] for i where !is_dummy[i])

Dummy entries contribute nothing to answer; only the real-segment values do. The XOR with H[s] isolates the target's value DB[q] exactly as in the hint-row walkthrough.

A worked example

Reuse the toy from the hint-row page: N = 8, T = 4, segment 0 has cells [3, 6, ⊥, ⊥], query q = 6 at position r = 1. The paper-algorithm view says the request is Q = [3, 0, ⊥, ⊥]. The wire-level view says:

  • The T − 1 = 3 cells of segment 0 excluding position 1 are cells 0, 2, 3, with values 3, ⊥, ⊥. The non- set is 3.
  • Sample 2 fresh dummies from [0, 8) ∖ 3. Say the RNG gives 7. The merged set is 7.
  • Sort ascending and serialize. Wire bytes are [3, 5, 7] as three little-endian u32s; is_dummy is [false, true, true].
  • Server returns three entries in the same sorted order: R = [DB[3], DB[5], DB[7]].
  • Client XOR-cancels the dummies: answer = H[0] ⊕ DB[3]. Since H[0] = DB[3] ⊕ DB[6], answer = DB[6]. ✓

The "replacement at position r" idea from the paper-algorithm view collapses into "skip position r entirely" here — the target is recovered through the segment parity, not by reading its slot back. Either way the server learns nothing about which cell held the target.

The padding helper lives in harmonypir-wasm/src/lib.rs::HarmonyGroup::build_request_inner (the build_request wrapper preserves per-call state for the XOR cancellation); response handling is in the same crate's process_response. Cover queries against groups the client doesn't need produce wire-shape-equivalent requests via build_synthetic_dummy, which samples T − 1 random indices directly without touching any segment. The design rationale is in PLAN_HARMONY_COUNT_LEAK_FIX.md upstream and the HarmonyPIR Per-Group Request-Count Symmetry section of CLAUDE.md.