Hiding the empty count
The hint-row page showed a request of the form
Q = [3, 0, ⊥, ⊥] — four slots, with two
real database indices and two ⊥ markers for empty
cells in segment s. That picture is correct as the
algorithm, but it is not what actually goes on the wire.
The implementation replaces every ⊥ with a freshly
sampled random index, sends a fixed number of distinct indices
per request, and XOR-cancels the dummies client-side. This page
is about why.
The leak
Without padding, the request would carry only the non-empty
cells of segment s. The number of indices the
server receives would equal the count of non-⊥
cells in that segment. That count varies — and it varies in a
way the server can model.
Initially each segment is roughly half-full: the count of
real cells is Binomial(T, ½) over the
offline phase's random permutation. But every time the client
runs a query, relocation moves real values into
previously-empty cells in other segments (covered on the
permutation page). The non-empty count of every still-usable
segment drifts upward with hint usage, approaching
T − 1 by the time the client is near the end of
its query budget. Two things follow for the server.
- Hint ageing. Fitting the per-group count trajectory tells the server roughly how many queries have been run against each group since the last offline rehint, and when a fresh offline pass is imminent.
- Real-query identification. In a single round
where the client touches
KPBC groups (one real query plusK − 1padding queries — section 08 covers that), the slot whose count is drifting fastest is the real-query slot. The others stay closer to the fresh distribution.
The fix
Every HarmonyPIR per-group request emits exactly
T − 1 sorted distinct u32 indices
drawn from [0, N), regardless of segment state,
query count, or round.
-
Read the
T − 1cells of segmentsthat exclude positionr. Collect their values; call the non-⊥ones real. -
Sample
(T − 1) − |real|dummy indices uniformly from[0, N) ∖ real, using the client's secure RNG. Track which sorted-merged slots are dummies in a per-callis_dummybit-vector. -
Merge real + dummy indices, sort ascending, serialize as
(T − 1) × 4bytes little-endian. That is the wire.
The server cannot tell reals from dummies — both look like
uniform integers in [0, N), and the per-request
count is constant. On response, the server returns
T − 1 entries in sorted-request order. The client
XOR-cancels the dummies:
answer = H[s] ⊕ XOR(entries[i] for i where !is_dummy[i])
Dummy entries contribute nothing to answer; only
the real-segment values do. The XOR with H[s]
isolates the target's value DB[q] exactly as in
the hint-row walkthrough.
A worked example
Reuse the toy from the hint-row page: N = 8,
T = 4, segment 0 has cells
[3, 6, ⊥, ⊥], query q = 6 at position
r = 1. The paper-algorithm view says the request
is Q = [3, 0, ⊥, ⊥]. The wire-level view says:
-
The
T − 1 = 3cells of segment 0 excluding position 1 are cells 0, 2, 3, with values3, ⊥, ⊥. The non-⊥set is3. -
Sample 2 fresh dummies from
[0, 8) ∖ 3. Say the RNG gives7. The merged set is7. -
Sort ascending and serialize. Wire bytes are
[3, 5, 7]as three little-endianu32s;is_dummyis[false, true, true]. -
Server returns three entries in the same sorted order:
R = [DB[3], DB[5], DB[7]]. -
Client XOR-cancels the dummies:
answer = H[0] ⊕ DB[3]. SinceH[0] = DB[3] ⊕ DB[6],answer = DB[6]. ✓
The "replacement at position r" idea from the
paper-algorithm view collapses into "skip position
r entirely" here — the target is recovered through
the segment parity, not by reading its slot back. Either way the
server learns nothing about which cell held the target.
The padding helper lives in
harmonypir-wasm/src/lib.rs::HarmonyGroup::build_request_inner
(the build_request wrapper preserves per-call state
for the XOR cancellation); response handling is in the same
crate's process_response. Cover queries against
groups the client doesn't need produce wire-shape-equivalent
requests via build_synthetic_dummy, which samples
T − 1 random indices directly without touching
any segment. The design rationale is in
PLAN_HARMONY_COUNT_LEAK_FIX.md upstream and the
HarmonyPIR Per-Group Request-Count Symmetry section
of CLAUDE.md.