
This preprint, which has not been peer reviewed, studies finite-sample reliability issues in membership inference audits at low false-positive rates. It evaluates simple score-based audit summaries across tabular benchmark settings, with emphasis on tie handling, bootstrap uncertainty, reference-centered score checks, split and seed sensitivity, and reproducibility reporting. The associated reproducibility package is available at https://doi.org/10.5281/zenodo.20552369.
