SAGEBench: Simulated Bruker timsTOF .d fixtures and ground-truth FDR/TPR benchmark for the Sage DDA search engine

Simulated Bruker .d fixtures and a benchmark harness for the Sage DDA database-search engine.

Two purposes:

- CI-grade Bruker test data for Sage so regressions in the timsTOF code path get caught before release (motivated by lazear/sage#228).
- A larger ground-truth-backed evaluation set so anyone can compute true FDR / TPR for Sage on simulated DDA data, in the spirit of timsim-bench for DIA.

All datasets generated with TimSim; ground truth is exact (every injected peptide is recorded in synthetic_data.db alongside each .d).

Files in this record:

- sagebench-ci-smoke.tar.gz (~457 MB): two 5-min HeLa .d files, seed CSV, configs, regen script. Drop-in CI fixture.
- sagebench-hela-150k-g30m.tar.gz (~3.6 GB): HeLa, 150 000 peptides, 30-min gradient (rep 001).
- sagebench-hla-10k-g40.tar.gz (~2.6 GB): HLA Thunder, 10 000 peptides, 40-min gradient, 3 replicates.
- sagebench-hla-100k-g3600.tar.gz (~6.6 GB): HLA Thunder, 100 000 peptides, 60-min gradient, 3 replicates.
- sagebench-results.tar.gz (~288 KB): first-run report (REPORT.html, RESULTS.md, eval CSVs) against Sage 0.15.0-beta.2.

Each archive contains its own README.md with usage instructions. The SAGEBench repository (github.com/theGreatHerrLebert/SAGEBench) hosts the harness used to score search-engine output against the recorded ground truth.
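As a rough illustration of what "true FDR / TPR" means when the ground truth is exact, the following Python sketch scores a set of reported peptide sequences against the set of injected ones. It is not the SAGEBench harness: the function name `true_fdr_tpr` and the example sequences are hypothetical, and it assumes the injected peptides have already been read out of synthetic_data.db and the accepted identifications out of the search-engine output (neither file layout is described here, so no loading code is shown).

```python
def true_fdr_tpr(reported, injected):
    """Score reported peptides against an exact ground truth.

    reported: peptide sequences the search engine accepted (hypothetical input)
    injected: peptide sequences that were actually simulated (hypothetical input)
    Returns (fdr, tpr) as fractions in [0, 1].
    """
    reported = set(reported)
    injected = set(injected)

    true_pos = reported & injected   # reported and really present
    false_pos = reported - injected  # reported but never injected

    # True FDR: share of reported identifications that are wrong.
    fdr = len(false_pos) / len(reported) if reported else 0.0
    # TPR (sensitivity): share of injected peptides that were recovered.
    tpr = len(true_pos) / len(injected) if injected else 0.0
    return fdr, tpr


# Toy example with made-up sequences (not taken from the datasets).
injected = {"PEPTIDEK", "SAMPLER", "ELVISK", "LIVESK"}
reported = ["PEPTIDEK", "SAMPLER", "DECOYK"]
fdr, tpr = true_fdr_tpr(reported, injected)
print(f"true FDR = {fdr:.3f}, TPR = {tpr:.3f}")
```

In the toy example one of three reported peptides was never injected and two of four injected peptides were recovered. Matching on bare sequence is a simplification; a real evaluation may also need to account for charge state, modifications, or per-run matching.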
