
Description

This dataset provides CSV lists containing structured metadata for cryo-electron microscopy (cryo-EM) map processing tasks used in the CryoFM research papers. The data lists are curated from entries in the Electron Microscopy Data Bank (EMDB) and organized into CSV files with detailed metadata for training, validation, and testing of deep learning models.

Dataset Contents

This repository contains CSV data lists for two main research projects:

1. CryoFM1 (ICLR 2025): CSV lists for cryo-EM map processing at two different resolutions
   - cryofm1_1-5apix_dataset: High-resolution dataset (~1.5 Å/pixel)
   - cryofm1_3apix_dataset: Standard-resolution dataset (~3.0 Å/pixel)
2. CryoFM2: CSV lists for foundation model pre-training and fine-tuning
   - cryofm2_pretrain_dataset: Pre-training dataset with half-map pairs
   - cryofm2_emhancer_dataset: Enhancement dataset with half-map pairs and model-based LocScale maps
   - cryofm2_emready_dataset: EMReady dataset with deposited and simulated maps

CSV List Structure

CryoFM1 CSV lists contain: EMDB ID, relative path to map file, voxel dimensions (nz, ny, nx), and pixel size (apix). Maps are rescaled to specified resolutions (1.5 or 3.0 Å/pixel).

CryoFM2 CSV lists contain: map paths, statistical features (mean, std, quantile_max_value), pixel size (apix), and EMDB/PDB IDs. All maps are resized to 1.5 Å/pixel.

Detailed schema descriptions are provided in `schema.md` files within each dataset directory.

Note

This dataset contains CSV metadata lists only; the actual map files are not included. Map files should be downloaded from EMDB using the provided EMDB IDs.
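As a rough illustration of how a CryoFM1-style list might be consumed, the sketch below reads a small in-memory CSV, computes each map's physical box size from its voxel dimensions and pixel size, and builds a candidate EMDB download URL. The column names (`emdb_id`, `path`, `nz`, `ny`, `nx`, `apix`), the sample rows, and the helper functions are hypothetical; the actual column names are defined in each dataset's `schema.md`. The URL layout is an assumption about the EMDB archive and should be checked against the EMDB documentation before use.

```python
import csv
import io
import re

# Placeholder rows with HYPOTHETICAL column names; consult schema.md for the real ones.
SAMPLE_CSV = """emdb_id,path,nz,ny,nx,apix
EMD-0001,maps/emd_0001.mrc,128,128,128,1.5
EMD-0002,maps/emd_0002.mrc,64,96,80,3.0
"""


def box_extent_angstrom(row):
    """Return the physical box size (z, y, x) in Å: voxel count times pixel size."""
    apix = float(row["apix"])
    return tuple(int(row[key]) * apix for key in ("nz", "ny", "nx"))


def emdb_map_url(emdb_id):
    """Build a candidate download URL for an EMDB entry.

    The URL layout is an assumption, not something stated by this dataset.
    """
    match = re.search(r"\d+", emdb_id)
    if match is None:
        raise ValueError(f"no numeric EMDB accession found in {emdb_id!r}")
    number = match.group()
    return (
        "https://ftp.ebi.ac.uk/pub/databases/emdb/structures/"
        f"EMD-{number}/map/emd_{number}.map.gz"
    )


# Parse the list and derive the per-map quantities.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
extents = [box_extent_angstrom(row) for row in rows]

for row, extent in zip(rows, extents):
    print(row["emdb_id"], extent, emdb_map_url(row["emdb_id"]))
```

To run this against a real list, replace `io.StringIO(SAMPLE_CSV)` with an opened CSV file from one of the dataset directories and adjust the column names to match its `schema.md`.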
Machine Learning, Cryoelectron Microscopy, Cryoelectron Microscopy/methods
