
This dataset comprises 4,542 inorganic crystalline materials, provided as a unified repository of Crystallographic Information Files (CIF) paired with high-accuracy electronic band gap values. The structural data were curated from the Crystallography Open Database (COD), a comprehensive open-access collection of crystal structures [1, 2]. To ensure predictive reliability, the corresponding electronic band gaps were sourced from the validated HSE database developed by Kim et al. (2020) [3]. In that work, electronic structures were computed using hybrid density functional theory (DFT) with the Heyd–Scuseria–Ernzerhof (HSE06) screened hybrid functional. This approach significantly mitigates the well-known "band-gap problem" of standard semilocal exchange-correlation approximations, such as the Local Density Approximation (LDA) or the Generalized Gradient Approximation (GGA), which systematically underestimate band gaps. The dataset therefore gives a more physically accurate description of semiconducting properties.

Methods

Data Acquisition and Workflow

The dataset was constructed by integrating the HSE band-gap database with the COD in several stages. Chemical formulas were extracted from the HSE repository and used as primary keys for programmatic queries against the COD through the aiida-cod database importer. To ensure high data fidelity, a strict string-matching protocol was applied: a CIF entry was retrieved only when the query formula matched the _chemical_formula_sum field in the COD metadata exactly. After verification, the corresponding crystallographic files were archived locally under a standardized naming convention based on stoichiometric identifiers. The final curation stage filtered for completeness, retaining only entries for which both structural coordinates and hybrid-DFT electronic data were present.
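As an illustration of the strict string-matching step, the following minimal Python sketch keeps only those candidate records whose formula string is identical to the query. It does not reproduce the aiida-cod importer call. The function name, the record layout (dictionaries with the hypothetical keys cod_id and formula_sum), and the sample records are assumptions made for this example only.

```python
def select_exact_matches(query_formula, cod_entries):
    """Return the records whose formula string equals the query exactly.

    No normalization is applied (assumption: "exact correspondence" means
    plain string equality), so "GaAs" does not match "Ga As".
    """
    return [
        entry for entry in cod_entries
        if entry.get("formula_sum") == query_formula
    ]


# Hypothetical metadata records; the identifiers are placeholders.
entries = [
    {"cod_id": "0000001", "formula_sum": "Ga As"},
    {"cod_id": "0000002", "formula_sum": "Ga As2"},
    {"cod_id": "0000003", "formula_sum": "Ga As"},
]

matches = select_exact_matches("Ga As", entries)
matched_ids = [entry["cod_id"] for entry in matches]
```

Because the comparison is a plain equality test, any difference in spacing or element order between the two databases causes an entry to be skipped, which favors precision over recall.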
This pipeline yielded a validated ensemble of 4,542 inorganic compounds, providing a robust basis for structure-property relationship analysis.

Data Sources

- Crystal Structures: Crystallography Open Database (COD) [1, 2]
- Band Gap Values: Hybrid DFT calculations using the Heyd–Scuseria–Ernzerhof (HSE06) functional [3]
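The completeness filter described in the workflow can be sketched as follows. This is a minimal example under stated assumptions, not the pipeline actually used: it assumes each archived file is named after its formula (for example GaAs.cif), that band gaps are held in a dictionary keyed by the same formula string, and that a missing value is represented by None. The function name and the numeric values are placeholders, not data from the dataset.

```python
import tempfile
from pathlib import Path


def pair_complete_entries(band_gaps, cif_dir):
    """Keep only formulas that have both a CIF file and a band-gap value."""
    # Assumed naming convention: "<formula>.cif", so the file stem is the key.
    cif_by_formula = {path.stem: path for path in Path(cif_dir).glob("*.cif")}
    return {
        formula: (cif_by_formula[formula], gap)
        for formula, gap in band_gaps.items()
        if formula in cif_by_formula and gap is not None
    }


# Placeholder band-gap values in eV, for illustration only.
band_gaps = {"GaAs": 1.0, "TiO2": 3.0, "ZnO": None}

with tempfile.TemporaryDirectory() as tmp:
    # Create empty stand-in CIF files so the example is self-contained.
    for name in ("GaAs.cif", "Si.cif", "ZnO.cif"):
        (Path(tmp) / name).write_text("data_placeholder\n")
    complete = pair_complete_entries(band_gaps, tmp)
    kept = sorted(complete)
```

In this toy input only GaAs survives: TiO2 has no structure file, Si has no band-gap entry, and ZnO has a missing band-gap value.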
Crystallography, CIF, Hybrid DFT, Crystal Structures, Band Gap, Open Database, HSE06, Materials Properties, Inorganic Materials
