OneDZ: A Global Detrital Zircon Database and Implications for Constructing Giant Geoscience Database

Li, Keran; Hu, Xiumian; Chai, Rong; Yang, Jianghai; Xue, Weiwei; Pan, Yingdi; Li, Taiyang; Fang, Can; Ma, Anlin; Huang, Hu; Guo, Qianqian; Yang, Wentao; Hu, Lisha; Qi, Liang; Chen, Guohui; Sun, Gaoyuan; Zhang, Shijie; Deng, Tao; Li, Kuizhou; Sun, Jiaopeng; Gao, Biao

Research data: Dataset (under curation). Publisher: Zenodo

doi: 10.5281/zenodo.19690702

Abstract

# onedz_datasets_csv

This directory contains the **split CSV datasets** of the ZirconRegular_LLM project. All files are partitioned into manageable parts (~100,000–130,000 rows each) for batch processing, LLM ingestion, or memory-constrained workflows.

## Directory Structure

```
onedz_datasets_csv/
│
├── Total_UPb_split_parts/              # Main U-Pb geochronology database
│   ├── zircon_upb_part_01.csv
│   ├── zircon_upb_part_02.csv
│   └── ... (22 parts total)
│
├── Total_LuHf_split_parts/             # Lu-Hf isotope database, note that all files have been checked by experts
│   ├── zircon_luhf_part_01.csv
│   ├── zircon_luhf_part_02.csv
│   └── zircon_luhf_part_03.csv
│
└── Experts_checked_UPb_split_parts/    # Expert-reviewed U-Pb subsets
    ├── expert_upb_part_01.csv
    ├── expert_upb_part_02.csv
    └── ... (14 parts total)
```

## Dataset Summary

| Dataset | Parts | Est. Total Rows | Columns | Content |
|---------|-------|-----------------|---------|---------|
| `Total_UPb_split_parts` | 22 | ~2,550,000 | 64 | Full detrital zircon U-Pb age database |
| `Total_LuHf_split_parts` | 3 | ~297,000 | 33 | Lu-Hf isotope data linked to U-Pb records (expert-checked) |
| `Experts_checked_UPb_split_parts` | 14 | ~1,497,000 | 64 | Peer-reviewed regional compilations (quality-controlled) |

---

## File Format

All CSV files follow the project standard:

| Property | Specification |
|----------|---------------|
| **Encoding** | UTF-8 with BOM (`utf-8-sig`) |
| **Delimiter** | Comma (`,`) |
| **Line endings** | LF (`\n`) |
| **Header** | Single header row with standardized column names |
| **Quoting** | Double-quoted fields when containing commas or newlines |

### U-Pb Standard Columns (64 total)

- **Bibliographic**: `Lead_Author`, `Year`, `Journal`, `Vol`, `Pages`, `Title`, `Web_Link`
- **Sample**: `Published_Sample_ID`, `Country_State`, `Region`, `Continent`, `Major_Geographic_Geologic_Unit`, `Minor_Geologic_Geographic_Unit`, `Group`, `Formation`, `Member`, `Locality`, `Profile`, `Latitude`, `Longitude`
- **Depositional Age**: `Depos_Age_Period`, `Depos_Age_Epoch`, `Depos_Age_Stage`, `Max_Depos_Age_Ma`, `Est_Depos_Age_Ma`, `Min_Depos_Age_Ma`
- **Analytical**: `Spectrometer`, `Spectrometer_Location`, `Institution`, `Spectrometer_Mode`, `Rock_Type_one`, `Rock_Type_two`, `Rock_Type_three`, `Grain`, `Spot_Location`, `Spot_diam`
- **Isotope Ratios**: `Pb206U238_iso`, `Pb207U235_iso`, `Pb207Pb206_iso`, `Pb208Th232_iso` (with one-sigma uncertainties)
- **Calculated Ages**: `Pb206U238_age`, `Pb207U235_age`, `Pb207Pb206_age`, `Best_age` (with one- and two-sigma uncertainties), `Discord`
- **Elemental**: `U_ppm`, `Th_ppm`, `Pb_ppm`, `Pb206Pb204`, `Pb204Pb206`, `UTh_ratio`, `ThU_ratio`

### Lu-Hf Columns (33 total)

Includes all bibliographic and sample metadata columns above, plus:

- `Upb_Age`, `Upb_Age_two_sigma`
- `176Hf177Hf_iso`, `176Lu177Hf_iso`, `176Yb177Hf_iso` (with 2-sigma uncertainties)
- `epsilon_Hf_0`, `epsilon_Hf_t` (with 1-sigma and 2-sigma uncertainties)
- `TDM1_Ma`, `TDM2_Ma` (with 2-sigma uncertainties)

---

## Usage Notes

1. **Load order**: When reassembling the full dataset, load parts in numerical order (`01` → `22`).
2. **Row overlap**: Parts are split sequentially; no duplicate rows exist across parts of the same dataset.
3. **Cross-dataset linkage**: Use `Lead_Author` + `Year` + `Published_Sample_ID` + `Grain` to link U-Pb records with Lu-Hf records.
4. **Expert vs. Total**: `Experts_checked_UPb_split_parts` is a **subset** of the total database, curated from peer-reviewed regional compilations. It does not contain all rows from `Total_UPb_split_parts`.
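
### Example: Loading and Linking Parts (illustrative sketch)

The load-order and cross-dataset linkage notes can be sketched with the Python standard library alone. This is a minimal example, not part of the project's own tooling: the function names (`part_number`, `iter_rows`, `link_key`, `link_luhf_to_upb`) are hypothetical, and it assumes that the four key columns hold identically formatted values in both datasets and that trimming surrounding whitespace is harmless. It indexes the smaller Lu-Hf dataset in memory and streams the U-Pb parts, which keeps memory use modest.

```python
import csv
import re
from pathlib import Path

# Key columns named in the usage notes for linking U-Pb and Lu-Hf records.
LINK_KEYS = ("Lead_Author", "Year", "Published_Sample_ID", "Grain")


def part_number(path):
    """Return the integer part number from a name like 'zircon_upb_part_07.csv'."""
    match = re.search(r"_part_(\d+)\.csv$", path.name)
    if match is None:
        raise ValueError(f"unexpected file name: {path.name}")
    return int(match.group(1))


def iter_rows(parts_dir):
    """Yield rows (as dicts) from every part in a directory, in numerical order."""
    parts = sorted(Path(parts_dir).glob("*_part_*.csv"), key=part_number)
    for part in parts:
        # 'utf-8-sig' strips the BOM; newline="" lets csv handle quoted newlines.
        with open(part, encoding="utf-8-sig", newline="") as handle:
            yield from csv.DictReader(handle)


def link_key(row):
    """Build the composite key; whitespace trimming is an assumption of this sketch."""
    return tuple((row.get(col) or "").strip() for col in LINK_KEYS)


def link_luhf_to_upb(upb_dir, luhf_dir):
    """Yield (upb_row, luhf_row) pairs that share the same composite key."""
    # Index the smaller Lu-Hf dataset in memory, then stream the U-Pb parts.
    luhf_by_key = {}
    for row in iter_rows(luhf_dir):
        luhf_by_key.setdefault(link_key(row), []).append(row)
    for upb_row in iter_rows(upb_dir):
        for luhf_row in luhf_by_key.get(link_key(upb_row), []):
            yield upb_row, luhf_row
```

A call such as `link_luhf_to_upb("onedz_datasets_csv/Total_UPb_split_parts", "onedz_datasets_csv/Total_LuHf_split_parts")` would then iterate over matched record pairs. All values are read as strings, so numeric columns such as `Best_age` need explicit conversion before analysis.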
