Powered by OpenAIRE graph
ZENODO
Dataset
Data sources: ZENODO

ANTAQ Ship-Turnaround Dataset and Reproduction Artifacts

Authors: Caruso Barbosa Pacheco, Eduardo

Abstract

# ANTAQ Ship-Turnaround Dataset & Reproduction Artifacts

**Data and intermediate artifacts for the paper:**

> E. C. B. Pacheco, R. P. Martins, A. dos Santos Gualberto, J. V. Souza Germano,
> M. Miranda Neto, and F. Louzada, *"Methodological Pitfalls in Predicting Ship
> Turnaround Time at Brazilian Ports: An Empirical Audit and a Reproducible
> Pipeline,"* IEEE Transactions on Intelligent Transportation Systems, 2026.

This Zenodo record hosts the **data and result artifacts** that are too large
for GitHub. The **code** lives in the companion repository:

- Code: <https://github.com/eduardocbpacheco/ANTAQ>
- Data DOI: **10.5281/zenodo.20549161**

---

## What is in this archive

`antaq-turnaround-data-v1.2.zip` unzips to a single `data/` folder designed to
drop directly into the root of the code repository (`Codigo Reproducao/`):

```
data/
├── raw/               (~6.0 GB)  raw ANTAQ tables, one folder per year
│   ├── 2018/ … 2024/             {ano}Atracacao.txt, {ano}Carga.txt, …
│   └── categories/               Mercadoria.txt, Instalacao_*.txt, …
├── critic_datasets/   (~90 MB)   ORIGINAL authors' published datasets (Abreu/Rao)
│   ├── CargasBR2018EDA-LOGII.csv        153,331 × 35: their "EDA" dataset
│   └── EN-CargasBR2018Modelo-LOGII.csv  150,669 × 21: their "cleaned model" dataset
├── processed/         (~167 MB)  df_{ano}.parquet: cargo-level merge (step 1)
├── embeddings/        (~1.7 GB)  384-d sentence embeddings per year (step 2)
├── processed_agg/     (~1.6 GB)  df_{ano}_agg.parquet: berthing level (step 3)
└── output/            (~1.1 GB)  final matrices + hyperparameters + results
    ├── train.parquet, test.parquet   ~391k / ~98k berthings × 1613 cols
    ├── encoder.pkl, imputer.pkl, feature_sets.json   (M1=838 … M4=1606 features)
    ├── best_params_target_*.json
    ├── piso/   (Floor / ceteris paribus)    hyperparams/ + results_piso.csv + stats_piso.json
    └── teto/   (Ceiling / per-cell tuned)   hyperparams/ + results_teto.csv + stats_teto.json
```

Total uncompressed: ~11 GB.

## How to use it

1. Clone the code repository:

   ```bash
   git clone https://github.com/eduardocbpacheco/ANTAQ.git
   cd ANTAQ   # the "Codigo Reproducao" package
   ```

2. Download `antaq-turnaround-data-v1.2.zip` from this Zenodo record and unzip
   it inside that folder, so that `data/` sits next to `config.py`:

   ```bash
   unzip antaq-turnaround-data-v1.2.zip   # creates ./data/
   ```

3. Reproduce the paper's tables in seconds, or rebuild everything from raw. See
   `REPRODUCTION_GUIDE.md` (EN) / `GUIA_REPRODUCAO.md` (PT) / `reproduce.ipynb`
   in the repository.

`config.py` reads from `./data` by default; point it elsewhere with the
`ANTAQ_DATA_BASE` environment variable.

## Provenance & licensing

- The raw ANTAQ tables (`data/raw/`) are public open data from the Brazilian
  *Agência Nacional de Transportes Aquaviários* (ANTAQ),
  <https://web3.antaq.gov.br/ea/sense/download.html>. They are redistributed
  here unmodified, for archival and reproducibility, under ANTAQ's open-data
  terms. All processed/derived artifacts are released under the same license as
  the code repository.
- `data/critic_datasets/` contains the two datasets **published by the original
  authors** (Abreu et al. 2023 / Rao et al. 2025) as supplementary material to
  their papers. They are included so reviewers can independently reproduce the
  empirical critique in `EDA_1_critique_of_original_papers.ipynb`. Credit and
  rights for these two files belong to their original authors.

## Reference values (sanity check, Floor experiment, M4 vs M1)

| Target | M1 RMSE (h) | M4 vs M1 *d* | M4 vs M1 *p* |
|---|---|---|---|
| TEstadia | 76.21 ± 0.10 | +10.17 | < 10⁻⁷ |
| TEsperaAtracacao | 67.10 ± 0.07 | +17.27 | < 10⁻⁷ |
| TAtracado | 28.97 ± 0.06 | +4.69 | < 10⁻⁷ |
| TEsperaInicioOp | 18.41 ± 0.04 | −0.98 | 1.000 (n.s.) |
| TOperacao | 19.34 ± 0.02 | +5.62 | < 10⁻⁷ |
| TEsperaDesatracacao | 6.43 ± 0.02 | +0.07 | 0.159 (n.s.) |

## How to cite

Please cite **both** the paper and this dataset.

```bibtex
@article{pacheco2026turnaround,
  author  = {Pacheco, Eduardo Caruso Barbosa and Martins, Reynaldo Pereira and
             Gualberto, Alexandre dos Santos and Germano, Jo\~{a}o Vitor Souza and
             Miranda Neto, Milton and Louzada, Francisco},
  title   = {Methodological Pitfalls in Predicting Ship Turnaround Time at
             Brazilian Ports: An Empirical Audit and a Reproducible Pipeline},
  journal = {IEEE Transactions on Intelligent Transportation Systems},
  year    = {2026},
  note    = {Code: https://github.com/eduardocbpacheco/ANTAQ}
}

@dataset{pacheco2026turnaround_data,
  author    = {Pacheco, Eduardo Caruso Barbosa and Martins, Reynaldo Pereira and
               Gualberto, Alexandre dos Santos and Germano, Jo\~{a}o Vitor Souza and
               Miranda Neto, Milton and Louzada, Francisco},
  title     = {{ANTAQ Ship-Turnaround Dataset and Reproduction Artifacts (v1.2)}},
  year      = {2026},
  publisher = {Zenodo},
  version   = {1.2},
  doi       = {10.5281/zenodo.20549161}
}
```
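
## Checking the unzipped layout (illustrative sketch)

Before running the pipeline, it can help to confirm that the archive was unzipped in the right place. The sketch below is not part of the repository and does not reproduce what `config.py` actually does. It only assumes the two facts stated above: the data folder defaults to `./data` unless `ANTAQ_DATA_BASE` is set, and the archive contains the six top-level folders shown in the tree. The names `EXPECTED_SUBDIRS`, `resolve_data_base`, and `missing_subdirs` are hypothetical.

```python
import os
from pathlib import Path

# Top-level folders listed in the archive layout of this README.
EXPECTED_SUBDIRS = (
    "raw",
    "critic_datasets",
    "processed",
    "embeddings",
    "processed_agg",
    "output",
)


def resolve_data_base(env=None) -> Path:
    """Return ANTAQ_DATA_BASE if it is set and non-empty, else ./data.

    This mirrors the behaviour described in the README; the real
    config.py may resolve the path differently (assumption).
    """
    env = os.environ if env is None else env
    return Path(env.get("ANTAQ_DATA_BASE") or "data")


def missing_subdirs(base: Path) -> list:
    """List the expected top-level folders that do not exist under base."""
    return [name for name in EXPECTED_SUBDIRS if not (base / name).is_dir()]


if __name__ == "__main__":
    base = resolve_data_base()
    missing = missing_subdirs(base)
    if missing:
        print(f"{base}: missing folders: {', '.join(missing)}")
    else:
        print(f"{base}: all expected folders present")
```

Run it from the repository root, next to `config.py`. If any folder is reported missing, the zip was most likely extracted one level too high or too low, or `ANTAQ_DATA_BASE` points somewhere else.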
