Powered by OpenAIRE graph
Software
Data sources: ZENODO

geonlp-pipeline-paper-2026: A Reproducible Pipeline for Geoscientific Text Mining

Authors: Heasman, Drew; Eglington, Bruce


Abstract

Production pipeline source code, database schema, migrations, and Kubernetes deployment manifests accompanying Heasman and Eglington (2026), a methodology paper describing a reproducible Python and PostgreSQL pipeline for assembling domain-specific text corpora from the xDD Snippet API. The pipeline includes pre-flight hit checking, in-stream Counter pruning for memory-bounded streaming, pool-segregated parallel workers, and in-database information-theoretic statistics.
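
The abstract does not spell out how the in-stream Counter pruning works, so the following is only a minimal sketch of one way to keep a `collections.Counter` memory-bounded while consuming a token stream. The function names `prune_counter` and `count_stream`, the `max_entries` parameter, and the rising-threshold pruning rule are all assumptions made for illustration; they are not taken from the repository.

```python
from collections import Counter
from typing import Iterable


def prune_counter(counts: Counter, max_entries: int) -> None:
    """Drop the rarest keys in place until at most max_entries remain.

    Hypothetical rule: raise a minimum-count threshold step by step and
    delete every key whose count falls below it.
    """
    threshold = 1
    while len(counts) > max_entries:
        threshold += 1
        rare = [key for key, value in counts.items() if value < threshold]
        for key in rare:
            del counts[key]


def count_stream(tokens: Iterable[str], max_entries: int = 100_000) -> Counter:
    """Count tokens from a stream, pruning whenever the Counter grows too large."""
    counts: Counter = Counter()
    for token in tokens:
        counts[token] += 1
        if len(counts) > max_entries:
            prune_counter(counts, max_entries)
    return counts


if __name__ == "__main__":
    # Tiny illustrative stream; a real run would iterate over snippet text.
    stream = ["basalt"] * 5 + ["zircon", "gneiss", "schist"] + ["basalt"]
    print(count_stream(stream, max_entries=3))
```

Under this rule the Counter never holds more than `max_entries + 1` keys at any moment, at the cost of losing counts for terms that were rare when a prune ran. Counts that survive are therefore lower bounds for any term that was pruned earlier and later reappeared.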
