
The Superfund NPL Site Scraper automates the collection and standardization of data from U.S. EPA Superfund resources. Built in Python, the tool retrieves site-level information from EPA online tables, Microsoft Excel files, and individual site profile pages. It uses requests, BeautifulSoup, and pandas to parse structured and semi-structured content, extract cleanup milestones, and normalize outputs into consistent CSV schemas (e.g., site ID, site name, location, operational status, milestone history). The scraper is fully configurable, enabling users to add or modify target data fields without restructuring the codebase. Designed for repeated use, it supports research tracking, program reporting, and integration with Google Sheets and other database systems.
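The parse-and-normalize step described above can be sketched as follows. This is a minimal illustration, not the tool's actual code: the HTML sample, column names, and target schema are all hypothetical stand-ins (in the real scraper the HTML would come from a requests call to an EPA page, and the field mapping is user-configurable).

```python
import pandas as pd
from bs4 import BeautifulSoup

# Inline sample standing in for requests.get(<EPA table URL>).text,
# so the example is self-contained. Column names are hypothetical.
SAMPLE_HTML = """
<table>
  <tr><th>Site EPA ID</th><th>Site Name</th><th>State</th><th>Status</th></tr>
  <tr><td>NJD980529879</td><td>Example Site</td><td>NJ</td><td>Final NPL</td></tr>
</table>
"""

# Assumed mapping from source headers to a normalized CSV schema,
# mirroring the fields named in the description.
SCHEMA = {
    "Site EPA ID": "site_id",
    "Site Name": "site_name",
    "State": "location",
    "Status": "operational_status",
}

def parse_npl_table(html: str) -> pd.DataFrame:
    """Extract rows from the first <table> and normalize column names."""
    soup = BeautifulSoup(html, "html.parser")
    rows = soup.find("table").find_all("tr")
    headers = [th.get_text(strip=True) for th in rows[0].find_all("th")]
    records = [
        dict(zip(headers, (td.get_text(strip=True) for td in tr.find_all("td"))))
        for tr in rows[1:]
    ]
    df = pd.DataFrame.from_records(records)
    # Rename and reorder into the consistent output schema.
    return df.rename(columns=SCHEMA)[list(SCHEMA.values())]

df = parse_npl_table(SAMPLE_HTML)
csv_text = df.to_csv(index=False)  # ready to write to disk or push downstream
```

Because the field mapping lives in a dictionary rather than in the parsing logic, adding or renaming a target field is a configuration change, which matches the configurability the description emphasizes.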
NPL (National Priorities List), Data parsing, Research data management, Web scraping
| Indicator | Description | Value |
| --- | --- | --- |
| Selected citations | Citations derived from selected sources; an alternative to the "Influence" indicator. | 0 |
| Popularity | The "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average |
| Influence | The overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| Impulse | The initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
