This code retrieves discrete surface water data from the Water Quality Portal (www.waterqualitydata.us/) and performs a series of data harmonization and cleaning steps using R version 4.1.0. The R code has five steps (each described below), organized into two code repositories: metals-data-download and metals-data-cleanup. To run the code, refer to the detailed instructions in the associated README.md files, starting with metals-data-download. Note that there is a circular dependency between the two repositories, so first set up both locally and then follow the README instructions carefully.

Detailed step descriptions:

Step 1 (contained in metals-data-download > 1a_fetch_metals.R) downloads physical/chemical metadata for 12 metals (Al, As, Cd, Cr, Cu, Fe, Hg, Mn, Pb, Se, U, Zn) from five hydrologic units associated with three river basins (Delaware R., Illinois R. and Upper Colorado R.), retrieves additional site information for all sampling locations returned by the metals data retrieval, and merges both retrievals into a single data frame.

Step 2 (contained in metals-data-cleanup > 2a_clean_harmonize.R) harmonizes the compiled data for multiple columns in the data frame. Columns newly created by this harmonization step have the word “ADDED” as a prefix to the column name.

Step 3 (contained in metals-data-cleanup > 2b_clean_filter.R) filters and removes some of the rows/columns based on defined criteria and outputs the data into three separate files, organized by river basin.

Step 4 (contained in metals-data-cleanup > 3_log.R) creates a log that identifies any values in the download that were not in the expected list and outputs a separate file identifying values that were not expected in the current code, for potential review.
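The harmonization in Step 2 and the logging in Step 4 can be illustrated with a minimal base R sketch. It is not the code from the repositories: the toy data, the column names (`unit`, `value`), the unit lookup table, and the `ADDED_` underscore convention are all assumptions made for illustration.

```r
# Hypothetical toy data; column names and unit spellings are illustrative only.
metals <- data.frame(
  site  = c("A", "B", "C", "D"),
  value = c(1.2, 0.004, 3.5, 7.0),
  unit  = c("ug/l", "mg/l", "ug/L", "ppb?"),
  stringsAsFactors = FALSE
)

# Assumed lookup from raw unit spellings to one harmonized unit and a multiplier.
unit_lookup <- data.frame(
  unit       = c("ug/l", "ug/L", "mg/l"),
  ADDED_unit = c("ug/L", "ug/L", "ug/L"),
  multiplier = c(1, 1, 1000),
  stringsAsFactors = FALSE
)

# Step 2 analogue: add harmonized columns, marked with an "ADDED" prefix.
# Raw values with no lookup entry get NA in the harmonized columns.
idx <- match(metals$unit, unit_lookup$unit)
metals$ADDED_unit  <- unit_lookup$ADDED_unit[idx]
metals$ADDED_value <- metals$value * unit_lookup$multiplier[idx]

# Step 4 analogue: collect raw values that were not in the expected list,
# so they can be written to a separate file for review.
unexpected_log <- unique(metals[is.na(idx), "unit", drop = FALSE])
print(unexpected_log)
```

Keeping the original columns alongside the prefixed ones leaves the raw download intact, so any unexpected value that surfaces in the log can be traced back to its source row.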
Step 5 (contained in metals-data-download > 1a_fetch_ancillary.R & metals-data-cleanup > 2c_clean_match_ancillary.R) retrieves ancillary discrete surface water data for 18 different physical/chemical metadata parameters that were co-collected with the primary metals data. This step also performs several data cleaning functions on the ancillary data, including: removal of duplicate rows, deletion of multiple columns, removal of certain rows based on defined criteria, creation of new harmonized columns, and elimination of any data outside a ±1 hour window relative to the time the metals data were collected on the same date. It then outputs the ancillary data into three separate files, organized by river basin.

This provisional code release was used to create the metals and ancillary datasets published in the following U.S. Geological Survey (USGS) product: Marvin-DiPasquale, M.C., Sullivan, S.L., Platt, L.R.C., Gorsky, A., Agee, J.L., McCleskey, B.R., Kakouros, E., Walton-Day, K., Runkel, R.L., Morriss, M.C., Wakefield, B.F., and Bergamaschi, B. 2022. Discrete Metals and Ancillary Data Used in the Development of Surrogate Models for Estimating Metals Concentration in Surface Water of Three Hydrologic Basins (Delaware River, Illinois River and Upper Colorado River): U.S. Geological Survey, data release, https://doi.org/10.5066/P9L06M3G.

This work was completed as part of the USGS Proxies Project, an effort supported by the Water Mission Area (WMA) Water Quality Processes (WQP) program to develop estimation methods for PFAS, harmful algal blooms, and metals, at multiple spatial and temporal scales.
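The ±1 hour matching rule in Step 5 can be sketched in base R as follows. This is an illustration under stated assumptions, not the repository code: the data frames, the column names (`site`, `metals_time`, `anc_time`, `param`), and the choice to treat the window boundary as inclusive are all hypothetical.

```r
# Hypothetical toy data: metals sample times and ancillary sample times per site.
metals <- data.frame(
  site        = c("A", "A", "B"),
  metals_time = as.POSIXct(c("2020-06-01 10:00:00",
                             "2020-06-02 14:30:00",
                             "2020-06-01 09:00:00"), tz = "UTC"),
  stringsAsFactors = FALSE
)
ancillary <- data.frame(
  site     = c("A", "A", "A", "B"),
  anc_time = as.POSIXct(c("2020-06-01 10:45:00",
                          "2020-06-01 12:00:00",
                          "2020-06-02 13:20:00",
                          "2020-06-01 09:59:00"), tz = "UTC"),
  param    = c("pH", "pH", "temp", "pH"),
  stringsAsFactors = FALSE
)

# Pair every ancillary record with every metals record at the same site.
pairs <- merge(metals, ancillary, by = "site")

# Keep pairs collected on the same date and no more than 60 minutes apart
# (inclusive boundary is an assumption).
same_date <- as.Date(pairs$metals_time, tz = "UTC") ==
  as.Date(pairs$anc_time, tz = "UTC")
gap_mins <- abs(as.numeric(difftime(pairs$anc_time, pairs$metals_time,
                                    units = "mins")))
matched <- pairs[same_date & gap_mins <= 60, ]
print(matched)
```

In this toy case only the ancillary records taken 45 and 59 minutes from a metals sample survive; the 70-minute and 120-minute gaps and the different-date pairs are dropped.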
lead, mercury, cadmium, zinc, arsenic, metals, surface water, streams, Water Quality Portal, uranium, iron, aluminum, copper, manganese, chromium, selenium