Views provided by UsageCounts
Abstract

Summary

The processing pipeline in this repository uses Landsat Collection 2 Surface Reflectance data and European Centre for Medium-Range Weather Forecasts (ECMWF) Reanalysis v5 (ERA5) data to develop machine learning models and stepwise linear regression models that estimate Secchi depth at Lake Yojoa, Honduras. The codebase acquires and harmonizes the Landsat and ERA5 data using the Google Earth Engine (GEE) Python API (Gorelick, et al. 2017). Remote sensing data are matched with in situ Secchi depth measurements, and the matchups are partitioned into train-test ('stringent' data handling) or train-test-validate ('very stringent' data handling) data sets. Models were created using both xgboost gbtree methods and stepwise linear regression methods.

Methods

Landsat Stack

Median Landsat Collection 2 Surface Reflectance (Masek, et al., 2006; Vermote, et al., 2016) values (Rrs) were obtained for the 18 sampling locations in Lake Yojoa following the methods described in Topp, et al. (2021). Minor adaptations were made for the transition from Collection 1 to Collection 2 Landsat data to account for differences in scaling factors between collections. Rrs summaries included only 'confident' water pixels as defined by the dynamic surface water extent algorithm (Jones, 2019). Data were filtered for reasonable values for water reflectance (-0.01 < Rrs < 0.2) for all bands (Blue, Red, Green, Near Infrared, Shortwave Infrared 1, Shortwave Infrared 2). Inter-mission handoff coefficients, which standardize Rrs values across missions to account for slight differences in sensors and atmospheric correction (Gardner, et al. 2021), were calculated from Landsat Collection 2 data acquired over all lakes greater than 25 hectares within Guatemala, Honduras, and El Salvador (described in Regional Handoff Coefficients below).
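
The scaling and range filter described above can be sketched in a few lines of Python. This is an illustration only, not the repository's code: the scale factor (0.0000275) and offset (-0.2) are the values USGS publishes for Collection 2 Level-2 surface reflectance, while the band keys and function names are hypothetical, and applying the filter to per-record band values is an assumption about where in the pipeline it runs.

```python
# Landsat Collection 2 Level-2 surface reflectance scaling (USGS-published values).
C2_SCALE = 0.0000275
C2_OFFSET = -0.2

# Plausible range for water reflectance stated in the text: -0.01 < Rrs < 0.2.
RRS_MIN, RRS_MAX = -0.01, 0.2

# Hypothetical band keys, chosen for illustration only.
BANDS = ("blue", "green", "red", "nir", "swir1", "swir2")


def scale_c2(dn):
    """Convert a Collection 2 integer pixel value to unitless reflectance."""
    return dn * C2_SCALE + C2_OFFSET


def is_reasonable(record):
    """Return True when every band lies strictly inside the plausible range."""
    return all(RRS_MIN < record[band] < RRS_MAX for band in BANDS)


def filter_records(raw_records):
    """Scale each record's bands, then keep records with all bands in range."""
    scaled = [{band: scale_c2(rec[band]) for band in BANDS} for rec in raw_records]
    return [rec for rec in scaled if is_reasonable(rec)]
```

A record is dropped if any single band falls outside the range, which matches the "for all bands" wording in the text.
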
All Landsat data were acquired using the Google Earth Engine Python Application Programming Interface (API) in RStudio version 2023.03.0, R version 4.2.3 (R Core Team 2023), and Python version 3.8 (Python Software Foundation, https://www.python.org/).

Secchi-RS matchups

Secchi data were matched with records from the Landsat remote sensing stack using several time windows: same-day and one-, two-, three-, and five-day windows. In addition to these standard windows, we employed a variable window defined by local knowledge. In this method, matchups of up to ±7 days were allowed in months when lake conditions are relatively consistent, whereas in October and November, months that often have sudden clarity changes, only matches within ±1 day were permitted. Windows up to ±7 days have yielded reasonable results when paired with remote sensing data (Kloiber, et al. 2002). We avoided over-matching our data by ensuring that each satellite overpass was paired only with the nearest-in-time Secchi measurement and that each discrete Secchi measurement was paired only with a single, nearest-in-time valid satellite overpass. Only the ±5 day and ±7/1 day matchup datasets produced reasonable models, so only these datasets are summarized in this methods overview.

Climate Data

Climate data were acquired from the ERA5 dataset (Muñoz, 2019) in the Google Earth Engine Code Editor (Gorelick, et al. 2017) at a single point at the approximate geographical center of Lake Yojoa (14.8768°N, 87.9791°W) for all available data. ERA5 data provide daily modeled values for an extensive list of parameters; however, we used only total precipitation, mean air temperature, total solar radiation, and mean wind speed in our analysis. These data were aggregated over the previous 3, 5, and 7 days.

xgboost Model

We used the R package {xgboost} (Chen, et al. 2023) to develop the best-performing gradient tree boosting algorithm for these data.
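
The trailing meteorological summaries that feed the models (the previous 3, 5, and 7 days described under Climate Data) could be computed along the following lines. This is a minimal sketch under stated assumptions, not the repository's implementation: the field names are hypothetical, the window is assumed to exclude the target day itself, and precipitation and solar radiation are summed while air temperature and wind speed are averaged.

```python
from datetime import date, timedelta
from statistics import mean


def trailing_summary(daily, target, n_days):
    """Summarise the n_days before `target`.

    `daily` maps a date to a dict with hypothetical keys
    'precip', 'solar', 'air_temp', and 'wind'.
    Assumption: the target day itself is excluded from the window.
    Returns None when any day in the window is missing.
    """
    window = [target - timedelta(days=k) for k in range(1, n_days + 1)]
    rows = [daily[d] for d in window if d in daily]
    if len(rows) < n_days:
        return None  # incomplete window; do not summarise partial data
    return {
        "precip_total": sum(r["precip"] for r in rows),
        "solar_total": sum(r["solar"] for r in rows),
        "air_temp_mean": mean(r["air_temp"] for r in rows),
        "wind_mean": mean(r["wind"] for r in rows),
    }


def all_windows(daily, target, windows=(3, 5, 7)):
    """Compute the summary for each window length named in the text."""
    return {n: trailing_summary(daily, target, n) for n in windows}
```

Returning None for incomplete windows keeps gaps in the daily record from silently shrinking a total or biasing a mean.
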
Model features were median Rrs values for the blue, green, red, and near infrared bands; the band ratios of red to green, blue to green, red to blue, and green to red; and total solar radiation, maximum air temperature, mean air temperature, minimum air temperature, total precipitation, and mean wind speed. Various combinations of input feature windows were tested, including providing the model with the previous day's meteorology as well as a summary of the previous 5 or 7 days.

Stepwise Regression Model

We used the R packages {caret} (Kuhn, 2008) and {leaps} (Lumley, 2020) to perform backwards stepwise regression. We tested the same combinations of matchup windows and meteorological data as with the {xgboost} package, except that all input data were normalized to values between 0 and 1 using min-max scaling. Because this method uses cross-validation (10 times CV), no test data were provided to the model. Seventy percent of the data were used in model development, and the remaining 30% were held out to examine the results independently. Data partitioning was completed by total number of matchups, not by image date.

Complete methodology and full citations are provided in the file 'Methods.Rmd' and the rendered 'Methods.html' file within this repository.
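
As an illustration of the variable-window matchup rule described under Secchi-RS matchups, the sketch below pairs Secchi dates with overpass dates for a single site. It is one possible reading of the rule, not the authors' code: the function names are hypothetical, the month that sets the window is assumed to be the month of the Secchi measurement, and the one-to-one constraint is enforced with a simple greedy pass over candidate pairs sorted by time gap.

```python
from datetime import date

# October and November are restricted to +/-1 day; all other months allow +/-7 days.
STRICT_MONTHS = {10, 11}


def max_gap_days(secchi_date):
    """Allowed gap in days; assumes the Secchi date's month decides the window."""
    return 1 if secchi_date.month in STRICT_MONTHS else 7


def match_one_to_one(secchi_dates, overpass_dates):
    """Greedily pair each Secchi date with at most one overpass, nearest first.

    Assumes dates are unique within each list (a single sampling site).
    Ties in gap are broken by the earlier Secchi date, then the earlier overpass.
    """
    candidates = []
    for s in secchi_dates:
        for o in overpass_dates:
            gap = abs((o - s).days)
            if gap <= max_gap_days(s):
                candidates.append((gap, s, o))
    candidates.sort()  # smallest time gap first

    used_secchi, used_overpass, pairs = set(), set(), []
    for gap, s, o in candidates:
        if s in used_secchi or o in used_overpass:
            continue  # one side is already paired with a closer partner
        used_secchi.add(s)
        used_overpass.add(o)
        pairs.append((s, o, gap))
    return sorted(pairs)
```

A greedy pass is not guaranteed to maximise the number of pairs, but it does ensure that no measurement or overpass appears in more than one matchup.
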
remote sensing, xgboost, surface reflectance, water clarity, Google Earth Engine, Landsat
