Annual maps of cropland abandonment, land cover, and other derived data for time-series analysis of cropland abandonment

Crawford, Christopher L.; Yin, He; Radeloff, Volker C.; Wilcove, David S.

ZENODO

Dataset . 2022

License: CC BY

Data sources: Datacite

Research data, Dataset. 26 Mar 2022. English. Publisher: Zenodo

doi: 10.5281/zenodo.5348287, 10.5281/zenodo.5348286

Abstract

This archive contains raw annual land cover maps, cropland abandonment maps, and accompanying derived data products to support:

Crawford, C.L., Yin, H., Radeloff, V.C., and Wilcove, D.S. 2022. Rural land abandonment is too ephemeral to provide major benefits for biodiversity and climate. Science Advances. https://doi.org/10.1126/sciadv.abm8999

An archive of the analysis scripts developed for this project can be found at https://github.com/chriscra/abandonment_trajectories (https://doi.org/10.5281/zenodo.6383127).

Note that the label "_2022_02_07" in many file names refers to the date of the primary analysis. "dts" or "dt" refers to "data.tables", large .csv files that were manipulated using the data.table package in R (Dowle and Srinivasan 2021; http://r-datatable.com/). "Rasters" refers to ".tif" files that were processed using the raster and terra packages in R (Hijmans 2022; https://rspatial.org/terra/; https://rspatial.org/raster).

Data files fall into one of four categories of data derived during our analysis of abandonment: observed, potential, maximum, or recultivation. Derived datasets follow the same naming convention, though they are aggregated across sites. The four categories are as follows (using "age_dts" for our site in Shaanxi Province, China, as an example):

- observed abandonment, identified through our primary analysis with a threshold of five years. These files do not have a specific label beyond the description of the file and the date of analysis (e.g., shaanxi_age_2022_02_07.csv);
- potential abandonment for a scenario without any recultivation, in which abandoned croplands are left abandoned from the year of initial abandonment through the end of the time series, with the label "_potential" (e.g., shaanxi_potential_age_2022_02_07.csv);
- maximum age of abandonment over the course of the time series, with the label "_max" (e.g., shaanxi_max_age_2022_02_07.csv);
- recultivation periods, corresponding to the lengths of recultivation periods following abandonment, with the label "_recult" (e.g., shaanxi_recult_age_2022_02_07.csv).

This archive includes multiple .zip files, the contents of which are described below:

age_dts.zip - Maps of abandonment age (i.e., how long each pixel has been abandoned as of that year; also referred to as length, duration, etc.) for each year between 1987 and 2017 for all 11 sites. These maps are stored as .csv files in which each row is a pixel, the first two columns hold the x and y coordinates (longitude and latitude), and each subsequent column contains the abandonment age values for an individual year (columns are labeled with "y" followed by the year, e.g., "y1987"). Maps are given in a latitude and longitude coordinate reference system. The folder contains observed age, potential age ("_potential"), maximum age ("_max"), and recultivation lengths ("_recult") for all sites. Maximum age .csv files include only three columns: x, y, and the maximum length (i.e., "max age", in years) for each pixel throughout the entire time series (1987-2017). Files were produced using the custom functions "cc_filter_abn_dt()", "cc_calc_max_age()", "cc_calc_potential_age()", and "cc_calc_recult_age()"; see "_util/_util_functions.R".

age_rasters.zip - Maps of abandonment age (i.e., how long each pixel has been abandoned) for each year between 1987 and 2017 for all 11 sites.
Maps are stored as .tif files, where each band corresponds to one of the 31 years in our analysis (1987-2017), in ascending order (i.e., the first layer is 1987 and the 31st layer is 2017). The folder contains observed age, potential age ("_potential"), and maximum age ("_max") rasters for all sites. Maximum age rasters include just one band ("layer"). These rasters match the corresponding .csv files contained in "age_dts.zip".

derived_data.zip - Summary datasets created throughout this analysis, listed below.

diff.zip - .csv files for each of our eleven sites containing the year-to-year lagged differences in abandonment age (i.e., length of time abandoned) for each pixel. Each row corresponds to a single pixel of land, and each column refers to the year the difference is in reference to. These rows do not have longitude or latitude values associated with them; however, rows correspond to the same rows in the .csv files in "input_data.tables.zip" and "age_dts.zip". These files were produced using the custom function "cc_diff_dt()" (much like the base R function "diff()"), contained within the custom function "cc_filter_abn_dt()" (see "_util/_util_functions.R"). The folder contains diff files for observed abandonment, potential abandonment ("_potential"), and recultivation lengths ("_recult") for all sites.

input_dts.zip - Annual land cover maps for eleven sites with four land cover classes (see below), adapted from Yin et al. 2020, Remote Sensing of Environment (https://doi.org/10.1016/j.rse.2020.111873). Like "age_dts", these maps are stored as .csv files, where each row is a pixel and the first two columns hold the x and y coordinates (longitude and latitude). Subsequent columns contain the land cover class for an individual year (e.g., "y1987"). Note that these maps were recoded from Yin et al. 2020 so that land cover classification was consistent across sites (see below). The folder contains two files for each site: the raw land cover maps from Yin et al. 2020 (after recoding), and a "clean" version produced by applying 5- and 8-year temporal filters to the raw input (see the custom function "cc_temporal_filter_lc()" in "_util/_util_functions.R" and "1_prep_r_to_dt.R"). These files correspond to those in "input_rasters.zip" and serve as the primary inputs for the analysis.

input_rasters.zip - Annual land cover maps for eleven sites with four land cover classes (see below), adapted from Yin et al. 2020, Remote Sensing of Environment. Maps are stored as ".tif" files, where each band corresponds to one of the 31 years in our analysis (1987-2017), in ascending order (i.e., the first layer is 1987 and the 31st layer is 2017). Maps are given in a latitude and longitude coordinate reference system. Note that these maps were recoded so that land cover classes matched across sites (see below). The folder contains two files for each site: the raw land cover maps (after recoding), and a "clean" version that has been processed with 5- and 8-year temporal filters (see above). These files match those in "input_dts.zip".

length.zip - .csv files containing the length (i.e., age or duration, in years) of each distinct individual period of abandonment at each site. This folder contains length files for observed and potential abandonment, as well as recultivation lengths. Produced using the custom functions "cc_filter_abn_dt()" and "cc_extract_length()"; see "_util/_util_functions.R".

derived_data.zip contains the following files:

"site_df.csv" - A simple .csv containing descriptive information for each of our eleven sites, along with the original land cover codes used by Yin et al. 2020 (updated so that all eleven sites are consistent in how land cover classes were coded; see below).

Primary derived datasets for both observed abandonment ("area_dat") and potential abandonment ("potential_area_dat"):
area_dat - Shows the area (in ha) in each land cover class at each site in each year (1987-2017), along with the area of cropland abandoned in each year under a five-year abandonment threshold (abandoned for >=5 years) or no threshold (abandoned for >=1 year). Produced using the custom function "cc_calc_area_per_lc_abn()" via "cc_summarize_abn_dts()". See scripts "cluster/2_analyze_abn.R" and "_util/_util_functions.R".

persistence_dat - A .csv containing the area of cropland abandoned (ha) for a given "cohort" of abandoned cropland (i.e., a group of cropland abandoned in the same year, also called "year_abn") in a specific year. This area is also given as a proportion of the initial area abandoned in each cohort, i.e., the area of each cohort when it was first classified as abandoned at year 5 ("initial_area_abn"). The "age" is given as the number of years since a given cohort of abandoned cropland was last actively cultivated, and "time" is marked relative to the 5th year, when our five-year definition first classifies that land as abandoned (and where the proportion of abandoned land remaining abandoned is 1). Produced using the custom function "cc_calc_persistence()" via "cc_summarize_abn_dts()". See scripts "cluster/2_analyze_abn.R" and "_util/_util_functions.R". This serves as the main input for our linear models of recultivation ("decay") trajectories.

turnover_dat - A .csv showing the annual gross gain, annual gross loss, and annual net change in the area (in ha) of abandoned cropland at each site in each year of the time series. Produced using the custom function "cc_calc_abn_diff()" via "cc_summarize_abn_dts()" (see "_util/_util_functions.R"), implemented in "cluster/2_analyze_abn.R". This file is only produced for observed abandonment.

Area summary files (for observed abandonment only):

area_summary_df - Contains a range of summary values relating to the area of cropland abandonment for each of our eleven sites.
All area values are given in hectares (ha) unless stated otherwise. It contains 16 variables as columns:

1) "site"
2) "total_site_area_ha_2017" - the total site area (ha) in 2017
3) "cropland_area_1987" - the area in cropland in 1987 (ha)
4) "area_abn_ha_2017" - the area of cropland abandoned as of 2017 (ha)
5) "area_ever_abn_ha" - the total area of those pixels that were abandoned at least once during the time series (corresponding to the area of potential abandonment, as of 2017)
6) "total_crop_extent_ha" - the total area of those pixels that were classified as cropland at least once during the time series
7) "total_area_abn_remaining_2017" - duplicate of "area_abn_ha_2017", the area abandoned as of 2017 (ha), taken from "area_recult_threshold"
8) "total_initial_area_abn" - the sum of the initial area of each cohort of abandonment when it is first classified as "abandoned", i.e., at the 5-year mark (note that this is cumulative, and because it counts those pixels that were abandoned more than once, it is larger than "area_ever_abn_ha"), taken from "area_recult_threshold"
9) "total_area_abn_recultivated_2017" - the area of abandoned land that was recultivated as of 2017 (cumulatively, i.e., "total_initial_area_abn" - "area_abn_ha_2017"), taken from "area_recult_threshold"
10) "proportion_recultivated" - the proportion of all abandoned cropland (including multiple periods per pixel) that was recultivated by 2017, taken from "area_recult_threshold"
11) "area_2017_as_prop_site" - area abandoned as of 2017 as a proportion of the total site area
12) "area_2017_as_prop_total_crop" - area abandoned as of 2017 as a proportion of the total crop extent
13) "area_2017_as_prop_crop87" - area abandoned as of 2017 as a proportion of cropland area in 1987
14) "area_ever_abn_as_prop_site" - area ever abandoned as a proportion of the total site area
15) "area_ever_abn_as_prop_total_crop" - area ever abandoned as a proportion of the total crop extent
16) "area_ever_abn_as_prop_crop87" - area ever abandoned as a proportion of cropland area in 1987

See script "1_summary_stats.Rmd".

area_recult_threshold - Contains data on the proportion of observed abandoned cropland area that is recultivated by the end of our time series. This includes the area of abandoned cropland as of 2017 ("total_area_abn_remaining_2017") and the sum of the initial area of each cohort of abandonment when it is first classified as abandoned (at year 5; "total_initial_area_abn"). This "total_initial_area_abn" is cumulative, and allows pixels that were abandoned multiple times during the time series to be counted multiple times. The difference between these two columns yields "total_area_abn_recultivated_2017", which in turn is used to calculate "proportion_recultivated" and the (ascending) "order" of sites based on this proportion. This file includes recultivation statistics for each site for three abandonment definitions: 5, 7, and 10 years. See script "1_summary_stats.Rmd".

abn_lc_area_2017 - Contains the number of pixels and corresponding area (in ha) of abandoned cropland in the year 2017 at each site, according to the land cover class (either woody vegetation [2] or herbaceous vegetation [4]) and the age in 2017 (5 to 30 years). See script "cluster/6_lc_of_abn.R".

abn_prop_lc_2017 - Contains the number of pixels and corresponding area (ha) of cropland abandoned in the year 2017 in each land cover type (woody vegetation [2] or herbaceous vegetation [4]). It also shows this area as a proportion of the total area abandoned at each site (i.e., in either land cover class: 2 or 4). See script "cluster/6_lc_of_abn.R".

Carbon:

carbon_df - Contains the observed and potential carbon accumulation in abandoned croplands in each site in each year (in Mg C), for two abandonment thresholds: 5 years (our default abandonment definition) and 1 year (i.e., no threshold).
Each data point corresponds to one of two scenarios ("type" column), either "observed" or "potential". Carbon accumulation figures are for the sum of forest and soil carbon at each site in a given year. Carbon accumulation is listed in three columns: 1) "C_up_to_20" contains the total carbon accumulated in those abandoned croplands with abandonment durations between 5 and 20 years; 2) "C_21_30" contains the total carbon accumulated in croplands with durations between 21 and 30 years, which are differentiated in order to account for non-linear carbon accumulation rates in soils over time; and 3) "total_C_Mg" contains the sum of the previous two columns, representing the total carbon accumulated across all abandoned croplands in each year.

soc_mean - Contains mean soil organic carbon accumulation rates for years 1-20 and years 21-80, derived from Sanderman et al. 2020 (in Mg C; https://doi.org/10.7910/DVN/HA17D3). These values correspond to accumulation rates in croplands upon abandonment and regeneration to natural vegetation (the "rewilding" scenario of Sanderman et al. 2020). The mean values are calculated across those pixels identified as cropland by Sanderman et al. 2020 at each site. Mean values in years 20 and 80 are contained in columns "mean_soc_20" and "mean_soc_80" respectively, and the annualized rates over the first 20 years and over the subsequent years 21 through 80 are contained in columns "mean_annual_soc_1_20" and "mean_annual_soc_21_80" respectively.

Decay model data - Two R data files containing data products for our linear models of abandonment recultivation trajectories.

decay_endpoints_files - An R data file (.rds) containing seven data products produced as part of our common endpoint analysis, which calculated mean trajectories for each site across a range of common endpoints, ensuring that means were based on coefficient estimates derived from a consistent number of observations for each cohort.
These files are:

- common_endpoint_dat - a .csv containing subsets of "persistence_dat" for each "endpoint" (7 through 29).
- endpoint_n - a .csv describing, for each endpoint, the corresponding number of observations per cohort ("n_obs"), the number of cohorts ("n_cohorts"), the total number of observations across cohorts included ("total_obs"), and the cohorts that meet the endpoint threshold ("cohorts").
- coef_l3_endpoints - corresponding model coefficients for our primary model ("l3") parameterized by the range of subsets across endpoints.
- augment_endpoints - fitted values (i.e., model predictions) for linear models produced across the full range of endpoint subsets.
- fitted_endpoints - a simplified .csv containing the mean linear and log coefficients for each site at each endpoint, and the corresponding predicted proportion remaining abandoned through time (based on the "age", or duration, of abandonment).
- time_to_endpoints - a .csv containing, for mean trajectories for each endpoint at each site, the estimated time required for a given amount of abandoned cropland in a cohort to be recultivated (deciles, 10% through 100%).
- endpoint_half_lives - a .csv containing the half-lives calculated for the mean trajectories for each endpoint at each site.

decay_mod_archive - An R data file (.rds) containing eleven data products derived from linear models of abandonment recultivation ("decay"):

lm_mega_lin_log_lin_l - the primary linear model produced in our analysis. This model is referred to as "lin_log_lin" (or "l3") because the model predicts linear persistence ("lin") as a function of a log term of time ("log") and a linear term of time ("lin"). "mega" refers to the fact that this model is run for the full dataset, pooled acro
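The naming and layout conventions described in the abstract can be illustrated with a short sketch. The project's own scripts are written in R; the Python below is only an illustration, and the helper names (age_file_name, column_to_year, band_to_year, lagged_diff) are hypothetical and not part of the archive. The "<site><label>_age_<date>.csv" pattern is inferred solely from the Shaanxi examples, and lagged_diff mimics base R "diff()" rather than the archive's "cc_diff_dt()", whose exact behavior is not specified here.

```python
# Hypothetical helpers illustrating the archive's conventions; not the authors' code.
FIRST_YEAR, LAST_YEAR = 1987, 2017
ANALYSIS_DATE = "2022_02_07"

# Label inserted into file names for each of the four data categories.
CATEGORY_LABELS = {
    "observed": "",             # no specific label
    "potential": "_potential",
    "maximum": "_max",
    "recultivation": "_recult",
}


def age_file_name(site, category, date=ANALYSIS_DATE):
    """Build an age .csv file name, following the pattern of the Shaanxi examples."""
    return f"{site}{CATEGORY_LABELS[category]}_age_{date}.csv"


def column_to_year(column):
    """Convert a year column label such as 'y1987' to the integer 1987."""
    if not (column.startswith("y") and column[1:].isdigit()):
        raise ValueError(f"not a year column: {column!r}")
    return int(column[1:])


def band_to_year(band):
    """Map a 1-based raster band index (1..31) to its year (1987..2017)."""
    n_years = LAST_YEAR - FIRST_YEAR + 1
    if not 1 <= band <= n_years:
        raise ValueError(f"band must be between 1 and {n_years}")
    return FIRST_YEAR + band - 1


def lagged_diff(values):
    """Year-to-year differences, analogous to base R diff() with lag 1."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


if __name__ == "__main__":
    for category in CATEGORY_LABELS:
        print(age_file_name("shaanxi", category))
    print(band_to_year(1), band_to_year(31))
    # Made-up abandonment ages for one pixel over four consecutive years.
    print(lagged_diff([0, 1, 2, 0]))
```

In this made-up pixel series, a negative difference marks the year in which the abandonment age resets to zero.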

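The arithmetic relating several of the derived columns described in the abstract can be sketched as follows. This is a minimal Python illustration rather than the authors' R code: the function names are hypothetical, the input values are made up, and the formula for "proportion_recultivated" (recultivated area divided by "total_initial_area_abn") is an assumption based on the column descriptions.

```python
# Hypothetical illustration of column relationships; values below are invented.

def recultivation_summary(total_initial_area_abn, total_area_abn_remaining_2017):
    """Derive the recultivated area and its proportion from the two area columns (ha)."""
    if total_initial_area_abn <= 0:
        raise ValueError("total_initial_area_abn must be positive")
    recultivated = total_initial_area_abn - total_area_abn_remaining_2017
    return {
        "total_area_abn_recultivated_2017": recultivated,
        # Assumed definition: share of the cumulative initial area that was recultivated.
        "proportion_recultivated": recultivated / total_initial_area_abn,
    }


def total_carbon(c_up_to_20, c_21_30):
    """total_C_Mg is described as the sum of the two duration-class columns (Mg C)."""
    return c_up_to_20 + c_21_30


def proportion_remaining(area_abn, initial_area_abn):
    """Area of a cohort still abandoned, as a proportion of its initial area at year 5."""
    if initial_area_abn <= 0:
        raise ValueError("initial_area_abn must be positive")
    return area_abn / initial_area_abn


if __name__ == "__main__":
    print(recultivation_summary(1000.0, 400.0))
    print(total_carbon(250.0, 50.0))
    print(proportion_remaining(300.0, 400.0))
```

Because "total_initial_area_abn" counts pixels once per abandonment period, the proportion computed this way refers to abandonment periods rather than to unique pixels.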
References

Yin, H., A. Brandão, J. Buchner, D. Helmers, B. G. Iuliano, N. E. Kimambo, K. E. Lewińska, E. Razenkova, A. Rizayeva, N. Rogova, S. A. Spawn, Y. Xie, and V. C. Radeloff. 2020. Monitoring cropland abandonment with Landsat time series. Remote Sensing of Environment 246:111873. https://doi.org/10.1016/j.rse.2020.111873

Sanderman, J., Woolf, D., Lehmann, J., Rivard, C., Poggio, L., Heuvelink, G., Bossio, D. 2020. Soils Revealed soil carbon futures. https://doi.org/10.7910/DVN/HA17D3

Hijmans, R. J. 2022. Terra: Spatial data analysis. https://rspatial.org/terra/

Dowle, M., Srinivasan, A. 2021. Data.table: Extension of 'data.frame'. http://r-datatable.com/

This work was supported by the High Meadows Foundation and the NASA Land Cover and Land Use Change Program (Grant no. 80NSSC18K0343) and analyses were performed using Princeton Research Computing resources at Princeton University.

Related Organizations

University System of Ohio
United States
College of New Jersey
United States
Kent State University
United States
University of Wisconsin–Oshkosh
United States

Keywords

Carbon sequestration, Cropland abandonment, Agricultural abandonment, Agriculture, Land-cover mapping, Farmland abandonment, Biodiversity conservation, Secondary succession

Impact by BIP!

- selected citations: 1. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.