
The General Estimates System (GES), redesigned as the Crash Report Sampling System (CRSS) in 2016, is a nationally representative sample of police-reported motor vehicle crashes in the U.S., maintained by the National Highway Traffic Safety Administration (NHTSA). While the raw GES/CRSS data are publicly available, they are distributed in formats that require significant preprocessing before analysis. This dataset provides a processed version of the GES/CRSS database (2014-2023), ready for immediate use in R, Python, Excel, and other modern data tools. It was prepared with the rfars R package so that researchers, policymakers, and educators can access and analyze crash data more efficiently and reproducibly.

The dataset is provided in three file formats to support a wide range of users and analysis environments:

CSV (.csv): a plain text format, with each of the five tables saved separately. These files are the most universally compatible and can be opened directly in Excel, though they are larger than the Parquet files.

Parquet (.parquet): a modern, compressed, cross-platform columnar format. Each of the five tables is provided separately (fars_accident.parquet, fars_vehicle.parquet, etc.). This version is recommended for Python, SQL, and advanced R users, as it is smaller than CSV and loads quickly in most data science environments.

RDS (.rds): a native R format containing the full dataset as a list of five related tables (accident, vehicle, person, drugs, distract). This version is recommended for R users, since it can be loaded in one step with readRDS() and preserves all variable types exactly as processed.
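As a rough illustration of working with the per-table CSV files, the sketch below reads whichever of the five tables are present in a folder, using only the Python standard library. The helper name load_tables, the data_dir argument, and the "fars_" file prefix for the CSV files are assumptions made for this example (the prefix is borrowed from the Parquet file names mentioned above); adjust them to match the files actually downloaded.

```python
import csv
from pathlib import Path

# Table names as listed in the dataset description.
TABLES = ("accident", "vehicle", "person", "drugs", "distract")


def load_tables(data_dir, prefix="fars_", tables=TABLES):
    """Read each <prefix><table>.csv found in data_dir into a list of row dicts.

    The prefix and file layout are assumptions for this sketch, not a
    documented naming scheme. Tables whose file is missing are skipped.
    """
    loaded = {}
    for name in tables:
        path = Path(data_dir) / f"{prefix}{name}.csv"
        if not path.exists():
            continue  # leave out tables that were not downloaded
        with path.open(newline="", encoding="utf-8") as handle:
            loaded[name] = list(csv.DictReader(handle))
    return loaded


if __name__ == "__main__":
    # Report how many rows were read from each table in the current folder.
    for table_name, rows in load_tables(".").items():
        print(table_name, len(rows))
```

Note that the csv module returns every value as a string, so numeric columns need converting before analysis; the Parquet and RDS versions avoid this step because they store variable types with the data.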
Version 2025.2 adds the multi_per tables to the data files.
Motor vehicle, Road safety, Transportation, Safety, Land transportation, Safety analysis
