
The promised benefits of the General Transit Feed Specification (GTFS) Schedule and Realtime standards depend on the underlying quality of the data. Despite this fundamental reliance, there has been relatively little research on techniques and strategies to assess GTFS accuracy. The need for such assessment is growing as federal and state governments increasingly require transit agencies to make these data available to the public. This research addresses that gap by presenting a suite of methods and metrics to assess the temporal accuracy of GTFS Realtime feeds and the spatial accuracy of GTFS Schedule feeds. The temporal assessment demonstrates an approach to collecting and cleaning TripUpdate messages to identify (and derive) a set of values for measuring the accuracy of vehicle arrival predictions. These metrics are designed to give transit agencies insight into the quality of the data they provide to customers, expressed in terms of how inaccuracies affect the customer experience. The spatial assessment demonstrates an approach to matching scheduled information on the location of transit routes and stops with the actual travel patterns observed in realtime VehiclePosition messages. The measured divergence between the planned and provided transit service yields a series of location accuracy metrics. All of the proposed metrics can be scaled to examine GTFS accuracy from the stop level to the systemwide level, can be generated from publicly available GTFS feeds without any additional data sources, and can help transit agencies continuously assess, and therefore improve, the quality of the GTFS data they share with the public.
