
Abstract: Travel delays and overcrowded buses are among the daily frustrations of public transportation users. These problems may be caused by bus bunching, which occurs when two or more buses serving the same route run together, i.e., off schedule. Because traffic is stochastic, a static schedule cannot reliably prevent these events; preventive actions are therefore needed to improve the reliability of the public transportation system. In this context, we propose a decision tree ensemble model to predict bus bunching. The ensemble combines Random Forest, eXtreme Gradient Boosting and Categorical Boosting models and is applied to Global Positioning System (GPS), General Transit Feed Specification (GTFS), weather and traffic situation data. We demonstrate the efficacy of the proposed model on real data sets and compare it with four baselines: Linear Regression, Logistic Regression, Support Vector Machine and Relevance Vector Machine. According to the results, the proposed model achieves an efficacy between 74% and 80% and can be used to predict bus bunching in real time up to 10 stops before it occurs.
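To make the two central ideas concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that bunching is flagged when the headway between consecutive buses at a stop falls below a fraction of the scheduled headway, and that the three models' predicted probabilities are combined by simple averaging (soft voting). The function names, the `ratio` threshold and the `cutoff` value are all hypothetical; the abstract does not specify the bunching criterion or how the ensemble members are combined.

```python
from typing import List, Sequence


def label_bunching(
    arrivals_s: Sequence[float],
    scheduled_headway_s: float,
    ratio: float = 0.25,  # assumed threshold, not taken from the paper
) -> List[bool]:
    """Flag each bus after the first whose headway to the preceding bus
    is shorter than ratio * scheduled_headway_s (all times in seconds)."""
    ordered = sorted(arrivals_s)
    threshold = ratio * scheduled_headway_s
    return [(cur - prev) < threshold for prev, cur in zip(ordered, ordered[1:])]


def soft_vote(probabilities: Sequence[float], cutoff: float = 0.5) -> bool:
    """Average the bunching probabilities produced by the ensemble members
    (assumed combination rule) and compare the mean with a cutoff."""
    if not probabilities:
        raise ValueError("at least one model probability is required")
    return sum(probabilities) / len(probabilities) >= cutoff


# Four arrivals at one stop, with a scheduled headway of 600 s.
labels = label_bunching([0, 600, 650, 1200], scheduled_headway_s=600)

# Hypothetical probabilities from the three tree-based models.
predicted = soft_vote([0.9, 0.6, 0.3])
```

In this example the second and third buses arrive 50 s apart, well under the assumed 150 s threshold, so only that pair is flagged as bunched.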
