RUMBoost: Gradient Boosted Random Utility Models

Keywords: FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)

Nicolas Salvadé; Tim Hillel

Transportation Research Part C Emerging Technologies

Article . 2025 . Peer-reviewed

License: CC BY

Data sources: Crossref

arXiv.org e-Print Archive

Preprint . 2024

Data sources: arXiv.org e-Print Archive

https://doi.org/10.2139/ssrn.4...

Article . 2024 . Peer-reviewed

Data sources: Crossref

https://dx.doi.org/10.48550/ar...

Article . 2024

License: CC BY

Data sources: Datacite

DBLP

Article . 2024

Data sources: DBLP

Publication type: Article, Preprint
Date: 01 Jan 2024
Embargo end date: 01 Jan 2024
Publisher: Elsevier BV
Journal: Transportation Research Part C: Emerging Technologies, volume 170, article 104897 (ISSN: 0968-090X)

Authors: Nicolas Salvadé; Tim Hillel

doi: 10.2139/ssrn.4701222 , 10.1016/j.trc.2024.104897 , 10.48550/arxiv.2401.11954

arXiv: 2401.11954

Abstract

This paper introduces the RUMBoost model, a novel discrete choice modelling approach that combines the interpretability and behavioural robustness of Random Utility Models (RUMs) with the generalisation and predictive ability of deep learning methods. We obtain the full functional form of non-linear utility specifications by replacing each linear parameter in the utility functions of a RUM with an ensemble of gradient boosted regression trees. This enables piece-wise constant utility values to be imputed for all alternatives directly from the data for any possible combination of input variables. We introduce additional constraints on the ensembles to ensure three crucial features of the utility specifications: (i) dependency of the utilities of each alternative on only the attributes of that alternative, (ii) monotonicity of marginal utilities, and (iii) an intrinsically interpretable functional form, where the exact response of the model is known throughout the entire input space. Furthermore, we introduce an optimisation-based smoothing technique that replaces the piece-wise constant utility values of alternative attributes with monotonic piece-wise cubic splines to identify non-linear parameters with defined gradient. We demonstrate the potential of the RUMBoost model compared to various ML and Random Utility benchmark models for revealed preference mode choice data from London. The results highlight the great predictive performance and the direct interpretability of our proposed approach. Furthermore, the smoothed attribute utility functions allow for the calculation of various behavioural indicators and marginal utilities. Finally, we demonstrate the flexibility of our methodology by showing how the RUMBoost model can be extended to complex model specifications, including attribute interactions, correlation within alternative error terms and heterogeneity within the population.
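The core idea in the abstract can be illustrated with a toy sketch. The code below is not the authors' implementation. It is a minimal pure-Python illustration under several assumptions: each alternative has a single attribute (for example travel time), each ensemble is built from depth-one regression stumps fitted to the negative gradient of the cross-entropy loss with a fixed learning rate, and monotonicity is enforced by rejecting any stump whose utility would increase with the attribute. All names (`fit_stump`, `boost`, `choice_probabilities`) and the data are hypothetical.

```python
import math

def softmax(utilities):
    """Multinomial logit choice probabilities from a list of utilities."""
    m = max(utilities)
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

def utility(ensemble, x):
    """Piece-wise constant utility: sum of stumps (threshold, left, right)."""
    return sum(left if x <= t else right for t, left, right in ensemble)

def choice_probabilities(ensembles, row):
    """Each alternative's utility depends only on its own attribute row[j]."""
    return softmax([utility(ens, row[j]) for j, ens in enumerate(ensembles)])

def fit_stump(xs, residuals):
    """Least-squares stump on one attribute, constrained to be non-increasing."""
    best = None
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        if lv < rv:
            continue  # split would violate the monotonicity constraint
        sse = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, (t, lv, rv))
    return best[1] if best else None

def boost(X, y, n_rounds=50, lr=0.3):
    """X[n][j]: attribute of alternative j; y[n]: index of the chosen alternative."""
    n_alt = len(X[0])
    ensembles = [[] for _ in range(n_alt)]
    for _ in range(n_rounds):
        probs = [choice_probabilities(ensembles, row) for row in X]
        for j in range(n_alt):
            xs = [row[j] for row in X]
            # Negative gradient of the cross-entropy loss w.r.t. utility j
            grad = [(1.0 if y[n] == j else 0.0) - probs[n][j] for n in range(len(X))]
            stump = fit_stump(xs, grad)
            if stump is not None:
                t, lv, rv = stump
                ensembles[j].append((t, lr * lv, lr * rv))
    return ensembles

# Hypothetical toy data: two alternatives, travel time in minutes,
# and the faster alternative is always chosen.
X = [[10, 30], [20, 15], [35, 20], [12, 40], [50, 25], [18, 45], [40, 10], [25, 30]]
y = [0, 1, 1, 0, 1, 0, 1, 0]
ensembles = boost(X, y)
print(choice_probabilities(ensembles, [10, 50]))
```

Because every accepted stump is non-increasing in its attribute, the summed utility of each alternative is also non-increasing, so the response of the model can be read off directly over the whole input range. The smoothing step described in the abstract (monotonic piece-wise cubic splines) is not covered by this sketch.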

Related Organizations

University College London
United Kingdom
UNIVERSITY COLLEGE LONDON, Bartlett School of Planning
United Kingdom

Impact by BIP!

- selected citations: 4. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Access routes: Green, hybrid

Related to Research communities

Transport Research