Deep Reinforcement Learning for Solving Vehicle Routing Problems With Backhauls

Conghui Wang; Zhiguang Cao; Yaoxin Wu; Long Teng; Guohua Wu

IEEE Transactions on Neural Networks and Learning Systems

Article . 2024

License: taverne

Data sources: Eindhoven University of Technology Research Portal

IEEE Transactions on Neural Networks and Learning Systems

Article . 2025 . Peer-reviewed

License: IEEE Copyright

Data sources: Crossref

IEEE Transactions on Neural Networks and Learning Systems

Article

Data sources: Europe PubMed Central

Institutional Knowledge (InK) at Singapore Management University

Article . 2024

Data sources: Bielefeld Academic Search Engine (BASE)

Publication: Article, 01 Mar 2025
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Journal: IEEE Transactions on Neural Networks and Learning Systems, volume 36, pages 4779-4793 (ISSN: 2162-237X, eISSN: 2162-2388)

doi: 10.1109/tnnls.2024.3371781

pmid: 38551826

Abstract

The vehicle routing problem with backhauls (VRPB) is a challenging problem commonly studied in computer science and operations research. Featuring linehaul (delivery) and backhaul (pickup) customers, the VRPB has broad applications in real-world logistics. In this article, we propose a neural heuristic based on deep reinforcement learning (DRL) to solve the traditional and improved VRPB variants, with an encoder-decoder structured policy network trained to sequentially construct the routes for vehicles. Specifically, we first describe the VRPB on a graph and cast the solution construction as a Markov decision process (MDP). Then, to identify the relationships among the nodes (i.e., linehaul and backhaul customers, and the depot), we design a two-stage attention-based encoder, with a self-attention and a heterogeneous attention in each stage, which can yield more informative node representations and, in turn, high-quality solutions. The evaluation on the two VRPB variants shows that our neural heuristic performs favorably against both conventional and neural heuristic baselines on randomly generated instances and benchmark instances. Moreover, the trained policy network generalizes well to various problem sizes and distributions.
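To make the idea of sequential route construction as an MDP concrete, the following is a minimal sketch, not the authors' method. It assumes the classical VRPB rules (within a route, no linehaul customer is served after a backhaul customer; delivered and picked-up loads are each bounded by the vehicle capacity) and assumes every single demand fits in one vehicle. It does not enforce the classical ban on backhaul-only routes. The trained policy network is replaced by a plain nearest-feasible-neighbor rule, and all names (`feasible_actions`, `step`, `greedy_rollout`) are hypothetical.

```python
import math

DEPOT, LINEHAUL, BACKHAUL = 0, 1, 2

# State: (current node, visited customers, delivered load, picked-up load,
#         whether pickups have started on the current route).

def feasible_actions(state, kinds, demands, capacity):
    """Return the nodes that may be visited next (assumed masking rule)."""
    current, visited, delivered, picked, backhaul_started = state
    actions = []
    for node, kind in enumerate(kinds):
        if kind == DEPOT or node in visited:
            continue
        if kind == LINEHAUL:
            # Assumed precedence rule: no delivery once pickups have begun.
            if not backhaul_started and delivered + demands[node] <= capacity:
                actions.append(node)
        elif picked + demands[node] <= capacity:
            actions.append(node)
    if current != 0:
        actions.append(0)  # returning to the depot closes the route
    return actions

def step(state, node, kinds, demands):
    """Deterministic transition: visit `node` and update the loads."""
    current, visited, delivered, picked, backhaul_started = state
    if kinds[node] == DEPOT:
        return (0, visited, 0.0, 0.0, False)  # a fresh route starts
    visited = visited | {node}
    if kinds[node] == LINEHAUL:
        return (node, visited, delivered + demands[node], picked, False)
    return (node, visited, delivered, picked + demands[node], True)

def greedy_rollout(coords, kinds, demands, capacity):
    """Build routes node by node; a stand-in for a learned policy.

    Assumes node 0 is the depot and each demand is at most `capacity`.
    """
    state = (0, frozenset(), 0.0, 0.0, False)
    tour = [0]
    n_customers = sum(1 for k in kinds if k != DEPOT)
    while len(state[1]) < n_customers or state[0] != 0:
        actions = feasible_actions(state, kinds, demands, capacity)
        customers = [a for a in actions if a != 0]
        candidates = customers or actions  # go home only when forced
        nxt = min(candidates,
                  key=lambda a: math.dist(coords[state[0]], coords[a]))
        state = step(state, nxt, kinds, demands)
        tour.append(nxt)
    return tour
```

In a learned heuristic, the `min(...)` selection would be replaced by sampling from the policy network's output distribution restricted to the same feasible set, and the negative total tour length would serve as the reward.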

Related Organizations

Hong Kong Polytechnic University, China (People's Republic of)
Central South University, China (People's Republic of)
Technical University Eindhoven, Netherlands
Singapore Management University, Singapore
Eindhoven University of Technology, Netherlands

Keywords

Deep reinforcement learning, Databases and Information Systems, vehicle routing problem (VRP), logistics, OS and Networks, Reviews, Genetic algorithms, Vehicle routing, 004, neural heuristic, Backhaul networks, Deep reinforcement learning (DRL), Search problems, Heuristic algorithms, two-stage attention

Impact by BIP!

selected citations: 3. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

hybrid

Related to Research communities

Netherlands Research Portal