Hierarchical Reinforcement Learning And Game-Theoretic Model For Stochastic Last-Mile Delivery With Crowdshipping
Abstract
The dynamic crowdshipping vehicle routing problem is challenging because crowd drivers
and orders arrive dynamically, and each driver's capacity and time availability change
over time. In the e-commerce context, the problem must also account for the company's
in-house drivers, who are the main source of delivery capacity. The thesis applies a
hierarchical reinforcement learning method that combines an upper-level agent for
balancing the opportunities and risks of delayed batch matching, a Nash game
for classifying crowdsourced and company orders, and lower-level agents for route
planning. The model achieves a 10% improvement over recent solutions, which indicates
that the approach is effective and promising for further development.