This repository contains the code and data for the paper “Dynamic Vehicle Routing Problem with Prompt Confirmation of Advance Requests”, published at the 17th ACM/IEEE International Conference on Cyber-Physical Systems (ICCPS 2026).
arXiv version: https://arxiv.org/abs/2603.07422
Transit agencies that operate on-demand transportation services have to respond to trip requests from passengers in real time, which involves solving dynamic vehicle routing problems with pick-up and drop-off constraints. Based on discussions with public transit agencies, we observe a real-world problem that is not addressed by prior work: when trips are booked in advance (e.g., trip requests arrive a few hours in advance of their requested pick-up times), the agency needs to promptly confirm whether a request can be accepted or not, and ensure that accepted requests are served as promised. State-of-the-art computational approaches either provide prompt confirmation but lack the ability to continually optimize and improve routes for accepted requests, or they provide continual optimization but cannot guarantee serving all accepted requests. To address this gap, we introduce a novel problem formulation of dynamic vehicle routing with prompt confirmation and continual optimization. We propose a novel computational approach for this vehicle routing problem, which integrates a quick insertion search for prompt confirmation with an anytime algorithm for continual optimization. To maximize the number of requests served, we train a non-myopic objective function using reinforcement learning, which guides both the insertion and the anytime algorithms towards optimal, non-myopic solutions. We evaluate our computational approach on a real-world microtransit dataset from a public transit agency in the U.S., demonstrating that our proposed approach provides prompt confirmation while significantly increasing the number of requests served compared to existing approaches.
During service: (1) requests arrive following a known distribution (e.g., based on historical data); (2) upon the arrival of each request, we decide whether to accept or reject it given the current state of the service; (3) we promptly notify the passenger of our decision; and (4) until the next request arrives, we continually optimize route plans to better accommodate future requests.
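The service loop described above can be sketched as follows. This is an illustrative Python sketch, not the repository's actual API: `feasible_insertion` and `anytime_improve` are hypothetical stand-ins for the quick insertion search and the anytime optimization, respectively.

```python
def run_service(requests, routes, feasible_insertion, anytime_improve):
    """Illustrative event loop for dynamic VRP with prompt confirmation.

    `requests` is a list of (arrival_time, request) pairs sorted by arrival
    time; `feasible_insertion` returns an updated route plan or None if the
    request cannot be served; `anytime_improve` refines the plan until the
    next arrival. All names are hypothetical sketches of the approach above.
    """
    accepted, rejected = [], []
    for i, (t, req) in enumerate(requests):
        plan = feasible_insertion(routes, req)   # quick insertion search
        if plan is not None:
            routes = plan
            accepted.append(req)                 # prompt confirmation: accept
        else:
            rejected.append(req)                 # prompt confirmation: reject
        # continual optimization until the next request arrives
        next_t = requests[i + 1][0] if i + 1 < len(requests) else None
        routes = anytime_improve(routes, deadline=next_t)
    return accepted, rejected, routes
```

Accepted requests are never dropped later: the anytime improvement only rearranges routes among already-accepted requests.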
*Figure: Distribution of request rejection rates across five different episodes from the real-world microtransit data.*

*Figure: Distribution of request rejection rates across five different episodes from the NYC taxi data.*
- Create a virtual environment using Conda:

  ```shell
  conda create -n ovwab_env python=3.11
  ```

- Activate the environment:

  ```shell
  conda activate ovwab_env
  ```

- Install the required libraries from the `requirements.txt` file (configured with `keras==3.6.0` and `torch==2.5.0`):

  ```shell
  pip install -r requirements.txt --no-cache-dir
  ```

- Install `graphviz` using `conda install graphviz` to save the model images.
- The environment variable `KERAS_BACKEND` is configured at runtime by the scripts using `importlib`.
- Make sure to install one of the `KERAS_BACKEND` frameworks before executing any of the following sample commands.
- To run the MC-VRP baseline (Wilbur et al. 2022), please install `julia` (preferably an LTS version) and install the dependencies from `baselines/mcvrp/julia_dependencies.jl` (a copy of https://github.com/smarttransit-ai/iccps-2022-paratransit-public/blob/main/bin/julia_packages.jl).
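The runtime backend selection mentioned above can be approximated with a short sketch. This is an assumption about how such selection might work (the function name and preference order are hypothetical); the repository's scripts may differ in detail. The key constraint is that `KERAS_BACKEND` must be set before `import keras` runs.

```python
import importlib.util
import os

def select_keras_backend(preferred=("torch", "tensorflow", "jax")):
    """Pick the first installed backend and export KERAS_BACKEND.

    Hypothetical sketch of runtime backend configuration: probe each
    candidate with importlib and set the environment variable before
    Keras is imported anywhere in the process.
    """
    for backend in preferred:
        if importlib.util.find_spec(backend) is not None:
            os.environ["KERAS_BACKEND"] = backend
            return backend
    raise RuntimeError("No Keras backend (torch/tensorflow/jax) is installed.")
```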
- `baselines/mcvrp` - Julia implementation of the MC-VRP baseline (Wilbur et al. 2022)
- `baselines/rh` - slightly modified version of the originally published code (https://github.com/MAS-Research/RollingHorizon) (Kim et al. 2023)
- `common` - common functions used across the codebase
- `env` - implementation of the environment used in training and evaluation
- `learn` - implementation of the learning module
- `model` - implementation of feature-vector generation and the neural network models
Please run the following script to generate the travel-time matrices (for the different solvers) and another serialized Python object, both of which are required for efficient solving:

```shell
python generate_matrix_and_processed_data.py
```
```shell
python main.py --execution_mode gather-experience --objective rl-custom --insertion_approach exhaustive --data_dir ../data --data_id MTD --save_experiences
```

- `data_id`: `MTD` for microtransit data, `NYC` for NYC taxi data, `ABS` for abstract problem instances.
```shell
python main.py --execution_mode offline-train --load_experiences --experience_dir sample_experiences --learning_rate 0.001 --batch_size 1024 --model_arch mlp --model_config_version 0 --model_version 0 --use_gpu
```

```shell
python main.py --execution_mode offline-train --load_experiences --experience_dir sample_experiences --learning_rate 0.001 --batch_size 1024 --model_arch kan --model_config_version 0 --model_version 0 --use_gpu
```

```shell
python main.py --execution_mode offline-train --load_experiences --experience_dir sample_experiences --learning_rate 0.001 --batch_size 1024 --model_arch cnn --model_config_version 0 --model_version 0 --use_gpu
```

Note: KAN (`model_arch: kan`) is supported only with the PyTorch backend for offline training; the other architectures work with both the TensorFlow and PyTorch backends.
- `experience_dir`: absolute path or path relative to the `code` root; the experiences directory should be organized as `experience_dir/train` (train experiences) and `experience_dir/test` (test experiences)
- `learning_rate`: learning rate of the Adam optimizer
- `batch_size`: batch size passed to the `model.fit()` function
- `model_arch`: model architecture: `mlp` (Multi-Layer Perceptron), `kan` (Kolmogorov–Arnold Network), or `cnn` (Convolutional Neural Network)
- `model_config_version`: 0 is valid for all three model architectures; 1 is valid only for `mlp` and `cnn`
- `model_version`: for `mlp` (0-116), for `cnn` (0-71), and for `kan` (0-89) (model architecture without a dropout)
- `feature_code`: for `mlp` and `kan`: `cia`, `cia-cac`, and `cac`; for `cnn`: `cia` only
- `look_ahead_window_size`: any value greater than 0 (ideal values: 30, 60, ...); divides the look-ahead horizon into windows of `look_ahead_window_size` seconds
- `look_ahead_grid_size`: any value greater than 0 (ideal values: 1, 2, 3); divides the city into a `look_ahead_grid_size` by `look_ahead_grid_size` grid
- `look_ahead_slot_count`: range indicating the number of consecutive slots to consider (i.e., from 1 to `look_ahead_slot_count`)
- `look_ahead_slot_step_size`: minimum slot size
- `number_of_routes_to_consider`: up to how many routes to consider
- `use_gpu`: pass this flag to use the GPU; otherwise omit it (default: CPU)
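To make the look-ahead parameters concrete, the sketch below bins hypothetical future requests into (time window, grid cell) counts. This is illustrative only: the function name and signature are invented here, and the repository's actual feature construction lives in the `model` directory.

```python
def lookahead_counts(requests, horizon, window_size, grid_size, bounds):
    """Count requests per (time window, grid row, grid col).

    `requests` is a list of (t, x, y) tuples; `horizon` and `window_size`
    are in seconds; `bounds` is (xmin, xmax, ymin, ymax) for the city.
    Illustrative sketch of how the look-ahead horizon and city grid
    described above partition future demand.
    """
    n_windows = horizon // window_size
    xmin, xmax, ymin, ymax = bounds
    counts = [[[0] * grid_size for _ in range(grid_size)]
              for _ in range(n_windows)]
    for t, x, y in requests:
        w = int(t // window_size)
        if not 0 <= w < n_windows:
            continue  # outside the look-ahead horizon
        gx = min(int((x - xmin) / (xmax - xmin) * grid_size), grid_size - 1)
        gy = min(int((y - ymin) / (ymax - ymin) * grid_size), grid_size - 1)
        counts[w][gx][gy] += 1
    return counts

# e.g., a 120 s horizon split into 60 s windows over a 2x2 unit-square grid:
# lookahead_counts([(10, 0.2, 0.8), (70, 0.9, 0.1)], 120, 60, 2, (0, 1, 0, 1))
```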
Please refer to best_model_combo.csv for more details. When evaluating, pass the corresponding parameters.
For example,
```
model_arch,model_dir,look_ahead_slot_count,look_ahead_slot_step_size,look_ahead_grid_size,number_of_routes_to_consider,feature_code,model_version
mlp,../model/MTD/MLP,1,2,1,1,cia-cac,48
```
In this case, add the following arguments to the evaluation commands:
```
--model_arch mlp --model_dir ../model/MTD/MLP --look_ahead_slot_count 1 --look_ahead_slot_step_size 2 --look_ahead_grid_size 1 --number_of_routes_to_consider 1 --feature_code cia-cac --model_version 48
```
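To avoid transcribing these values by hand, a row of `best_model_combo.csv` can be turned into the corresponding CLI flags with a short helper. This is a sketch under the assumption that the flag names mirror the CSV column names exactly, as in the example above; the helper function itself is not part of the repository.

```python
import csv
import io

def combo_row_to_args(row):
    """Map a best_model_combo.csv row (as a dict) to evaluation CLI flags.

    Assumes each CSV column name matches its command-line flag, as in the
    header shown above. Hypothetical convenience helper, not repo code.
    """
    return " ".join(f"--{key} {value}" for key, value in row.items())

# Example using the row shown above:
header = ("model_arch,model_dir,look_ahead_slot_count,look_ahead_slot_step_size,"
          "look_ahead_grid_size,number_of_routes_to_consider,feature_code,model_version")
line = "mlp,../model/MTD/MLP,1,2,1,1,cia-cac,48"
row = next(csv.DictReader(io.StringIO(header + "\n" + line)))
print(combo_row_to_args(row))
```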
- `sample_idx`: can take the values 40, 41, 42, 43, 44

When evaluating using a trained model, restrict the input arguments (i.e., `model_arch`, `model_config_version`, `model_version`, `model_dir`) to these values.
- Running our trained policy with the anytime algorithm:

  ```shell
  python main.py --execution_mode evaluation-best --objective rl-custom --search_objective rl-custom --insertion_approach exhaustive --search_approach sim_anneal --data_dir ../data --data_id MTD --sample_idx 40 --perform_search --model_config_version 0 --model_arch mlp --model_dir ../model/MTD/MLP --look_ahead_slot_count 1 --look_ahead_slot_step_size 2 --look_ahead_grid_size 1 --number_of_routes_to_consider 1 --feature_code cia-cac --model_version 48
  ```
- Running our trained policy with the anytime algorithm (but maximizing idle time):

  ```shell
  python main.py --execution_mode evaluation-best --objective rl-custom --search_objective idle-time --insertion_approach exhaustive --search_approach sim_anneal --data_dir ../data --data_id MTD --sample_idx 40 --perform_search --model_config_version 0 --model_arch mlp --model_dir ../model/MTD/MLP --look_ahead_slot_count 1 --look_ahead_slot_step_size 2 --look_ahead_grid_size 1 --number_of_routes_to_consider 1 --feature_code cia-cac --model_version 48
  ```
- Running our trained policy with the anytime algorithm (with insertion controlled by maximizing idle time when selecting an insertion spot):

  ```shell
  python main.py --execution_mode evaluation-best --objective rl-custom --search_objective rl-custom --use_ve_only_at_decision --insertion_approach exhaustive --search_approach sim_anneal --data_dir ../data --data_id MTD --sample_idx 40 --perform_search --model_config_version 0 --model_arch mlp --model_dir ../model/MTD/MLP --look_ahead_slot_count 1 --look_ahead_slot_step_size 2 --look_ahead_grid_size 1 --number_of_routes_to_consider 1 --feature_code cia-cac --model_version 48
  ```
- Running our trained policy with the anytime algorithm (with a fixed search duration between request arrivals):

  ```shell
  python main.py --execution_mode evaluation-best --objective rl-custom --search_objective rl-custom --insertion_approach exhaustive --search_approach sim_anneal --data_dir ../data --data_id MTD --sample_idx 40 --perform_search --fixed_search --fixed_search_duration 10 --model_config_version 0 --model_arch mlp --model_dir ../model/MTD/MLP --look_ahead_slot_count 1 --look_ahead_slot_step_size 2 --look_ahead_grid_size 1 --number_of_routes_to_consider 1 --feature_code cia-cac --model_version 48
  ```
- Running our trained policy without the anytime algorithm:

  ```shell
  python main.py --execution_mode evaluation-best --objective rl-custom --search_objective rl-custom --insertion_approach exhaustive --search_approach sim_anneal --data_dir ../data --data_id MTD --sample_idx 40 --model_config_version 0 --model_arch mlp --model_dir ../model/MTD/MLP --look_ahead_slot_count 1 --look_ahead_slot_step_size 2 --look_ahead_grid_size 1 --number_of_routes_to_consider 1 --feature_code cia-cac --model_version 48
  ```
- Running our simple policy with the anytime algorithm:

  ```shell
  python main.py --execution_mode evaluation --objective idle-time --search_objective idle-time --insertion_approach exhaustive --search_approach sim_anneal --data_dir ../data --data_id MTD --sample_idx 40 --perform_search
  ```
- Running our simple policy with the anytime algorithm (with a fixed search duration between request arrivals):

  ```shell
  python main.py --execution_mode evaluation --objective idle-time --search_objective idle-time --insertion_approach exhaustive --search_approach sim_anneal --data_dir ../data --data_id MTD --sample_idx 40 --perform_search --fixed_search --fixed_search_duration 10
  ```
- Running our simple policy without the anytime algorithm:

  ```shell
  python main.py --execution_mode evaluation --objective idle-time --search_objective idle-time --insertion_approach exhaustive --search_approach sim_anneal --data_dir ../data --data_id MTD --sample_idx 40
  ```
- Running Google OR-Tools with the anytime algorithm:

  ```shell
  python main.py --execution_mode evaluation --objective idle-time --search_objective idle-time --insertion_approach routing --search_approach routing --data_dir ../data --data_id MTD --sample_idx 40 --perform_search
  ```
- Running Google OR-Tools without the anytime algorithm:

  ```shell
  python main.py --execution_mode evaluation --objective idle-time --search_objective idle-time --insertion_approach routing --search_approach routing --data_dir ../data --data_id MTD --sample_idx 40
  ```
- Running the Rolling Horizon baseline (Kim et al. 2023):
  - Before executing, create an empty conda environment called `python_update`.
  - The code is based on `MOSEK` (a license is required to run the experiments).
On an M-series MacBook:

```shell
python main.py --execution_mode evaluation --insertion_approach rolling-horizon --data_dir ../data --data_id MTD --sample_idx 40 --rtv_rh 1 --rtv_interval 60 --rtv_time_limit 1 --rtv_bin_name rolling_horizon_m1
```

On Linux:

```shell
python main.py --execution_mode evaluation --insertion_approach rolling-horizon --data_dir ../data --data_id MTD --sample_idx 40 --rtv_rh 1 --rtv_interval 60 --rtv_time_limit 1 --rtv_bin_name rolling_horizon_linux
```
- `rtv_rh`: value of the rolling horizon factor
- `rtv_interval`: time period during which the set of requests is considered a single batch
- `rtv_time_limit`: number of seconds that the RTV graph generator can spend on each vehicle
- `rtv_bin_name`: for M-series Macs choose `rolling_horizon_m1`; for Linux choose `rolling_horizon_linux`
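The `rtv_interval` batching can be illustrated with a small sketch: all requests arriving within the same interval are handed to the solver as one batch. This is a hedged illustration of the idea, not the baseline's actual implementation; the function name is invented here.

```python
from collections import defaultdict

def batch_requests(arrival_times, interval=60):
    """Group request arrival times (in seconds) into consecutive batches.

    Mirrors the idea behind `rtv_interval`: requests arriving within the
    same `interval`-second window are solved together. Illustrative only.
    """
    batches = defaultdict(list)
    for t in sorted(arrival_times):
        batches[int(t // interval)].append(t)
    return [batches[k] for k in sorted(batches)]
```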
- Running the MC-VRP baseline (Wilbur et al. 2022):

MTD dataset:

```shell
julia baselines/mcvrp/iccps_baseline_MTD.jl 1 5 25 1.0
```
- The first argument is the start index of the test set (please choose values from 1 to 5).
- The second argument is the end index of the test set (please choose values from 1 to 5).
- The third argument is the number of train chains (please choose values from 1 to 25).
- The fourth argument is the compute time at each decision epoch (please choose values starting from 1.0).
NYC dataset:

```shell
julia baselines/mcvrp/iccps_baseline_NYC.jl 1 5 25 1.0
```
- The first argument is the start index of the test set (please choose values from 1 to 5).
- The second argument is the end index of the test set (please choose values from 1 to 5).
- The third argument is the number of train chains (please choose values from 1 to 25).
- The fourth argument is the compute time at each decision epoch (please choose values starting from 1.0).
- Kim, Y., Edirimanna, D., Wilbur, M., Pugliese, P., Laszka, A., Dubey, A., & Samaranayake, S. (2023). Rolling horizon based temporal decomposition for the offline pickup and delivery problem with time windows. In Proceedings of the 37th AAAI Conference on Artificial Intelligence (AAAI 2023).
- Wilbur, M., Kadir, S. U., Kim, Y., Pettet, G., Mukhopadhyay, A., Pugliese, P., Samaranayake, S., Laszka, A., & Dubey, A. (2022). An online approach to solve the dynamic vehicle routing problem with stochastic trip requests for paratransit services. In Proceedings of the 13th ACM/IEEE International Conference on Cyber-Physical Systems (ICCPS 2022).


