Efficient UCB-Based Assignment Algorithm Under Unknown Utility with Application in Mentor-Mentee Matching
Pub. online: 24 September 2024
Type: Statistical Data Science
Open Access
†
This paper is part of the first author’s PhD dissertation at Georgia Institute of Technology.
Received
18 June 2024
18 June 2024
Accepted
22 August 2024
22 August 2024
Published
24 September 2024
24 September 2024
Abstract
The assignment problem, crucial in various real-world applications, involves optimizing the allocation of agents to tasks for maximum utility. While it has been well-studied in the optimization literature when the underlying utilities between all agent-task pairs are known, research is sparse when the utilities are unknown and need to be learned from data on the fly. This paper addresses this gap, as motivated by mentor-mentee matching programs at many U.S. universities. We develop an efficient sequential assignment algorithm, with the aim of nearly maximizing the overall utility simultaneously over different time periods. Our proposed algorithm is to use stochastic bandit feedback to adaptively estimate the unknown utilities through linear regression models, integrating the Upper Confidence Bound (UCB) algorithm in the multi-armed bandit problem with the Hungarian algorithm in the assignment problem. We provide theoretical bounds of our algorithm for both the estimation error and the total regret. Additionally, numerical studies are also conducted to demonstrate the practical effectiveness of our algorithm.
Supplementary material
Supplementary MaterialThe supplementary materials online includes: Proofs of Lemmas and Theorems used in the paper, and Python code needed to reproduce the results.