Journal articles

On the Whittle index of Markov Modulated Restless Bandits

Abstract : In this paper we study a Multi-Armed Restless Bandit Problem (MARBP) subject to time fluctuations. This model has numerous applications in practice, like in cloud computing systems or in wireless communications networks. Each bandit is formed by two processes: a controllable process and an environment. The transition rates of the controllable process are determined by the state of the environment, which is an exogenous Markov process. The decision maker has full information on the state of every bandit, and the objective is to determine the optimal policy that minimises the long-run average cost. Given the complexity of the problem, we set out to characterise the Whittle index, which is obtained by solving a relaxed version of the MARBP. As reported in the literature, this heuristic performs extremely well for a wide variety of problems. Assuming that the optimal policy of the relaxed problem is of threshold type, we provide an algorithm that finds Whittle's index. We then consider a multi-class queue with linear cost and impatient customers. For this model, we show threshold optimality, prove indexability, and obtain Whittle's index in closed-form. We also study the limiting regimes in which the environment is relatively slower and faster than the controllable process. By numerical simulations, we assess the suboptimality of Whittle's index policy in a wide variety of scenarios, and the general observation is that, as in the case of standard MARBP, the suboptimality gap of Whittle's index policy is small.
Contributor : Urtzi Ayesta
Submitted on : Friday, February 18, 2022 - 10:16:12 AM
Last modification on : Tuesday, October 25, 2022 - 11:58:10 AM
Long-term archiving on : Thursday, May 19, 2022 - 6:30:55 PM





Santiago Guillermo Duran, Urtzi Ayesta, Ina Maria Maaike Verloop. On the Whittle index of Markov Modulated Restless Bandits. Queueing Systems, 2022, pp.1-55. ⟨10.1007/s11134-022-09737-y⟩. ⟨hal-03579521⟩


