Title: Decouple and Balance: Approximation algorithms for hard bandit problems

Speaker: Sudipto Guha, UPenn

Abstract: In this talk we will discuss algorithms for optimizing the reward derived from a collection of restless Markovian bandits. Restless bandits generalize multi-armed bandit (MAB) environments by allowing the state of an arm to change even when the arm is not played. This feature by itself makes these problems PSPACE-hard to solve or approximate, and the existing well-known algorithmic ideas for MAB do not apply. In this talk we identify subclasses of restless bandits and provide techniques for developing the first known near-optimal strategies. Interestingly, although the analysis uses linear programs, the actual algorithms are based on simple binary searches. We will also show how the same analysis technique carries over to other pesky variants of MAB, including those with switching costs, and yields the first provably near-optimal strategies for them as well. A fascinating outcome of the entire set of ideas is that they allow us to analyze the performance of (just ever so slightly modified) well-known index-based algorithms which are widely used in practice, such as those of Gittins and Whittle.

This talk is based on joint work with Kamesh Munagala and Peng Shi (Duke Univ.).

An extended and recently revised version of the restless bandits paper is at http://arxiv.org/abs/0711.3861
The other paper is at http://www.springerlink.com/content/t423223163483083/
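
To give a flavor of the binary-search idea mentioned in the abstract, here is a minimal sketch (not taken from the papers above) of computing a Whittle-style index for a single restless arm by bisecting on a passivity subsidy. The function prefers_active is a hypothetical stand-in for an oracle that solves the single-arm MDP under a given subsidy and reports whether playing the arm remains optimal in the given state; the search assumes the arm is indexable, i.e., the preference for playing is monotone in the subsidy.

    def whittle_index(state, prefers_active, lo=0.0, hi=1.0, tol=1e-6):
        """Binary search for the smallest subsidy at which resting the arm
        becomes optimal in `state`; that threshold is the arm's index."""
        # Grow the upper bracket until the subsidy is large enough to
        # make resting optimal.
        while prefers_active(state, hi):
            lo, hi = hi, 2.0 * hi
        # Standard bisection on the (assumed monotone) preference boundary.
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if prefers_active(state, mid):
                lo = mid   # subsidy too small: arm is still worth playing
            else:
                hi = mid   # subsidy large enough: resting is optimal
        return 0.5 * (lo + hi)

A scheduler would compute this index for each arm's current state and play the arm(s) with the largest indices; the point of the sketch is only that the per-arm computation reduces to a simple one-dimensional binary search.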