Improved sequential decision-making with structural priors: Enhanced treatment personalization with historical data
Personalizing treatment typically involves a trial period in which a patient with given characteristics tries different treatments from the available set until an optimal one is found. To minimize suffering and other costs, it is critical to keep this search short. When treatments have primarily short-term effects, the search can be performed with multi-armed bandit (MAB) algorithms. However, these typically require long exploration periods to guarantee optimality. With historical data, it is possible to recover a structure that encodes prior knowledge of the types of patients that can be encountered and the conditional reward models for those patient types. Such structural priors can be used to shorten the treatment exploration period, making the algorithms more applicable in the real world. This thesis presents work on designing MAB algorithms that find optimal treatments quickly by incorporating a structural prior for patient types in the form of a latent variable model. Theoretical guarantees for the algorithms, including a lower bound and a matching upper bound, are provided, together with an empirical study showing that incorporating latent structural priors is beneficial. Another line of work in this thesis is the design of simulators for evaluating treatment policies and comparing algorithms. A new simulator for benchmarking estimators of causal effects, the Alzheimer’s Disease Causal estimation Benchmark (ADCB), is presented. ADCB combines data-driven simulation with subject-matter knowledge for high realism and causal verifiability. The design of the simulator is discussed, and to demonstrate its utility, the results of a usage scenario for evaluating estimators of causal effects are outlined.
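To make the idea of a structural prior over patient types concrete, the following is a minimal illustrative sketch and not the algorithm developed in the thesis. It assumes that the latent variable model is already known: a hypothetical table `REWARD_PROBS` gives the success probability of each treatment for each patient type, and `TYPE_PRIOR` gives the prior over types. All names and numbers are invented for illustration. The learner keeps a belief over the patient's latent type, samples a type from that belief (a Thompson-sampling-style step), plays the treatment that is best for the sampled type, and updates the belief with Bayes' rule.

```python
import random

# Hypothetical structural prior: two patient types, three treatments.
# REWARD_PROBS[k][a] is the assumed probability that treatment a
# succeeds for a patient of latent type k (illustrative values only).
REWARD_PROBS = [
    [0.8, 0.3, 0.2],
    [0.2, 0.3, 0.9],
]
TYPE_PRIOR = [0.5, 0.5]


def update_posterior(posterior, arm, reward):
    """Bayes update of the belief over latent types after one binary reward."""
    likelihoods = [
        REWARD_PROBS[k][arm] if reward == 1 else 1.0 - REWARD_PROBS[k][arm]
        for k in range(len(posterior))
    ]
    unnormalized = [p * lik for p, lik in zip(posterior, likelihoods)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def choose_arm(posterior, rng):
    """Sample a latent type from the belief, then play that type's best arm."""
    k = rng.choices(range(len(posterior)), weights=posterior)[0]
    probs = REWARD_PROBS[k]
    return max(range(len(probs)), key=probs.__getitem__)


def run_episode(true_type, horizon, seed=0):
    """Simulate one patient of the given type and return the final belief."""
    rng = random.Random(seed)
    posterior = list(TYPE_PRIOR)
    for _ in range(horizon):
        arm = choose_arm(posterior, rng)
        reward = 1 if rng.random() < REWARD_PROBS[true_type][arm] else 0
        posterior = update_posterior(posterior, arm, reward)
    return posterior
```

Because the belief ranges over a small number of patient types rather than over every treatment's reward distribution, a few observations can be enough to concentrate it, which is the intuition behind the shorter exploration period. In practice the type-conditional reward models would be estimated from historical data rather than fixed by hand as in this sketch.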