The departments' doctoral courses

Start dates and periodicity may vary between courses. See the details of each course for current information. To register, contact the respective course coordinator.


Decision making under uncertainty

  • Course code: FDAT070
  • ECTS credits: 7.5
  • Department: COMPUTER SCIENCE AND ENGINEERING
  • Graduate school: Computer Science and Engineering
  • Periodicity: Next start date decided by interest.
  • Language of instruction: The course will be given in English
  • Nordic Five Tech (N5T): The course is free of charge for doctoral students from N5T universities
Course description
This course gives a firm foundation in decision theory, mainly from a statistical but also from a philosophical perspective. The aim of the course is two-fold. Firstly, to give a thorough understanding of statistical decision theory, the meaning of hypothesis testing, automatic methods for designing and interpreting experiments, and the relation of statistical decision making to human decision making. Secondly, to relate the theory to practical problems in reinforcement learning and artificial intelligence.
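The core idea of statistical decision theory can be illustrated in a few lines: given a subjective belief over unknown states and a utility for each action-state pair, the rational choice maximises expected utility. The states, belief and utilities below are hypothetical numbers chosen only for illustration.

```python
# Choose the action maximising expected utility under a subjective belief.
states = ["rain", "sun"]
actions = ["umbrella", "no_umbrella"]
belief = {"rain": 0.3, "sun": 0.7}            # subjective probability over states
utility = {                                    # U(action, state)
    ("umbrella", "rain"): 1.0, ("umbrella", "sun"): 0.5,
    ("no_umbrella", "rain"): -1.0, ("no_umbrella", "sun"): 1.0,
}

def expected_utility(action):
    return sum(belief[s] * utility[(action, s)] for s in states)

best = max(actions, key=expected_utility)   # the Bayes-optimal action
```

Here taking the umbrella has expected utility 0.3·1.0 + 0.7·0.5 = 0.65, beating 0.3·(−1.0) + 0.7·1.0 = 0.4 for leaving it behind.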

The course can be divided into two parts. In the first part, we introduce probability and measure theory, the concepts of subjective probability and utility, and how they can be used to represent and solve decision problems. We then cover the estimation of unknown parameters, hypothesis testing and the interpretation of experiments. Finally, we discuss sequential sampling, sequential experiments and, more generally, sequential decision making. This covers the important problems of resource allocation, reinforcement learning and prediction with expert advice. These include applications such as bandwidth allocation in a networked system, robots that learn how to act in an unknown environment with only limited feedback, learning how to play games optimally, as well as many applications in finance.
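A classic instance of sequential decision making with limited feedback is the multi-armed bandit. As a sketch (not course material, and with arm probabilities and an exploration rate chosen arbitrarily), an epsilon-greedy agent can learn which of several Bernoulli arms pays best while only observing the rewards of the arms it pulls:

```python
import random

random.seed(0)
true_means = [0.2, 0.5, 0.8]      # hidden Bernoulli reward probabilities
counts = [0] * 3
values = [0.0] * 3                # running mean reward per arm
epsilon = 0.1                     # exploration rate

for t in range(5000):
    if random.random() < epsilon:
        arm = random.randrange(3)                      # explore
    else:
        arm = max(range(3), key=lambda a: values[a])   # exploit
    reward = 1.0 if random.random() < true_means[arm] else 0.0
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean

best_arm = max(range(3), key=lambda a: values[a])
```

After a few thousand rounds the value estimates identify the best arm, illustrating the exploration/exploitation trade-off at the heart of these problems.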

The second part focuses on recent research in decision making under uncertainty, in particular reinforcement learning and learning with expert advice. First, we examine a few representative statistical models. We then give an overview of algorithms for making optimal decisions using these models. Finally, we look at the problem of learning to act by following expert advice, a field that has recently found many applications in online advertising, game-tree search and optimisation.
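The basic algorithm for prediction with expert advice is the exponentially weighted average forecaster (Hedge). A minimal sketch, with a hypothetical loss sequence and learning rate: each expert's weight decays exponentially in its cumulative loss, so the forecaster's mixture shifts toward the better expert.

```python
import math

def hedge(expert_losses, eta=0.5):
    """expert_losses: one list of per-expert losses in [0, 1] per round.
    Returns the forecaster's cumulative expected loss and final mixture."""
    n = len(expert_losses[0])
    weights = [1.0] * n
    total_loss = 0.0
    for losses in expert_losses:
        w_sum = sum(weights)
        probs = [w / w_sum for w in weights]              # current mixture
        total_loss += sum(p * l for p, l in zip(probs, losses))
        weights = [w * math.exp(-eta * l)                 # exponential update
                   for w, l in zip(weights, losses)]
    return total_loss, probs

# Expert 1 is consistently better; the forecaster shifts weight to it.
rounds = [[1.0, 0.0]] * 20
loss, probs = hedge(rounds)
```

The forecaster's total loss stays within a small additive regret of the best expert's loss (here zero), which is the kind of guarantee studied in this part of the course.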

Prerequisites
Basic calculus and probability. Programming experience relating to numerical problems (any language will do).

Assessment
Continuous assessment, based on exercises, course participation and a project. The exercises require a mixture of basic applied mathematics (calculus and probability), programming and some thought. The project is organised as a competition: students form two-person teams, each team creates an environment and an agent, and everybody's agents are tested on everybody's environments.
Literature
Statistics 

De Groot, "Optimal Statistical Decisions".  
Gives a good overview of Bayesian statistics and decision theory, and an introduction to Markov decision processes from a statistical perspective. Most reinforcement learning problems can be reduced to MDPs of this type. 

Savage, "The foundations of statistics". 
This early work gives a more verbose (but rigorous) alternative to De Groot. Most basic ideas of statistical decision theory are laid out in this book, including minimax problems. 

Robert and Casella, "Monte Carlo Statistical Methods". 
Gives an excellent overview of Monte Carlo methods, including theory and many useful examples. 
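The simplest Monte Carlo method covered by such texts is estimating an expectation by sampling. As a toy sketch (the target quantity, π, is chosen only for illustration), the probability that a uniform random point in the unit square falls inside the quarter circle is π/4:

```python
import random

random.seed(0)
n = 100_000
inside = sum(1 for _ in range(n)
             if random.random() ** 2 + random.random() ** 2 <= 1.0)
pi_estimate = 4 * inside / n   # sample mean of the indicator, scaled by 4
```

The estimate's standard error shrinks as 1/sqrt(n), which is the basic convergence behaviour the book analyses in far more general settings.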


Reinforcement learning 

Puterman, "Markov Decision Processes". 
Examines various types of Markov decision processes and gives detailed proofs of basic dynamic programming algorithms, as well as important cases not treated elsewhere. 
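The basic dynamic programming algorithm analysed in this kind of text is value iteration, which repeatedly applies the Bellman optimality update until the value function converges. A minimal sketch on a hypothetical two-state MDP (action a moves deterministically to state a; only choosing action 1 from state 1 pays a reward):

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """P[s][a][t]: transition probabilities; R[s][a]: expected rewards."""
    n_states = len(P)
    V = [0.0] * n_states
    while True:
        # Bellman optimality update for every state
        V_new = [
            max(R[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(n_states))
                for a in range(len(P[s])))
            for s in range(n_states)
        ]
        if max(abs(u - v) for u, v in zip(V, V_new)) < tol:
            return V_new
        V = V_new

P = [[[1.0, 0.0], [0.0, 1.0]],   # from state 0: action a goes to state a
     [[1.0, 0.0], [0.0, 1.0]]]   # from state 1: likewise
R = [[0.0, 0.0],                 # state 0 pays nothing
     [0.0, 1.0]]                 # action 1 from state 1 pays 1
V = value_iteration(P, R)
```

Here the optimal policy stays in state 1, so V[1] = 1/(1 − 0.9) = 10 and V[0] = 0.9 · 10 = 9; the update is a γ-contraction, which is why the iteration converges.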

Bertsekas and Tsitsiklis, "Neuro-dynamic programming". 
This work connects basic MDP theory to reinforcement learning, in the case where we perform approximate dynamic programming via simulation. Many algorithms are described and analysed (in the limit). 

Sutton and Barto, "Reinforcement learning: an introduction". 
Gives an overview of the problem of reinforcement learning, and many important basic algorithms. Includes no analysis, but it is recommended for a first approach to the subject. 


General sequential learning problems 

Cesa-Bianchi and Lugosi, "Prediction, Learning and Games". 
Considers the case where we do not necessarily have an explicit, or estimated environment model, and where we may be acting against an adversarial environment. This book requires patience, but is very rewarding.
Lecturer
Christos Dimitrakakis
More information
E-mail: chrdimi at chalmers.se. Phone: 031-772 10 44

Published: Thu 14 Oct 2010. Modified: Wed 23 Aug 2017