Abstract
Model-based Bayesian Reinforcement Learning (BRL) provides a principled solution to dealing with the exploration-exploitation trade-off, but such methods typically assume a fully observable environment. The few Bayesian RL methods that are applicable in partially observable domains, such as the Bayes-Adaptive POMDP (BA-POMDP), scale poorly. To address this issue, we introduce the Factored BA-POMDP model (FBA-POMDP), a framework that is able to learn a compact model of the dynamics by exploiting the underlying structure of a POMDP. The FBA-POMDP framework casts the problem as a planning task, for which we adapt the Monte-Carlo Tree Search planning algorithm and develop a belief tracking method to approximate the joint posterior over the state and model variables. Our empirical results show that this method outperforms a number of BRL baselines and is able to learn efficiently when the factorization is known, as well as to learn both the factorization and the model parameters simultaneously.
Original language | English |
---|---|
Title of host publication | 18th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2019 |
Subtitle of host publication | Proceedings of the Eighteenth International Conference on Autonomous Agents and Multiagent Systems (AAMAS) |
Publisher | International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS) |
Pages | 7-15 |
Number of pages | 9 |
ISBN (Electronic) | 9781510892002 |
ISBN (Print) | 978-1-4503-6309-9 |
Publication status | Published - 2019 |
Event | AAMAS 2019: The 18th International Conference on Autonomous Agents and MultiAgent Systems - Montreal, Canada. Duration: 13 May 2019 → 17 May 2019 |
Publication series
Name | Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS |
---|---|
Volume | 1 |
ISSN (Print) | 1548-8403 |
ISSN (Electronic) | 1558-2914 |
Conference
Conference | AAMAS 2019 |
---|---|
Country/Territory | Canada |
City | Montreal |
Period | 13/05/19 → 17/05/19 |
Bibliographical note
Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project https://www.openaccess.nl/en/you-share-we-take-care
Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.
Keywords
- Bayes networks
- Bayesian Reinforcement Learning
- Monte-carlo tree search
- Markov chain Monte Carlo
- POMDPs