Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review
Reinforcement Learning (RL) deals with problems that can be modeled as a Markov Decision Process (MDP) whose transition function is unknown. In situations where an arbitrary policy π is already being executed and the experiences with the environment have been recorded in a batch D, an RL algorithm can use D to compute a new policy π′. However, the policy computed by traditional RL algorithms may perform worse than π. Our goal is to develop safe RL algorithms, where the agent has high confidence that the performance of π′ is better than that of π, given D. To develop sample-efficient and safe RL algorithms, we combine ideas from exploration strategies in RL with a safe policy improvement method.
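The high-confidence improvement test described in the abstract can be sketched in a few lines. This is an illustrative sketch, not the authors' algorithm: it assumes episodes recorded in a batch D, ordinary importance sampling to evaluate the candidate policy π′ off-policy, and a Hoeffding lower bound on estimates normalized to a known range. The policy interface (a callable mapping an action and state to a probability) is a hypothetical API chosen for the example.

```python
import math

def importance_weighted_returns(batch, pi_new, pi_old):
    """Per-episode importance-sampling estimates of pi_new's return,
    using episodes in `batch` collected under the behavior policy pi_old.
    Each episode is a list of (state, action, reward) tuples; pi_new and
    pi_old map (action, state) to an action probability (hypothetical API)."""
    estimates = []
    for episode in batch:
        weight, ret = 1.0, 0.0
        for state, action, reward in episode:
            weight *= pi_new(action, state) / pi_old(action, state)
            ret += reward
        estimates.append(weight * ret)
    return estimates

def hoeffding_lower_bound(samples, delta, value_range=1.0):
    """(1 - delta)-confidence lower bound on the mean, assuming each
    sample lies in an interval of width `value_range`."""
    n = len(samples)
    mean = sum(samples) / n
    return mean - value_range * math.sqrt(math.log(1.0 / delta) / (2.0 * n))

def is_safe_improvement(candidate_estimates, baseline_performance, delta=0.05):
    """Deploy the candidate policy only if, with confidence 1 - delta,
    its estimated performance exceeds the baseline's; otherwise keep pi."""
    return hoeffding_lower_bound(candidate_estimates, delta) > baseline_performance
```

In practice, plain importance-weighted estimates can have a very large range, which makes the Hoeffding bound loose; safe policy improvement methods therefore rely on tighter off-policy estimators and concentration inequalities, and the paper's contribution is to combine such a safety test with exploration strategies for sample efficiency.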
Original language | English |
---|---|
Title of host publication | Proceedings of the 28th International Joint Conference on Artificial Intelligence, IJCAI 2019 |
Editors | Sarit Kraus |
Publisher | International Joint Conferences on Artificial Intelligence (IJCAI) |
Pages | 6460-6461 |
Number of pages | 2 |
ISBN (Electronic) | 978-0-9992411-4-1 |
DOIs | |
Publication status | Published - 2019 |
Event | IJCAI 2019: 28th International Joint Conference on Artificial Intelligence, Macao, China. Duration: 10 Aug 2019 → 16 Aug 2019 |
Name | IJCAI International Joint Conference on Artificial Intelligence |
---|---|
Volume | 2019-August |
ISSN (Print) | 1045-0823 |
Conference | IJCAI 2019 |
---|---|
Country/Territory | China |
City | Macao |
Period | 10/08/19 → 16/08/19 |