Generalized Optimistic Q-Learning with Provable Efficiency

Neustroev, G. (Speaker)

Algorithmics

Activity: Talk or presentation › Talk or presentation at a conference

Description

Reinforcement learning (RL), like any online learning method, inevitably faces the exploration-exploitation dilemma. A learning algorithm is called sample-efficient when it needs as few data samples as possible, and the design of such algorithms is an important area of research. Interestingly, all currently known provably efficient model-free RL algorithms rely on the same well-known principle: optimism in the face of uncertainty. We unite these existing algorithms into a single general framework for model-free optimistic RL. We show how this framework facilitates the design of new optimistic model-free RL algorithms by simplifying the analysis of their efficiency. Finally, we propose one such new algorithm and demonstrate its performance in an experimental study.
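
As a rough illustration of optimism in the face of uncertainty, the sketch below shows a tabular, episodic Q-learning loop in which every Q-value starts at its largest possible value and each update adds an exploration bonus that shrinks as a state-action pair is visited more often. It is not the algorithm or framework presented in the talk. The function names, the toy chain environment, the learning rate, and the form of the bonus are all assumptions chosen for illustration.

```python
import math
import random


def chain_step(state, action, rng, n_states=3):
    """Hypothetical toy environment: action 1 moves right, action 0 stays.

    Reward is 1 when the move lands on the last state, otherwise 0.
    `rng` is unused here because the chain is deterministic.
    """
    if action == 1:
        next_state = min(state + 1, n_states - 1)
    else:
        next_state = state
    reward = 1.0 if (action == 1 and next_state == n_states - 1) else 0.0
    return reward, next_state


def optimistic_q_learning(n_states, n_actions, horizon, step, episodes,
                          c=0.1, seed=0):
    """Tabular episodic Q-learning with a count-based exploration bonus.

    Assumes rewards lie in [0, 1], so no return can exceed `horizon`.
    """
    rng = random.Random(seed)
    H = horizon
    # Optimistic initialization: start every estimate at the maximum return H.
    Q = [[[float(H)] * n_actions for _ in range(n_states)] for _ in range(H)]
    N = [[[0] * n_actions for _ in range(n_states)] for _ in range(H)]

    for _ in range(episodes):
        s = 0  # every episode starts in state 0
        for h in range(H):
            # Act greedily with respect to the optimistic estimates.
            a = max(range(n_actions), key=lambda b: Q[h][s][b])
            r, s_next = step(s, a, rng)

            N[h][s][a] += 1
            t = N[h][s][a]
            alpha = (H + 1) / (H + t)          # assumed learning-rate schedule
            bonus = c * math.sqrt(H ** 3 / t)  # assumed bonus, decays with visits

            # Value of the next state; zero after the final step of the episode.
            v_next = 0.0 if h == H - 1 else min(float(H), max(Q[h + 1][s_next]))
            target = r + v_next + bonus
            # Clip at H so the estimate never exceeds the largest possible return.
            Q[h][s][a] = min(float(H), (1 - alpha) * Q[h][s][a] + alpha * target)
            s = s_next

    return Q, N


Q, N = optimistic_q_learning(n_states=3, n_actions=2, horizon=3,
                             step=chain_step, episodes=50)
```

Because untried actions keep their optimistic initial value, the greedy policy is drawn toward them until enough visits have shrunk the bonus, which is how exploration arises without any explicit randomization.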

Period	11 May 2020
Event title	AAMAS 2020: The 19th International Conference on Autonomous Agents and Multi-Agent Systems
Event type	Conference
Conference number	19th
Location	Auckland, New Zealand
Degree of Recognition	International

Documents & Links

Virtual/online event due to COVID-19