TY - JOUR
T1 - Integrating state representation learning into deep reinforcement learning
AU - de Bruin, Tim
AU - Kober, Jens
AU - Tuyls, Karl
AU - Babuska, Robert
N1 - Green Open Access added to TU Delft Institutional Repository as part of the Taverne project 'You share, we take care!' (https://www.openaccess.nl/en/you-share-we-take-care).
Otherwise, as indicated in the copyright section: the publisher is the copyright holder of this work, and the author uses Dutch legislation to make this work public.
PY - 2018
Y1 - 2018
N2 - Most deep reinforcement learning techniques are unsuitable for robotics, as they require too much interaction time to learn useful, general control policies. This problem can be largely attributed to the fact that a state representation needs to be learned as a part of learning control policies, which can only be done through fitting expected returns based on observed rewards. While the reward function provides information on the desirability of the state of the world, it does not necessarily provide information on how to distill a good, general representation of that state from the sensory observations. State representation learning objectives can be used to help learn such a representation. While many of these objectives have been proposed, they are typically not directly combined with reinforcement learning algorithms. We investigate several methods for integrating state representation learning into reinforcement learning. In these methods, the state representation learning objectives help regularize the state representation during the reinforcement learning, and the reinforcement learning itself is viewed as a crucial state representation learning objective and allowed to help shape the representation. Using autonomous racing tests in the TORCS simulator, we show how the integrated methods quickly learn policies that generalize to new environments much better than deep reinforcement learning without state representation learning.
KW - Deep learning in robotics and automation
KW - learning and adaptive systems
KW - sensor fusion
DO - 10.1109/LRA.2018.2800101
M3 - Article
SN - 2377-3766
VL - 3
SP - 1394
EP - 1401
JO - IEEE Robotics and Automation Letters
JF - IEEE Robotics and Automation Letters
IS - 3
ER -