Autonomous guidance and navigation problems often involve high-dimensional spaces and multiple objectives, and consequently a large number of states and actions, a difficulty known as the ‘curse of dimensionality’. Furthermore, systems often have only partial observability rather than perfect perception of their environment. Recent research has sought to address these problems with Hierarchical Reinforcement Learning, which typically applies the same or similar reinforcement learning methods throughout one application so that multiple objectives can be combined. However, no single learning method benefits all objectives. To acquire optimal decision-making as efficiently as possible, this paper proposes a hybrid Hierarchical Reinforcement Learning method consisting of several levels, where each level uses a different method to optimize learning for the types of information and objectives it handles. An algorithm based on the proposed method is provided and applied to an online guidance and navigation task. The navigation environments are complex, partially observable, and a priori unknown. Simulation results indicate that the proposed hybrid Hierarchical Reinforcement Learning method, compared to flat or non-hybrid methods, can help to accelerate learning and to alleviate the ‘curse of dimensionality’ in complex decision-making tasks. In addition, the mixture of relative micro states and absolute macro states can help to reduce uncertainty or ambiguity at high levels, to transfer learned results efficiently within and across tasks, and to extend the method to non-stationary environments. The proposed method can yield a hierarchical optimal policy for autonomous guidance and navigation without a priori knowledge of the system or the environment.
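
The general idea of combining different learning methods across levels, with relative micro states below and absolute macro states above, can be illustrated with a toy sketch. This is not the paper's algorithm: the corridor environment, the reward values, the choice of Sarsa at the low level and SMDP-style Q-learning at the high level, and every name in the code (`run_option`, `train`, `N_SEG`, and so on) are assumptions made purely for illustration.

```python
import random

N_SEG, SEG_LEN = 4, 5            # hypothetical corridor: 4 segments of 5 cells
GOAL_SEG = N_SEG - 1             # reaching the last segment ends an episode
ALPHA, GAMMA, EPS = 0.2, 0.95, 0.1
MAX_OPT_STEPS = 20               # low-level step budget per option


def eps_greedy(q_row, rng):
    """Pick a random action with probability EPS, else a greedy one (random ties)."""
    if rng.random() < EPS:
        return rng.randrange(len(q_row))
    best = max(q_row)
    return rng.choice([a for a, v in enumerate(q_row) if v == best])


def run_option(pos, opt, q_lo, rng):
    """Low level: Sarsa over relative micro states (offset inside the segment).

    opt 0 asks to exit the segment on the left, opt 1 on the right.
    Returns the new absolute position and the number of steps taken.
    """
    seg = pos // SEG_LEN
    q = q_lo[opt]
    rel = pos % SEG_LEN
    act = eps_greedy(q[rel], rng)          # action 0 = step left, 1 = step right
    for step in range(1, MAX_OPT_STEPS + 1):
        pos = max(0, min(N_SEG * SEG_LEN - 1, pos + (1 if act else -1)))
        if pos // SEG_LEN != seg:
            # Left the segment: +1 on the requested side, -1 on the other side.
            reward = 1.0 if act == opt else -1.0
            q[rel][act] += ALPHA * (reward - 0.01 - q[rel][act])
            return pos, step
        nxt_rel = pos % SEG_LEN
        nxt_act = eps_greedy(q[nxt_rel], rng)
        target = -0.01 + GAMMA * q[nxt_rel][nxt_act]   # on-policy Sarsa target
        q[rel][act] += ALPHA * (target - q[rel][act])
        rel, act = nxt_rel, nxt_act
    return pos, MAX_OPT_STEPS              # step budget exhausted


def train(episodes=1000, seed=0):
    """High level: SMDP-style Q-learning over absolute macro states (segment index)."""
    rng = random.Random(seed)
    q_hi = [[0.0, 0.0] for _ in range(N_SEG)]                         # [segment][option]
    q_lo = [[[0.0, 0.0] for _ in range(SEG_LEN)] for _ in range(2)]   # [option][offset][action]
    for _ in range(episodes):
        pos = 0
        for _ in range(50):                # cap on high-level decisions per episode
            seg = pos // SEG_LEN
            opt = eps_greedy(q_hi[seg], rng)
            pos, k = run_option(pos, opt, q_lo, rng)
            nxt_seg = pos // SEG_LEN
            done = nxt_seg == GOAL_SEG
            reward = -0.01 * k + (1.0 if done else 0.0)
            boot = 0.0 if done else GAMMA ** k * max(q_hi[nxt_seg])
            q_hi[seg][opt] += ALPHA * (reward + boot - q_hi[seg][opt])
            if done:
                break
    return q_hi, q_lo


if __name__ == "__main__":
    q_hi, q_lo = train()
    for seg in range(GOAL_SEG):
        print("segment", seg, "option values", q_hi[seg])
```

Because the low-level table is indexed by the offset inside a segment rather than by absolute position, whatever is learned in one segment is reused in every other segment. This is a simple instance of the kind of transfer the abstract attributes to relative micro states, and the absolute segment index plays the role of a macro state for the high-level learner.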

Original language: English
Pages (from-to): 443-457
Number of pages: 15
Journal: Neurocomputing
Volume: 331
Publication status: Published - 2019

Research areas: Hierarchical Reinforcement Learning, Hybrid learning, Non-stationary environment, Online guidance and navigation, Partial observability

ID: 47880059