Bootstrapping LPs in Value Iteration for Multi-Objective and Partially Observable MDPs

Diederik M. Roijers; Erwin Walraven; Matthijs T.J. Spaan

Algorithmics

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

Abstract

Iteratively solving a set of linear programs (LPs) is a common strategy for solving various decision-making problems in Artificial Intelligence, such as planning in multi-objective or partially observable Markov Decision Processes (MDPs). A prevalent feature is that the solutions to these LPs become increasingly similar as the solving algorithm converges, because the solution computed by the algorithm approaches the fixed point of a Bellman backup operator. In this paper, we propose to speed up the solving process of these LPs by bootstrapping based on similar LPs solved previously. We use these LPs to initialize a subset of relevant LP constraints, before iteratively generating the remaining constraints. The resulting algorithm is the first to consider such information sharing across iterations. We evaluate our approach on planning in Multi-Objective MDPs (MOMDPs) and Partially Observable MDPs (POMDPs), showing that it solves fewer LPs than the state of the art, which leads to a significant speed-up. Moreover, for MOMDPs we show that our method scales better in both the number of states and the number of objectives, which is vital for multi-objective planning.

Original language	English
Title of host publication	Proceedings of the 28th International Conference on Automated Planning and Scheduling
Editors	Mathijs de Weerdt, Sven Koenig, Gabriele Roeger, Matthijs Spaan
Publisher	Association for the Advancement of Artificial Intelligence (AAAI)
Pages	218-226
Number of pages	9
ISBN (Print)	978-1-57735-797-1
Publication status	Published - 2018
Event	28th International Conference on Automated Planning and Scheduling: KEPS 2018, Delft, Netherlands. Duration: 24 Jun 2018 → 29 Jun 2018. Conference number: 28. http://www.icaps-conference.org

Conference

Conference	28th International Conference on Automated Planning and Scheduling
Abbreviated title	ICAPS 2018
Country/Territory	Netherlands
City	Delft
Period	24/06/18 → 29/06/18
Internet address	http://www.icaps-conference.org

Access to Document

icaps2018 (Accepted author manuscript, 308 KB)

https://www.aaai.org/ocs/index.php/ICAPS/ICAPS18/paper/view/17771/16981
Licence: Unspecified

Cite this

Roijers, D. M., Walraven, E., & Spaan, M. T. J. (2018). Bootstrapping LPs in Value Iteration for Multi-Objective and Partially Observable MDPs. In M. de Weerdt, S. Koenig, G. Roeger, & M. Spaan (Eds.), Proceedings of the 28th International Conference on Automated Planning and Scheduling (pp. 218-226). Association for the Advancement of Artificial Intelligence (AAAI). https://www.aaai.org/ocs/index.php/ICAPS/ICAPS18/paper/view/17771/16981

Roijers, Diederik M.; Walraven, Erwin; Spaan, Matthijs T.J. / Bootstrapping LPs in Value Iteration for Multi-Objective and Partially Observable MDPs. Proceedings of the 28th International Conference on Automated Planning and Scheduling. editor / Mathijs de Weerdt; Sven Koenig; Gabriele Roeger; Matthijs Spaan. Association for the Advancement of Artificial Intelligence (AAAI), 2018. pp. 218-226

@inproceedings{46e473c94feb4735b14261555270eea3,
  title = "Bootstrapping LPs in Value Iteration for Multi-Objective and Partially Observable MDPs",
  abstract = "Iteratively solving a set of linear programs (LPs) is a common strategy for solving various decision-making problems in Artificial Intelligence, such as planning in multi-objective or partially observable Markov Decision Processes (MDPs). A prevalent feature is that the solutions to these LPs become increasingly similar as the solving algorithm converges, because the solution computed by the algorithm approaches the fixed point of a Bellman backup operator. In this paper, we propose to speed up the solving process of these LPs by bootstrapping based on similar LPs solved previously. We use these LPs to initialize a subset of relevant LP constraints, before iteratively generating the remaining constraints. The resulting algorithm is the first to consider such information sharing across iterations. We evaluate our approach on planning in Multi-Objective MDPs (MOMDPs) and Partially Observable MDPs (POMDPs), showing that it solves fewer LPs than the state of the art, which leads to a significant speed-up. Moreover, for MOMDPs we show that our method scales better in both the number of states and the number of objectives, which is vital for multi-objective planning.",
  author = "Roijers, {Diederik M.} and Erwin Walraven and Spaan, {Matthijs T.J.}",
  year = "2018",
  language = "English",
  isbn = "978-1-57735-797-1",
  pages = "218--226",
  editor = "{de Weerdt}, Mathijs and Sven Koenig and Gabriele Roeger and Matthijs Spaan",
  booktitle = "Proceedings of the 28th International Conference on Automated Planning and Scheduling",
  publisher = "Association for the Advancement of Artificial Intelligence (AAAI)",
  note = "28th International Conference on Automated Planning and Scheduling : KEPS 2018, ICAPS 2018 ; Conference date: 24-06-2018 Through 29-06-2018",
  url = "http://www.icaps-conference.org",
}

Roijers, DM, Walraven, E & Spaan, MTJ 2018, Bootstrapping LPs in Value Iteration for Multi-Objective and Partially Observable MDPs. in M de Weerdt, S Koenig, G Roeger & M Spaan (eds), Proceedings of the 28th International Conference on Automated Planning and Scheduling. Association for the Advancement of Artificial Intelligence (AAAI), pp. 218-226, 28th International Conference on Automated Planning and Scheduling, Delft, Netherlands, 24/06/18. <https://www.aaai.org/ocs/index.php/ICAPS/ICAPS18/paper/view/17771/16981>

Bootstrapping LPs in Value Iteration for Multi-Objective and Partially Observable MDPs. / Roijers, Diederik M.; Walraven, Erwin; Spaan, Matthijs T.J.
Proceedings of the 28th International Conference on Automated Planning and Scheduling. ed. / Mathijs de Weerdt; Sven Koenig; Gabriele Roeger; Matthijs Spaan. Association for the Advancement of Artificial Intelligence (AAAI), 2018. p. 218-226.

TY - GEN
T1 - Bootstrapping LPs in Value Iteration for Multi-Objective and Partially Observable MDPs
AU - Roijers, Diederik M.
AU - Walraven, Erwin
AU - Spaan, Matthijs T.J.
N1 - Conference code: 28
PY - 2018
Y1 - 2018
AB - Iteratively solving a set of linear programs (LPs) is a common strategy for solving various decision-making problems in Artificial Intelligence, such as planning in multi-objective or partially observable Markov Decision Processes (MDPs). A prevalent feature is that the solutions to these LPs become increasingly similar as the solving algorithm converges, because the solution computed by the algorithm approaches the fixed point of a Bellman backup operator. In this paper, we propose to speed up the solving process of these LPs by bootstrapping based on similar LPs solved previously. We use these LPs to initialize a subset of relevant LP constraints, before iteratively generating the remaining constraints. The resulting algorithm is the first to consider such information sharing across iterations. We evaluate our approach on planning in Multi-Objective MDPs (MOMDPs) and Partially Observable MDPs (POMDPs), showing that it solves fewer LPs than the state of the art, which leads to a significant speed-up. Moreover, for MOMDPs we show that our method scales better in both the number of states and the number of objectives, which is vital for multi-objective planning.
M3 - Conference contribution
SN - 978-1-57735-797-1
SP - 218
EP - 226
BT - Proceedings of the 28th International Conference on Automated Planning and Scheduling
A2 - de Weerdt, Mathijs
A2 - Koenig, Sven
A2 - Roeger, Gabriele
A2 - Spaan, Matthijs
PB - Association for the Advancement of Artificial Intelligence (AAAI)
T2 - 28th International Conference on Automated Planning and Scheduling
Y2 - 24 June 2018 through 29 June 2018
ER -
