Bounding the probability of resource constraint violations in multi-agent MDPs

Frits De Nijs; Erwin Walraven; Mathijs M. De Weerdt; Matthijs T.J. Spaan

Bounding the probability of resource constraint violations in multi-agent MDPs

Frits De Nijs, Erwin Walraven, Mathijs M. De Weerdt, Matthijs T.J. Spaan

Algorithmics

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

11 Citations (Scopus)

80 Downloads (Pure)

Abstract

Multi-agent planning problems with constraints on global resource consumption occur in several domains. Existing algorithms for solving Multi-agent Markov Decision Processes can compute policies that meet a resource constraint in expectation, but these policies provide no guarantees on the probability that a resource constraint violation will occur. We derive a method to bound constraint violation probabilities using Hoeffding's inequality. This method is applied to two existing approaches for computing policies satisfying constraints: the Constrained MDP framework and a Column Generation approach. We also introduce an algorithm to adaptively relax the bound up to a given maximum violation tolerance. Experiments on a hard toy problem show that the resulting policies outperform static optimal resource allocations to an arbitrary level. By testing the algorithms on more realistic planning domains from the literature, we demonstrate that the adaptive bound is able to efficiently trade off violation probability with expected value, outperforming state-of-the-art planners.

Original language	English
Title of host publication	Proceedings of the 31st Conference on Artificial Intelligence, AAAI 2017
Publisher	American Association for Artificial Intelligence (AAAI)
Pages	3562-3568
Number of pages	7
ISBN (Print)	978-1577357803
Publication status	Published - 2017
Event	31st AAAI Conference on Artificial Intelligence: AAAI 2017 - Hilton San Francisco Union Square, San Francisco, United States Duration: 4 Feb 2017 → 10 Feb 2017 Conference number: 31 https://aaai.org/Conferences/AAAI/aaai17.php

Conference

Conference	31st AAAI Conference on Artificial Intelligence
Abbreviated title	AAAI Conference on Artificial Intelligence
Country/Territory	United States
City	San Francisco
Period	4/02/17 → 10/02/17
Internet address	https://aaai.org/Conferences/AAAI/aaai17.php

Keywords

Markov Decision Process
Resource constraints
Planning under uncertainty

Access to Document

bounding-the-probability-of-resource-constraint-violations-in-multi-agent-mdps-fullAccepted author manuscript, 421 KB

https://aaai.org/ocs/index.php/AAAI/AAAI17/paper/view/15009/14155

Cite this

@inproceedings{5d6c967ee4b34a6580ec53d191e3da03,

title = "Bounding the probability of resource constraint violations in multi-agent MDPs",

abstract = "Multi-agent planning problems with constraints on global resource consumption occur in several domains. Existing algorithms for solving Multi-agent Markov Decision Processes can compute policies that meet a resource constraint in expectation, but these policies provide no guarantees on the probability that a resource constraint violation will occur. We derive a method to bound constraint violation probabilities using Hoeffding's inequality. This method is applied to two existing approaches for computing policies satisfying constraints: the Constrained MDP framework and a Column Generation approach. We also introduce an algorithm to adaptively relax the bound up to a given maximum violation tolerance. Experiments on a hard toy problem show that the resulting policies outperform static optimal resource allocations to an arbitrary level. By testing the algorithms on more realistic planning domains from the literature, we demonstrate that the adaptive bound is able to efficiently trade off violation probability with expected value, outperforming state-of-the-art planners.",

keywords = "Markov Decision Process, Resource constraints, Planning under uncertainty",

author = "{De Nijs}, Frits and Erwin Walraven and {De Weerdt}, {Mathijs M.} and Spaan, {Matthijs T.J.}",

year = "2017",

language = "English",

isbn = "978-1577357803",

pages = "3562--3568",

booktitle = "Proceedings of the 31st Conference on Artificial Intelligence, AAAI 2017",

publisher = "American Association for Artificial Intelligence (AAAI)",

address = "United States",

note = "31st AAAI Conference on Artificial Intelligence : AAAI 2017, AAAI Conference on Artificial Intelligence ; Conference date: 04-02-2017 Through 10-02-2017",

url = "https://aaai.org/Conferences/AAAI/aaai17.php",

}

De Nijs, F, Walraven, E, De Weerdt, MM & Spaan, MTJ 2017, Bounding the probability of resource constraint violations in multi-agent MDPs. in Proceedings of the 31st Conference on Artificial Intelligence, AAAI 2017. American Association for Artificial Intelligence (AAAI), pp. 3562-3568, 31st AAAI Conference on Artificial Intelligence, San Francisco, United States, 4/02/17. <https://aaai.org/ocs/index.php/AAAI/AAAI17/paper/view/15009/14155>

Bounding the probability of resource constraint violations in multi-agent MDPs. / De Nijs, Frits; Walraven, Erwin; De Weerdt, Mathijs M. et al.
Proceedings of the 31st Conference on Artificial Intelligence, AAAI 2017. American Association for Artificial Intelligence (AAAI), 2017. p. 3562-3568.

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

TY - GEN

T1 - Bounding the probability of resource constraint violations in multi-agent MDPs

AU - De Nijs, Frits

AU - Walraven, Erwin

AU - De Weerdt, Mathijs M.

AU - Spaan, Matthijs T.J.

N1 - Conference code: 31

PY - 2017

Y1 - 2017

N2 - Multi-agent planning problems with constraints on global resource consumption occur in several domains. Existing algorithms for solving Multi-agent Markov Decision Processes can compute policies that meet a resource constraint in expectation, but these policies provide no guarantees on the probability that a resource constraint violation will occur. We derive a method to bound constraint violation probabilities using Hoeffding's inequality. This method is applied to two existing approaches for computing policies satisfying constraints: the Constrained MDP framework and a Column Generation approach. We also introduce an algorithm to adaptively relax the bound up to a given maximum violation tolerance. Experiments on a hard toy problem show that the resulting policies outperform static optimal resource allocations to an arbitrary level. By testing the algorithms on more realistic planning domains from the literature, we demonstrate that the adaptive bound is able to efficiently trade off violation probability with expected value, outperforming state-of-the-art planners.

AB - Multi-agent planning problems with constraints on global resource consumption occur in several domains. Existing algorithms for solving Multi-agent Markov Decision Processes can compute policies that meet a resource constraint in expectation, but these policies provide no guarantees on the probability that a resource constraint violation will occur. We derive a method to bound constraint violation probabilities using Hoeffding's inequality. This method is applied to two existing approaches for computing policies satisfying constraints: the Constrained MDP framework and a Column Generation approach. We also introduce an algorithm to adaptively relax the bound up to a given maximum violation tolerance. Experiments on a hard toy problem show that the resulting policies outperform static optimal resource allocations to an arbitrary level. By testing the algorithms on more realistic planning domains from the literature, we demonstrate that the adaptive bound is able to efficiently trade off violation probability with expected value, outperforming state-of-the-art planners.

KW - Markov Decision Process

KW - Resource constraints

KW - Planning under uncertainty

UR - http://resolver.tudelft.nl/uuid:5d6c967e-e4b3-4a65-80ec-53d191e3da03

UR - http://www.scopus.com/inward/record.url?scp=85030478873&partnerID=8YFLogxK

UR - https://aaai.org/ocs/index.php/AAAI/AAAI17/paper/view/15009

M3 - Conference contribution

AN - SCOPUS:85030478873

SN - 978-1577357803

SP - 3562

EP - 3568

BT - Proceedings of the 31st Conference on Artificial Intelligence, AAAI 2017

PB - American Association for Artificial Intelligence (AAAI)

T2 - 31st AAAI Conference on Artificial Intelligence

Y2 - 4 February 2017 through 10 February 2017

ER -

Bounding the probability of resource constraint violations in multi-agent MDPs

Abstract

Conference

Keywords

Access to Document

Other files and links

Fingerprint

GCP: Gaming beyond the Copper Plate

Cite this

Bounding the probability of resource constraint violations in multi-agent MDPs

Abstract

Conference

Keywords

Access to Document

Other files and links

Fingerprint

Projects

GCP: Gaming beyond the Copper Plate

Cite this