Data-driven approximate dynamic programming: A linear programming approach

Tobias Sutter; Angeliki Kamoutsi; Peyman Mohajerin Esfahani; John Lygeros

doi:10.1109/CDC.2017.8264426

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

Abstract

This article presents an approximation scheme for the infinite-dimensional linear programming formulation of discrete-time Markov control processes via a finite-dimensional convex program, when the dynamics are unknown and learned from data. We derive a probabilistic explicit error bound between the data-driven finite convex program and the original infinite linear program. We further discuss the sample complexity of the error bound which translates to the number of samples required for an a priori approximation accuracy. Our analysis sheds light on the impact of the choice of basis functions for approximating the true value function. Finally, the relevance of the method is illustrated on a truncated LQG problem.

Original language	English
Title of host publication	Proceedings of the 2017 IEEE 56th Annual Conference on Decision and Control
Editors	A Astolfi et al
Place of Publication	Piscataway, NJ, USA
Publisher	IEEE
Pages	5174-5179
ISBN (Electronic)	978-150902873-3
DOIs	https://doi.org/10.1109/CDC.2017.8264426
Publication status	Published - 2017
Event	CDC 2017: 56th IEEE Annual Conference on Decision and Control, Melbourne, Australia, 12 Dec 2017 → 15 Dec 2017, http://cdc2017.ieeecss.org/

Conference

Conference	CDC 2017: 56th IEEE Annual Conference on Decision and Control
Country/Territory	Australia
City	Melbourne
Period	12/12/17 → 15/12/17
Other	The CDC is recognized as the premier scientific and engineering conference dedicated to the advancement of the theory and practice of systems and control. The CDC annually brings together an international community of researchers and practitioners in the field of automatic control to discuss new research results, perspectives on future developments, and innovative applications relevant to decision making, systems and control, and related areas.
Internet address	http://cdc2017.ieeecss.org/

Access to Document

10.1109/CDC.2017.8264426

ADP: Accepted author manuscript, 300 KB

Cite this

@inproceedings{77db1a91fb474c139f153303ce6648f7,

title = "Data-driven approximate dynamic programming: A linear programming approach",

abstract = "This article presents an approximation scheme for the infinite-dimensional linear programming formulation of discrete-time Markov control processes via a finite-dimensional convex program, when the dynamics are unknown and learned from data. We derive a probabilistic explicit error bound between the data-driven finite convex program and the original infinite linear program. We further discuss the sample complexity of the error bound which translates to the number of samples required for an a priori approximation accuracy. Our analysis sheds light on the impact of the choice of basis functions for approximating the true value function. Finally, the relevance of the method is illustrated on a truncated LQG problem.",

author = "Tobias Sutter and Angeliki Kamoutsi and Esfahani, {Peyman Mohajerin} and John Lygeros",

year = "2017",

doi = "10.1109/CDC.2017.8264426",

language = "English",

pages = "5174--5179",

editor = "Astolfi, A. and others",

booktitle = "Proceedings of the 2017 IEEE 56th Annual Conference on Decision and Control",

publisher = "IEEE",

address = "Piscataway, NJ, USA",

note = "CDC 2017: 56th IEEE Annual Conference on Decision and Control ; Conference date: 12-12-2017 Through 15-12-2017",

url = "http://cdc2017.ieeecss.org/",

}

Sutter, T, Kamoutsi, A, Esfahani, PM & Lygeros, J 2017, Data-driven approximate dynamic programming: A linear programming approach. in A Astolfi et al (ed.), Proceedings of the 2017 IEEE 56th Annual Conference on Decision and Control. IEEE, Piscataway, NJ, USA, pp. 5174-5179, CDC 2017: 56th IEEE Annual Conference on Decision and Control, Melbourne, Australia, 12/12/17. https://doi.org/10.1109/CDC.2017.8264426

Data-driven approximate dynamic programming: A linear programming approach. / Sutter, Tobias; Kamoutsi, Angeliki; Esfahani, Peyman Mohajerin et al.
Proceedings of the 2017 IEEE 56th Annual Conference on Decision and Control. ed. / A Astolfi et al. Piscataway, NJ, USA: IEEE, 2017. p. 5174-5179.

TY - GEN

T1 - Data-driven approximate dynamic programming: A linear programming approach

T2 - CDC 2017: 56th IEEE Annual Conference on Decision and Control

AU - Sutter, Tobias

AU - Kamoutsi, Angeliki

AU - Esfahani, Peyman Mohajerin

AU - Lygeros, John

PY - 2017

Y1 - 2017

N2 - This article presents an approximation scheme for the infinite-dimensional linear programming formulation of discrete-time Markov control processes via a finite-dimensional convex program, when the dynamics are unknown and learned from data. We derive a probabilistic explicit error bound between the data-driven finite convex program and the original infinite linear program. We further discuss the sample complexity of the error bound which translates to the number of samples required for an a priori approximation accuracy. Our analysis sheds light on the impact of the choice of basis functions for approximating the true value function. Finally, the relevance of the method is illustrated on a truncated LQG problem.

AB - This article presents an approximation scheme for the infinite-dimensional linear programming formulation of discrete-time Markov control processes via a finite-dimensional convex program, when the dynamics are unknown and learned from data. We derive a probabilistic explicit error bound between the data-driven finite convex program and the original infinite linear program. We further discuss the sample complexity of the error bound which translates to the number of samples required for an a priori approximation accuracy. Our analysis sheds light on the impact of the choice of basis functions for approximating the true value function. Finally, the relevance of the method is illustrated on a truncated LQG problem.

UR - http://www.scopus.com/inward/record.url?scp=85046127695&partnerID=8YFLogxK

U2 - 10.1109/CDC.2017.8264426

DO - 10.1109/CDC.2017.8264426

M3 - Conference contribution

AN - SCOPUS:85046127695

SP - 5174

EP - 5179

BT - Proceedings of the 2017 IEEE 56th Annual Conference on Decision and Control

A2 - Astolfi et al, A

PB - IEEE

CY - Piscataway, NJ, USA

Y2 - 12 December 2017 through 15 December 2017

ER -
