The Zig-Zag process and super-efficient sampling for Bayesian analysis of big data

Joris Bierkens, Paul Fearnhead, Gareth Roberts

doi:10.1214/18-AOS1715

Statistics

Research output: Contribution to journal › Article › Scientific › peer-review

102 Citations (Scopus)

176 Downloads (Pure)

Abstract

Standard MCMC methods can scale poorly to big data settings due to the need to evaluate the likelihood at each iteration. There have been a number of approximate MCMC algorithms that use sub-sampling ideas to reduce this computational burden, but with the drawback that these algorithms no longer target the true posterior distribution. We introduce a new family of Monte Carlo methods based upon a multidimensional version of the Zig-Zag process of [Ann. Appl. Probab. 27 (2017) 846–882], a continuous-time piecewise deterministic Markov process. While traditional MCMC methods are reversible by construction (a property which is known to inhibit rapid convergence) the Zig-Zag process offers a flexible nonreversible alternative which we observe to often have favourable convergence properties. We show how the Zig-Zag process can be simulated without discretisation error, and give conditions for the process to be ergodic. Most importantly, we introduce a sub-sampling version of the Zig-Zag process that is an example of an exact approximate scheme, that is, the resulting approximate process still has the posterior as its stationary distribution. Furthermore, if we use a control-variate idea to reduce the variance of our unbiased estimator, then the Zig-Zag process can be super-efficient: after an initial preprocessing step, essentially independent samples from the posterior distribution are obtained at a computational cost which does not depend on the size of the data.
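To make the idea of simulating the process "without discretisation error" concrete, here is a minimal sketch of a one-dimensional Zig-Zag sampler. It is an illustration under stated assumptions, not the algorithm from the paper: the target is assumed to be a standard normal (potential U(x) = x^2/2), the switching rate is taken to be max(0, theta * x), and there is no sub-sampling or control variate. For this target the event times can be drawn exactly by inverting the integrated rate, so no time step is needed. The function name `zigzag_gaussian_moments` and its parameters are hypothetical.

```python
import math
import random


def zigzag_gaussian_moments(n_events=20000, seed=0):
    """Simulate a 1D Zig-Zag process targeting N(0, 1) (assumed target).

    Returns time-averaged estimates of E[X] and E[X^2] along the
    piecewise-linear trajectory.
    """
    rng = random.Random(seed)
    x, theta = 0.0, 1.0          # position and velocity (theta is +1 or -1)
    total_time = 0.0
    int_x = 0.0                  # integral of x(t) dt
    int_x2 = 0.0                 # integral of x(t)^2 dt

    for _ in range(n_events):
        # Along x(t) = x + theta * t the rate is max(0, a + t), a = theta * x.
        a = theta * x
        e = rng.expovariate(1.0)
        # Solve integral_0^tau max(0, a + s) ds = e for tau in closed form.
        tau = -a + math.sqrt(max(a, 0.0) ** 2 + 2.0 * e)

        # Exact integrals of x(t) and x(t)^2 over the linear segment.
        int_x += x * tau + theta * tau ** 2 / 2.0
        int_x2 += x ** 2 * tau + x * theta * tau ** 2 + tau ** 3 / 3.0
        total_time += tau

        # Move to the event location, then flip the velocity.
        x += theta * tau
        theta = -theta

    return int_x / total_time, int_x2 / total_time


if __name__ == "__main__":
    mean, second_moment = zigzag_gaussian_moments()
    print(f"estimated mean: {mean:.3f}, estimated E[X^2]: {second_moment:.3f}")
```

The estimates are averages over continuous time rather than over discrete samples, which is why the segment integrals are accumulated in closed form. For a standard normal target they should land near 0 and 1 respectively.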

Original language	English
Pages (from-to)	1288-1320
Number of pages	33
Journal	Annals of Statistics
Volume	47
Issue number	3
DOIs	https://doi.org/10.1214/18-AOS1715
Publication status	Published - 2019

Keywords

MCMC
nonreversible Markov process
piecewise deterministic Markov process
stochastic gradient Langevin dynamics
sub-sampling
exact sampling

Access to Document

10.1214/18-AOS1715

AOS1715: Final published version, 689 KB

Cite this

@article{c8aa0421e47f48c386119bd8a55d6f2c,

title = "The Zig-Zag process and super-efficient sampling for Bayesian analysis of big data",

abstract = "Standard MCMC methods can scale poorly to big data settings due to the need to evaluate the likelihood at each iteration. There have been a number of approximate MCMC algorithms that use sub-sampling ideas to reduce this computational burden, but with the drawback that these algorithms no longer target the true posterior distribution. We introduce a new family of Monte Carlo methods based upon a multidimensional version of the Zig-Zag process of [Ann. Appl. Probab. 27 (2017) 846–882], a continuous-time piecewise deterministic Markov process. While traditional MCMC methods are reversible by construction (a property which is known to inhibit rapid convergence) the Zig-Zag process offers a flexible nonreversible alternative which we observe to often have favourable convergence properties. We show how the Zig-Zag process can be simulated without discretisation error, and give conditions for the process to be ergodic. Most importantly, we introduce a sub-sampling version of the Zig-Zag process that is an example of an exact approximate scheme, that is, the resulting approximate process still has the posterior as its stationary distribution. Furthermore, if we use a control-variate idea to reduce the variance of our unbiased estimator, then the Zig-Zag process can be super-efficient: after an initial preprocessing step, essentially independent samples from the posterior distribution are obtained at a computational cost which does not depend on the size of the data.",

keywords = "MCMC, nonreversible Markov process, piecewise deterministic Markov process, stochastic gradient Langevin dynamics, sub-sampling, exact sampling",

author = "Joris Bierkens and Paul Fearnhead and Gareth Roberts",

year = "2019",

doi = "10.1214/18-AOS1715",

language = "English",

volume = "47",

pages = "1288--1320",

journal = "Annals of Statistics",

issn = "0090-5364",

publisher = "Institute of Mathematical Statistics",

number = "3",

}

TY - JOUR

T1 - The Zig-Zag process and super-efficient sampling for Bayesian analysis of big data

AU - Bierkens, Joris

AU - Fearnhead, Paul

AU - Roberts, Gareth

PY - 2019

Y1 - 2019

N2 - Standard MCMC methods can scale poorly to big data settings due to the need to evaluate the likelihood at each iteration. There have been a number of approximate MCMC algorithms that use sub-sampling ideas to reduce this computational burden, but with the drawback that these algorithms no longer target the true posterior distribution. We introduce a new family of Monte Carlo methods based upon a multidimensional version of the Zig-Zag process of [Ann. Appl. Probab. 27 (2017) 846–882], a continuous-time piecewise deterministic Markov process. While traditional MCMC methods are reversible by construction (a property which is known to inhibit rapid convergence) the Zig-Zag process offers a flexible nonreversible alternative which we observe to often have favourable convergence properties. We show how the Zig-Zag process can be simulated without discretisation error, and give conditions for the process to be ergodic. Most importantly, we introduce a sub-sampling version of the Zig-Zag process that is an example of an exact approximate scheme, that is, the resulting approximate process still has the posterior as its stationary distribution. Furthermore, if we use a control-variate idea to reduce the variance of our unbiased estimator, then the Zig-Zag process can be super-efficient: after an initial preprocessing step, essentially independent samples from the posterior distribution are obtained at a computational cost which does not depend on the size of the data.

AB - Standard MCMC methods can scale poorly to big data settings due to the need to evaluate the likelihood at each iteration. There have been a number of approximate MCMC algorithms that use sub-sampling ideas to reduce this computational burden, but with the drawback that these algorithms no longer target the true posterior distribution. We introduce a new family of Monte Carlo methods based upon a multidimensional version of the Zig-Zag process of [Ann. Appl. Probab. 27 (2017) 846–882], a continuous-time piecewise deterministic Markov process. While traditional MCMC methods are reversible by construction (a property which is known to inhibit rapid convergence) the Zig-Zag process offers a flexible nonreversible alternative which we observe to often have favourable convergence properties. We show how the Zig-Zag process can be simulated without discretisation error, and give conditions for the process to be ergodic. Most importantly, we introduce a sub-sampling version of the Zig-Zag process that is an example of an exact approximate scheme, that is, the resulting approximate process still has the posterior as its stationary distribution. Furthermore, if we use a control-variate idea to reduce the variance of our unbiased estimator, then the Zig-Zag process can be super-efficient: after an initial preprocessing step, essentially independent samples from the posterior distribution are obtained at a computational cost which does not depend on the size of the data.

KW - MCMC

KW - nonreversible Markov process

KW - piecewise deterministic Markov process

KW - stochastic gradient Langevin dynamics

KW - sub-sampling

KW - exact sampling

UR - http://www.scopus.com/inward/record.url?scp=85062974230&partnerID=8YFLogxK

U2 - 10.1214/18-AOS1715

DO - 10.1214/18-AOS1715

M3 - Article

SN - 0090-5364

VL - 47

SP - 1288

EP - 1320

JO - Annals of Statistics

JF - Annals of Statistics

IS - 3

ER -
