Approximating multivariate posterior distribution functions from Monte Carlo samples for sequential Bayesian inference

Bram Thijssen; Lodewyk F.A. Wessels

doi:10.1371/journal.pone.0230101

Bram Thijssen, Lodewyk F.A. Wessels^*

^*Corresponding author for this work

Pattern Recognition and Bioinformatics

Research output: Contribution to journal › Article › Scientific › peer-review

5 Citations (Scopus)

120 Downloads (Pure)

Abstract

An important feature of Bayesian statistics is the opportunity to do sequential inference: The posterior distribution obtained after seeing a dataset can be used as prior for a second inference. However, when Monte Carlo sampling methods are used for inference, we only have a set of samples from the posterior distribution. To do sequential inference, we then either have to evaluate the second posterior at only these locations and reweight the samples accordingly, or we can estimate a functional description of the posterior probability distribution from the samples and use that as prior for the second inference. Here, we investigated to what extent we can obtain an accurate joint posterior from two datasets if the inference is done sequentially rather than jointly, under the condition that each inference step is done using Monte Carlo sampling. To test this, we evaluated the accuracy of kernel density estimates, Gaussian mixtures, mixtures of factor analyzers, vine copulas and Gaussian processes in approximating posterior distributions, and then tested whether these approximations can be used in sequential inference. In low dimensionality, Gaussian processes are more accurate, whereas in higher dimensionality Gaussian mixtures, mixtures of factor analyzers or vine copulas perform better. In our test cases of sequential inference, using posterior approximations gives more accurate results than direct sample reweighting, but joint inference is still preferable over sequential inference whenever possible. Since the performance is case-specific, we provide an R package mvdens with a unified interface for the density approximation methods.
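The two routes to sequential inference described in the abstract can be illustrated with a minimal sketch. This is not the authors' code and does not use the mvdens package. It assumes a one-dimensional toy problem (the mean of a normal distribution with known noise variance) so that the joint posterior is available in closed form as a reference. The function names, the data, and the choice of a single fitted Gaussian as the density approximation are all illustrative assumptions; the paper evaluates richer approximations such as kernel density estimates, Gaussian mixtures and vine copulas.

```python
import math
import random
import statistics


def normal_posterior(prior_mean, prior_var, data, noise_var):
    """Conjugate update for the mean of a normal with known noise variance."""
    n = len(data)
    post_var = 1.0 / (1.0 / prior_var + n / noise_var)
    post_mean = post_var * (prior_mean / prior_var + sum(data) / noise_var)
    return post_mean, post_var


def log_likelihood(mu, data, noise_var):
    """Log-likelihood of the data under N(mu, noise_var), up to an additive constant."""
    return -sum((y - mu) ** 2 for y in data) / (2.0 * noise_var)


def reweighted_mean(samples, data, noise_var):
    """Route 1: keep the first-stage samples and weight each by the second likelihood."""
    log_w = [log_likelihood(mu, data, noise_var) for mu in samples]
    shift = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(lw - shift) for lw in log_w]
    return sum(wi * mu for wi, mu in zip(w, samples)) / sum(w)


def approximated_prior_mean(samples, data, noise_var):
    """Route 2: fit a density to the samples (here a single Gaussian, an
    illustrative simplification) and use it as the prior for the second inference."""
    fitted_mean = statistics.fmean(samples)
    fitted_var = statistics.variance(samples)
    post_mean, _ = normal_posterior(fitted_mean, fitted_var, data, noise_var)
    return post_mean


rng = random.Random(0)
noise_var = 1.0
data1 = [rng.gauss(2.0, 1.0) for _ in range(20)]  # hypothetical first dataset
data2 = [rng.gauss(2.0, 1.0) for _ in range(20)]  # hypothetical second dataset

# Stage 1: samples from the posterior given data1. They are drawn exactly here,
# standing in for the output of a Monte Carlo sampler.
m1, v1 = normal_posterior(0.0, 100.0, data1, noise_var)
samples = [rng.gauss(m1, math.sqrt(v1)) for _ in range(5000)]

# Reference: joint inference on both datasets at once.
joint_mean, _ = normal_posterior(0.0, 100.0, data1 + data2, noise_var)

print("joint posterior mean:        ", joint_mean)
print("sample reweighting:          ", reweighted_mean(samples, data2, noise_var))
print("fitted approximation as prior:", approximated_prior_mean(samples, data2, noise_var))
```

In this toy setting the first posterior is exactly Gaussian, so both routes land close to the joint result. The sketch therefore shows only the mechanics of each route and says nothing about their relative accuracy in the multivariate, non-Gaussian cases the paper studies.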

Original language	English
Article number	e0230101
Pages (from-to)	1-25
Number of pages	25
Journal	PLoS ONE
Volume	15
Issue number	3
DOIs	https://doi.org/10.1371/journal.pone.0230101
Publication status	Published - 2020

Access to Document

10.1371/journal.pone.0230101

journal.pone.0230101: Final published version, 3.22 MB, Licence: CC BY

Cite this

@article{22804079464044ceb73cf9bfbaab0af0,

title = "Approximating multivariate posterior distribution functions from Monte Carlo samples for sequential Bayesian inference",

abstract = "An important feature of Bayesian statistics is the opportunity to do sequential inference: The posterior distribution obtained after seeing a dataset can be used as prior for a second inference. However, when Monte Carlo sampling methods are used for inference, we only have a set of samples from the posterior distribution. To do sequential inference, we then either have to evaluate the second posterior at only these locations and reweight the samples accordingly, or we can estimate a functional description of the posterior probability distribution from the samples and use that as prior for the second inference. Here, we investigated to what extent we can obtain an accurate joint posterior from two datasets if the inference is done sequentially rather than jointly, under the condition that each inference step is done using Monte Carlo sampling. To test this, we evaluated the accuracy of kernel density estimates, Gaussian mixtures, mixtures of factor analyzers, vine copulas and Gaussian processes in approximating posterior distributions, and then tested whether these approximations can be used in sequential inference. In low dimensionality, Gaussian processes are more accurate, whereas in higher dimensionality Gaussian mixtures, mixtures of factor analyzers or vine copulas perform better. In our test cases of sequential inference, using posterior approximations gives more accurate results than direct sample reweighting, but joint inference is still preferable over sequential inference whenever possible. Since the performance is case-specific, we provide an R package mvdens with a unified interface for the density approximation methods.",

author = "Thijssen, Bram and Wessels, {Lodewyk F.A.}",

year = "2020",

doi = "10.1371/journal.pone.0230101",

language = "English",

volume = "15",

pages = "1--25",

journal = "PLoS ONE",

issn = "1932-6203",

publisher = "Public Library of Science (PLOS)",

number = "3",

}

TY - JOUR

T1 - Approximating multivariate posterior distribution functions from Monte Carlo samples for sequential Bayesian inference

AU - Thijssen, Bram

AU - Wessels, Lodewyk F.A.

PY - 2020

Y1 - 2020

N2 - An important feature of Bayesian statistics is the opportunity to do sequential inference: The posterior distribution obtained after seeing a dataset can be used as prior for a second inference. However, when Monte Carlo sampling methods are used for inference, we only have a set of samples from the posterior distribution. To do sequential inference, we then either have to evaluate the second posterior at only these locations and reweight the samples accordingly, or we can estimate a functional description of the posterior probability distribution from the samples and use that as prior for the second inference. Here, we investigated to what extent we can obtain an accurate joint posterior from two datasets if the inference is done sequentially rather than jointly, under the condition that each inference step is done using Monte Carlo sampling. To test this, we evaluated the accuracy of kernel density estimates, Gaussian mixtures, mixtures of factor analyzers, vine copulas and Gaussian processes in approximating posterior distributions, and then tested whether these approximations can be used in sequential inference. In low dimensionality, Gaussian processes are more accurate, whereas in higher dimensionality Gaussian mixtures, mixtures of factor analyzers or vine copulas perform better. In our test cases of sequential inference, using posterior approximations gives more accurate results than direct sample reweighting, but joint inference is still preferable over sequential inference whenever possible. Since the performance is case-specific, we provide an R package mvdens with a unified interface for the density approximation methods.

AB - An important feature of Bayesian statistics is the opportunity to do sequential inference: The posterior distribution obtained after seeing a dataset can be used as prior for a second inference. However, when Monte Carlo sampling methods are used for inference, we only have a set of samples from the posterior distribution. To do sequential inference, we then either have to evaluate the second posterior at only these locations and reweight the samples accordingly, or we can estimate a functional description of the posterior probability distribution from the samples and use that as prior for the second inference. Here, we investigated to what extent we can obtain an accurate joint posterior from two datasets if the inference is done sequentially rather than jointly, under the condition that each inference step is done using Monte Carlo sampling. To test this, we evaluated the accuracy of kernel density estimates, Gaussian mixtures, mixtures of factor analyzers, vine copulas and Gaussian processes in approximating posterior distributions, and then tested whether these approximations can be used in sequential inference. In low dimensionality, Gaussian processes are more accurate, whereas in higher dimensionality Gaussian mixtures, mixtures of factor analyzers or vine copulas perform better. In our test cases of sequential inference, using posterior approximations gives more accurate results than direct sample reweighting, but joint inference is still preferable over sequential inference whenever possible. Since the performance is case-specific, we provide an R package mvdens with a unified interface for the density approximation methods.

UR - http://www.scopus.com/inward/record.url?scp=85081663501&partnerID=8YFLogxK

U2 - 10.1371/journal.pone.0230101

DO - 10.1371/journal.pone.0230101

M3 - Article

AN - SCOPUS:85081663501

SN - 1932-6203

VL - 15

SP - 1

EP - 25

JO - PLoS ONE

JF - PLoS ONE

IS - 3

M1 - e0230101

ER -
