`VariationalPosterior`#

Note

The main output of a PyVBMC run is a VariationalPosterior object vp.

The variational posterior can be queried and manipulated via several methods, such as:

moments: compute mean and covariance matrix of the variational posterior;
pdf: evaluate the variational posterior density at a point;
plot: plot the variational posterior as a corner plot (1D and 2D marginals);
sample: draw random samples from the variational posterior.

There are also methods to compare variational posteriors:

kl_div: compute the Kullback-Leibler divergence between two posteriors;
mtv: compute the marginal total variation distance between two posteriors.

See below for the full VariationalPosterior class methods and interface.

class pyvbmc.variational_posterior.VariationalPosterior(D: int, K: int = 2, x0=None, parameter_transformer=None)[source]#

The variational posterior class used in PyVBMC.

The variational posterior represents the approximate posterior as returned by the Variational Bayesian Monte Carlo (VBMC) algorithm.

In VBMC, the variational posterior is a weighted mixture of multivariate normal distributions (see Notes below for details).

Parameters:

Dint: The number of dimensions (i.e., parameters) of the posterior.
Kint, optional: The number of mixture components, default 2.
x0np.ndarray, optional: The starting vector for the mixture components means. It can be a single array or multiple rows (up to K); missing rows are duplicated by making copies of x0, default np.zeros.
parameter_transformerParameterTransformer, optional: The ParameterTransformer object specifying the transformation of the input space that leads to the current representation used by the variational posterior, by default uses an identity transform.

Attributes:

wnp.ndarray: The weights of the VP mixture components, shape (1, K).
etanp.ndarray: The unbounded (softmax) parametrization of the VP mixture components, shape (1, K).
munp.ndarray: The means of the VP mixture components, shape (D, K).
sigmanp.ndarray: The per-component scale of the VP mixture components. Shape (1, K).
lambdnp.ndarray: The per-dimension scale of the VP mixture components. Shape (D, 1).
optimize_weightsbool: Whether to optimize the weights.
optimize_mubool: Whether to optimize the means.
optimize_sigmabool: Whether to optimize sigma.
optimize_lambdbool: Whether to optimize lambd.
parameter_transformerParameterTransformer: The parameter transformer implementing transformations to/from unbounded space.
boundsdict: A dictionary containing the soft bounds for each variable to be optimized.
statsdict: A dictionary of statistics and other relevant info computed during optimization.

Notes

In VBMC, the variational posterior is defined as a mixture of multivariate normal distributions as follows:

\[q({\theta}) = \sum_{k = 1}^K w_k N(\theta, \mu_k, \sigma^2_k \Lambda)\]

where \(w_k\) are the mixture weights, \(\mu_k\) the component means, \(\sigma_k\) scaling factors, and \(\Lambda\) is a diagonal matrix common to all components with elements \(\lambda^2_d\) on the diagonal, for \(1 \le d \le D\).

Note that \(q({\theta})\) is defined in an unconstrained space. Constrained variables in the posterior are mapped to a trasformed, unconstrained space via a nonlinear mapping (represented by a ParameterTransformer object). The transformation is handled automatically.

In practice, you would almost never create a new VariationalPosterior object, simply use the variational posteriors returned by VBMC.

kl_div(self, vp2: VariationalPosterior = None, samples: ndarray = None, N: int = 100000, gauss_flag: bool = False)#

Kullback-Leibler divergence between two variational posteriors.

Compute the forward and reverse Kullback-Leibler (KL) divergence between two posteriors. The other variational posterior can be specified as vp2 (an instance of the class VariationalPosterior) or with samples. One of the two must be specified.

Parameters:

vp2VariationalPosterior, optional: The other VariationalPosterior, by default None.
samplesnp.ndarray, optional: An N-by-D matrix of samples from the other variational posterior, by default None.
Nint, optional: The number of random samples to estimate the KL divergence, by default int(1e5).
gauss_flagbool, optional: If True, returns a “Gaussianized” KL-divergence, that is the KL divergence between two multivariate normal distributions with the same moments as the variational posteriors given as inputs. By default False.

Returns:

kl_div: np.ndarray: A two-element vector containing the forward and reverse Kullback-Leibler divergence between the two posteriors.

Raises:

ValueError: Raised if neither vp2 nor samples are specified.
ValueError: Raised if vp2 is not provided but gauss_flag = False.

Notes

Since the KL divergence is not symmetric, the method returns both the forward and the reverse KL divergence, that is KL(vp1 || vp2) and KL(vp2 || vp1).

log_pdf(self, *args, **kwargs)#

log-probability density function of the variational posterior.

Compute the log density of the variational posterior at one or multiple input points. The parameters are the same as for vp.pdf() with log_flag=True. These parameters are described again here for reference.

Parameters:

xnp.ndarray: x is a matrix of inputs to evaluate the pdf at. The rows of the N-by-D matrix x correspond to observations or points, and columns correspond to variables or coordinates. x is assumed to be in the original space by default.
orig_flagbool, optional: Controls if the value of the posterior density should be evaluated in the original parameter space for orig_flag is True, or in the transformed space if orig_flag is False, by default True. Accordingly, x should be in the original space if orig_flag is True and be in the transformed space if orig_flag is False.
grad_flagbool, optional: If True the gradient of the log-pdf is returned as a second output, by default False.
dffloat, optional: Compute the log-pdf of a heavy-tailed version of the variational posterior, in which the multivariate normal components have been replaced by multivariate t-distributions with df degrees of freedom. The default is df = np.inf, limit in which the t-distribution becomes a multivariate normal.

Returns:

log_pdf: np.ndarray: The probability density of the variational posterior evaluated at each row of x.
gradient: np.ndarray: If grad_flag is True, the function returns the gradient as well.

Raises:

NotImplementedError: Raised if df is non-zero and finite and grad_flag = True (Gradient of heavy-tailed pdf not supported yet).
NotImplementedError: Raised if orig_flag = True and grad_flag = True (gradient computation in original space not supported yet).

load(file)#

Load a VP from a file.

Parameters:

filepath-like: The file name or path to write to.

Returns:

vpVariationalPosterior: The loaded VP instance.

Raises:

OSError: If the file cannot be found, or cannot be opened for other reasons.

moments(self, N: int = 1000000, orig_flag=True, cov_flag=False)#

Compute mean and covariance matrix of variational posterior.

In the original space, the moments are estimated via Monte Carlo sampling. If requested in the transformed (unconstrained) space used internally by VBMC, the moments can be computed analytically.

Parameters:

Nint, optional: Number of samples used to estimate the moments, by default int(1e6).
orig_flagbool, optional: If True, compute moments in the original parameter space, otherwise in the transformed VBMC space. By default True.
cov_flagbool, optional: If True, return the covariance matrix as a second return value, by default False.

Returns:

mean: np.ndarray: The mean of the variational posterior.
cov: np.ndarray: If cov_flag is True, returns the covariance matrix as well.

mtv(self, vp2: VariationalPosterior = None, samples: ndarray = None, N: int = 100000)#

Marginal total variation distances between two variational posteriors.

Compute the total variation distance between the variational posterior and a second posterior, separately for each dimension (hence “marginal” total variation distance, MTV). The second posterior can be specified either as a VariationalPosterior or as a set of samples.

Parameters:

vp2VariationalPosterior, optional: The other VariationalPosterior, by default None.
samplesnp.ndarray, optional: An N-by-D matrix of samples from the other variational posterior, by default None.
Nint, optional: The number of random draws to estimate the MTV, by default int(1e5).

Returns:

mtv: np.ndarray: A D-element vector whose elements are the total variation distance between the marginal distributions of vp and vp1 or samples, for each coordinate dimension.

Raises:

ValueError: Raised if neither vp2 nor samples are specified.

Notes

The total variation distance between two densities \(p_1\) and \(p_2\) is:

\[TV(p_1, p_2) = \frac{1}{2} \int | p_1(x) - p_2(x) | dx.\]

pdf(self, x: ndarray, orig_flag: bool = True, log_flag: bool = False, grad_flag: bool = False, df: float = inf)#

Probability density function of the variational posterior.

Compute the probability density function (pdf) of the variational posterior at one or multiple input points.

Parameters:

xnp.ndarray: x is a matrix of inputs to evaluate the pdf at. The rows of the N-by-D matrix x correspond to observations or points, and columns correspond to variables or coordinates. x is assumed to be in the original space by default.
orig_flagbool, optional: Controls if the value of the posterior density should be evaluated in the original parameter space for orig_flag is True, or in the transformed space if orig_flag is False, by default True. Accordingly, x should be in the original space if orig_flag is True and be in the transformed space if orig_flag is False.
log_flagbool, optional: If log_flag is True return the logarithm of the pdf, by default False.
grad_flagbool, optional: If True the gradient of the pdf is returned as a second output, by default False.
dffloat, optional: Compute the pdf of a heavy-tailed version of the variational posterior, in which the multivariate normal components have been replaced by multivariate t-distributions with df degrees of freedom. The default is df = np.inf, limit in which the t-distribution becomes a multivariate normal.

Returns:

pdf: np.ndarray: The probability density of the variational posterior evaluated at each row of x.
gradient: np.ndarray: If grad_flag is True, the function returns the gradient as well.

Raises:

NotImplementedError: Raised if df is non-zero and finite and grad_flag = True (Gradient of heavy-tailed pdf not supported yet).
NotImplementedError: Raised if orig_flag = True and log_flag = True and grad_flag = True (gradient computation of the log-pdf in the original space is not supported yet).

plot(self, n_samples: int = 100000, title: str = None, plot_data: bool = False, highlight_data: list = None, plot_vp_centres: bool = False, plot_style: dict = None, gp: GaussianProcess = None)#

Plot the variational posterior.

plot displays the variational posterior as a cornerplot showing the 1D and 2D marginals, estimated from samples. It uses the corner package.

plot also optionally displays the centres of the variational mixture components and the datapoints of the underlying Gaussian process (GP) used by VBMC. The plot can be enhanced by custom styles and specific datapoints of the GP can be highlighted.

Parameters:

n_samplesint, optional

The number of posterior samples used to create the plot, by default int(1e5).

titlestr, optional

The title of the plot, by default None.

plot_databool, optional

Whether to plot the datapoints of the GP, by default False.

highlight_datalist, optional

Indices of the GP datapoints that should be plotted in a different way than the other datapoints, by default None.

plot_vp_centresbool, optional

Whether to plot the centres of the vp components, by default False.

plot_styledict, optional

A dictionary of plot styling options. The possible options are:

cornerdict, optional: Styling options directly passed to the corner function. By default: {"fig": plt.figure(figsize=(8, 8)), "labels": labels}. See the documentation of corner.
datadict, optional: Styling options used to plot the GP data. By default: {"s":15, "color":'blue', "facecolors": "none"}.
highlight_datadict, optional: Styling options used to plot the highlighted GP data. By default: {"s":15, "color":"orange"}.
vp_centredict, optional: Styling options used to plot the vp centres. By default: {"marker":"x", "color":"red"}.

gpgpyreg.GaussianProcess, optional

The GP to use for plotting actively sampled points.

Returns:

figmatplotlib.figure.Figure: The resulting matplotlib figure of the plot.

sample(self, N: int, orig_flag: bool = True, balance_flag: bool = False, df: float = inf)#

Draw random samples from the variational posterior.

Parameters:

Nint: Number of samples to draw.
orig_flagbool, optional: If orig_flag is True, the random vectors are returned in the original parameter space. If False, they are returned in the transformed, unconstrained space used internally by VBMC. By default True.
balance_flagbool, optional: If balance_flag is True, the generating process is balanced such that the random samples come from each mixture component exactly proportionally (or as close as possible) to the variational mixture weights. If False, the generating mixture component for each sample is determined randomly, according to the mixture weights. By default False.
dffloat, optional: Generate the samples from a heavy-tailed version of the variational posterior, in which the multivariate normal components have been replaced by multivariate t-distributions with df degrees of freedom. The default is np.inf, limit in which the t-distribution becomes a multivariate normal.

Returns:

Xnp.ndarray: X is an N-by-D matrix of random vectors drawn from the variational posterior.
Inp.ndarray: I is an N-by-1 array such that the i-th element of I indicates the index of the variational mixture component from which the i-th row of X has been generated.

save(self, file, overwrite=False)#

Save the VP to a file.

Parameters:

filepath-like: The file name or path to write to. Default file extension .pkl will be added if no extension is specified.
overwritebool: Whether to allow overwriting existing files. Default False.

Raises:

FileExistsError: If the file already exists and overwrite is False.
OSError: If the file cannot be opened for other reasons (e.g., the directory is not found, the disk is full, etc.).

get_bounds(X: ndarray, options, K: int = None)[source]#

Compute soft bounds for variational posterior parameters.

These bounds are used during the variational optimization in VBMC.

Parameters:

Xndarray, shape (N, D): Training inputs.
optionsOptions: Program options.
Kint, optional: The number of mixture components. By default we use the number provided at class instantiation.

Returns:

theta_bnddict

A dictionary of soft bounds with the following elements:

lbnp.ndarray: Lower bounds.
ubnp.ndarray: Upper bounds.
tol_confloat: Fractional tolerance for constraint violation of variational parameters.
weight_thresholdfloat, optional: Threshold below which weights are penalized.
weight_penaltyfloat, optional: The penalty for weight below the threshold.

get_parameters(raw_flag=True)[source]#

Get variational posterior parameters as single array.

Return all the active VariationalPosterior parameters flattened as a 1D (numpy) array, possibly transformed.

Parameters:

raw_flagbool, optional: Specifies whether the sigma and lambda parameters are returned as raw (unconstrained) or not, by default True.

Returns:

thetanp.ndarray: The variational posterior parameters flattenend as a 1D array.

mode(orig_flag=True, n_opts: int | None = None)[source]#

Find the mode of the variational posterior.

Parameters:

orig_flagbool, optional: If True find the mode of the variational posterior in the original parameter space, otherwise in the transformed parameter space. By default True.
n_optsint, optional: Maximum number of optimization runs from different starting points to find the mode. By default n_opts is the square root of the number of mixture components K, that is \(n\_opts = \lceil \sqrt{K} \rceil\).
Returns
——-
mode: np.ndarray: The mode of the variational posterior.

Notes

Mode estimation (e.g., for the purpose of maximum-a-posteriori estimation) is not recommended with VBMC, since due to the underlying representation (mixture of Gaussians) the mode of the variational posterior is a brittle and potentially unreliable estimator of the mode of the target posterior, especially if it lies close to the boundaries of the space.

The mode is not invariant to nonlinear reparameterizations of the input space, so the mode in the original space and the mode in the transformed (unconstrained) space will generally be in different locations (even after applying the appropriate transformations).

set_parameters(theta: ndarray, raw_flag=True)[source]#

Set variational posterior parameters from a single array.

Takes as input a numpy array and assigns it to the variational posterior parameters.

Parameters:

thetanp.ndarray: The array with the parameters that should be assigned.
raw_flagbool, optional: Specifies whether the sigma and lambda parameters are passed as raw (unconstrained) or not, by default True.

Raises:

ValueError: Raised if sigma, lambda and weights are not positive and raw_flag = False.

VariationalPosterior

Contents

VariationalPosterior#

`VariationalPosterior`#