VariationalPosterior#

Note

The main output of a PyVBMC run is a VariationalPosterior object vp.

The variational posterior can be queried and manipulated via several methods, such as:

  • moments: compute mean and covariance matrix of the variational posterior;

  • pdf: evaluate the variational posterior density at a point;

  • plot: plot the variational posterior as a corner plot (1D and 2D marginals);

  • sample: draw random samples from the variational posterior.

There are also methods to compare variational posteriors:

  • kl_div: compute the Kullback-Leibler divergence between two posteriors;

  • mtv: compute the marginal total variation distance between two posteriors.

See below for the full VariationalPosterior class methods and interface.

class pyvbmc.variational_posterior.VariationalPosterior(D: int, K: int = 2, x0=None, parameter_transformer=None)[source]#

The variational posterior class used in PyVBMC.

The variational posterior represents the approximate posterior as returned by the Variational Bayesian Monte Carlo (VBMC) algorithm.

In VBMC, the variational posterior is a weighted mixture of multivariate normal distributions (see Notes below for details).

Parameters:
Dint

The number of dimensions (i.e., parameters) of the posterior.

Kint, optional

The number of mixture components, default 2.

x0np.ndarray, optional

The starting vector for the mixture components means. It can be a single array or multiple rows (up to K); missing rows are duplicated by making copies of x0, default np.zeros.

parameter_transformerParameterTransformer, optional

The ParameterTransformer object specifying the transformation of the input space that leads to the current representation used by the variational posterior, by default uses an identity transform.

Notes

In VBMC, the variational posterior is defined as a mixture of multivariate normal distributions as follows:

\[q({\theta}) = \sum_{k = 1}^K w_k N(\theta, \mu_k, \sigma^2_k \Lambda)\]

where \(w_k\) are the mixture weights, \(\mu_k\) the component means, \(\sigma_k\) scaling factors, and \(\Lambda\) is a diagonal matrix common to all components with elements \(\lambda^2_d\) on the diagonal, for \(1 \le d \le D\).

Note that \(q({\theta})\) is defined in an unconstrained space. Constrained variables in the posterior are mapped to a trasformed, unconstrained space via a nonlinear mapping (represented by a ParameterTransformer object). The transformation is handled automatically.

In practice, you would almost never create a new VariationalPosterior object, simply use the variational posteriors returned by VBMC.

Attributes:
wnp.ndarray

The weights of the VP mixture components, shape (1, K).

etanp.ndarray

The unbounded (softmax) parametrization of the VP mixture components, shape (1, K).

munp.ndarray

The means of the VP mixture components, shape (D, K).

sigmanp.ndarray

The per-component scale of the VP mixture components. Shape (1, K).

lambdnp.ndarray

The per-dimension scale of the VP mixture components. Shape (D, 1).

optimize_weightsbool

Whether to optimize the weights.

optimize_mubool

Whether to optimize the means.

optimize_sigmabool

Whether to optimize sigma.

optimize_lambdbool

Whether to optimize lambd.

parameter_transformerParameterTransformer

The parameter transformer implementing transformations to/from unbounded space.

boundsdict

A dictionary containing the soft bounds for each variable to be optimized.

statsdict

A dictionary of statistics and other relevant info computed during optimization.

kl_div(self, vp2: VariationalPosterior = None, samples: ndarray = None, N: int = 100000, gauss_flag: bool = False)#

Kullback-Leibler divergence between two variational posteriors.

Compute the forward and reverse Kullback-Leibler (KL) divergence between two posteriors. The other variational posterior can be specified as vp2 (an instance of the class VariationalPosterior) or with samples. One of the two must be specified.

Parameters:
vp2VariationalPosterior, optional

The other VariationalPosterior, by default None.

samplesnp.ndarray, optional

An N-by-D matrix of samples from the other variational posterior, by default None.

Nint, optional

The number of random samples to estimate the KL divergence, by default int(1e5).

gauss_flagbool, optional

If True, returns a “Gaussianized” KL-divergence, that is the KL divergence between two multivariate normal distributions with the same moments as the variational posteriors given as inputs. By default False.

Returns:
kl_div: np.ndarray

A two-element vector containing the forward and reverse Kullback-Leibler divergence between the two posteriors.

Raises:
ValueError

Raised if neither vp2 nor samples are specified.

ValueError

Raised if vp2 is not provided but gauss_flag = False.

Notes

Since the KL divergence is not symmetric, the method returns both the forward and the reverse KL divergence, that is KL(vp1 || vp2) and KL(vp2 || vp1).

log_pdf(self, *args, **kwargs)#

log-probability density function of the variational posterior.

Compute the log density of the variational posterior at one or multiple input points. The parameters are the same as for vp.pdf() with log_flag=True. These parameters are described again here for reference.

Parameters:
xnp.ndarray

x is a matrix of inputs to evaluate the pdf at. The rows of the N-by-D matrix x correspond to observations or points, and columns correspond to variables or coordinates. x is assumed to be in the original space by default.

orig_flagbool, optional

Controls if the value of the posterior density should be evaluated in the original parameter space for orig_flag is True, or in the transformed space if orig_flag is False, by default True. Accordingly, x should be in the original space if orig_flag is True and be in the transformed space if orig_flag is False.

grad_flagbool, optional

If True the gradient of the log-pdf is returned as a second output, by default False.

dffloat, optional

Compute the log-pdf of a heavy-tailed version of the variational posterior, in which the multivariate normal components have been replaced by multivariate t-distributions with df degrees of freedom. The default is df = np.inf, limit in which the t-distribution becomes a multivariate normal.

Returns:
log_pdf: np.ndarray

The probability density of the variational posterior evaluated at each row of x.

gradient: np.ndarray

If grad_flag is True, the function returns the gradient as well.

Raises:
NotImplementedError

Raised if df is non-zero and finite and grad_flag = True (Gradient of heavy-tailed pdf not supported yet).

NotImplementedError

Raised if orig_flag = True and grad_flag = True (gradient computation in original space not supported yet).

load(file)#

Load a VP from a file.

Parameters:
filepath-like

The file name or path to write to.

Returns:
vpVariationalPosterior

The loaded VP instance.

Raises:
OSError

If the file cannot be found, or cannot be opened for other reasons.

moments(self, N: int = 1000000, orig_flag=True, cov_flag=False)#

Compute mean and covariance matrix of variational posterior.

In the original space, the moments are estimated via Monte Carlo sampling. If requested in the transformed (unconstrained) space used internally by VBMC, the moments can be computed analytically.

Parameters:
Nint, optional

Number of samples used to estimate the moments, by default int(1e6).

orig_flagbool, optional

If True, compute moments in the original parameter space, otherwise in the transformed VBMC space. By default True.

cov_flagbool, optional

If True, return the covariance matrix as a second return value, by default False.

Returns:
mean: np.ndarray

The mean of the variational posterior.

cov: np.ndarray

If cov_flag is True, returns the covariance matrix as well.

mtv(self, vp2: VariationalPosterior = None, samples: ndarray = None, N: int = 100000)#

Marginal total variation distances between two variational posteriors.

Compute the total variation distance between the variational posterior and a second posterior, separately for each dimension (hence “marginal” total variation distance, MTV). The second posterior can be specified either as a VariationalPosterior or as a set of samples.

Parameters:
vp2VariationalPosterior, optional

The other VariationalPosterior, by default None.

samplesnp.ndarray, optional

An N-by-D matrix of samples from the other variational posterior, by default None.

Nint, optional

The number of random draws to estimate the MTV, by default int(1e5).

Returns:
mtv: np.ndarray

A D-element vector whose elements are the total variation distance between the marginal distributions of vp and vp1 or samples, for each coordinate dimension.

Raises:
ValueError

Raised if neither vp2 nor samples are specified.

Notes

The total variation distance between two densities \(p_1\) and \(p_2\) is:

\[TV(p_1, p_2) = \frac{1}{2} \int | p_1(x) - p_2(x) | dx.\]
pdf(self, x: ndarray, orig_flag: bool = True, log_flag: bool = False, grad_flag: bool = False, df: float = inf)#

Probability density function of the variational posterior.

Compute the probability density function (pdf) of the variational posterior at one or multiple input points.

Parameters:
xnp.ndarray

x is a matrix of inputs to evaluate the pdf at. The rows of the N-by-D matrix x correspond to observations or points, and columns correspond to variables or coordinates. x is assumed to be in the original space by default.

orig_flagbool, optional

Controls if the value of the posterior density should be evaluated in the original parameter space for orig_flag is True, or in the transformed space if orig_flag is False, by default True. Accordingly, x should be in the original space if orig_flag is True and be in the transformed space if orig_flag is False.

log_flagbool, optional

If log_flag is True return the logarithm of the pdf, by default False.

grad_flagbool, optional

If True the gradient of the pdf is returned as a second output, by default False.

dffloat, optional

Compute the pdf of a heavy-tailed version of the variational posterior, in which the multivariate normal components have been replaced by multivariate t-distributions with df degrees of freedom. The default is df = np.inf, limit in which the t-distribution becomes a multivariate normal.

Returns:
pdf: np.ndarray

The probability density of the variational posterior evaluated at each row of x.

gradient: np.ndarray

If grad_flag is True, the function returns the gradient as well.

Raises:
NotImplementedError

Raised if df is non-zero and finite and grad_flag = True (Gradient of heavy-tailed pdf not supported yet).

NotImplementedError

Raised if orig_flag = True and log_flag = True and grad_flag = True (gradient computation of the log-pdf in the original space is not supported yet).

plot(self, n_samples: int = 100000, title: str = None, plot_data: bool = False, highlight_data: list = None, plot_vp_centres: bool = False, plot_style: dict = None, gp: GaussianProcess = None)#

Plot the variational posterior.

plot displays the variational posterior as a cornerplot showing the 1D and 2D marginals, estimated from samples. It uses the corner package.

plot also optionally displays the centres of the variational mixture components and the datapoints of the underlying Gaussian process (GP) used by VBMC. The plot can be enhanced by custom styles and specific datapoints of the GP can be highlighted.

Parameters:
n_samplesint, optional

The number of posterior samples used to create the plot, by default int(1e5).

titlestr, optional

The title of the plot, by default None.

plot_databool, optional

Whether to plot the datapoints of the GP, by default False.

highlight_datalist, optional

Indices of the GP datapoints that should be plotted in a different way than the other datapoints, by default None.

plot_vp_centresbool, optional

Whether to plot the centres of the vp components, by default False.

plot_styledict, optional
A dictionary of plot styling options. The possible options are:
cornerdict, optional

Styling options directly passed to the corner function. By default: {"fig": plt.figure(figsize=(8, 8)), "labels": labels}. See the documentation of corner.

datadict, optional

Styling options used to plot the GP data. By default: {"s":15, "color":'blue', "facecolors": "none"}.

highlight_datadict, optional

Styling options used to plot the highlighted GP data. By default: {"s":15, "color":"orange"}.

vp_centredict, optional

Styling options used to plot the vp centres. By default: {"marker":"x", "color":"red"}.

gpgpyreg.GaussianProcess, optional

The GP to use for plotting actively sampled points.

Returns:
figmatplotlib.figure.Figure

The resulting matplotlib figure of the plot.

sample(self, N: int, orig_flag: bool = True, balance_flag: bool = False, df: float = inf)#

Draw random samples from the variational posterior.

Parameters:
Nint

Number of samples to draw.

orig_flagbool, optional

If orig_flag is True, the random vectors are returned in the original parameter space. If False, they are returned in the transformed, unconstrained space used internally by VBMC. By default True.

balance_flagbool, optional

If balance_flag is True, the generating process is balanced such that the random samples come from each mixture component exactly proportionally (or as close as possible) to the variational mixture weights. If False, the generating mixture component for each sample is determined randomly, according to the mixture weights. By default False.

dffloat, optional

Generate the samples from a heavy-tailed version of the variational posterior, in which the multivariate normal components have been replaced by multivariate t-distributions with df degrees of freedom. The default is np.inf, limit in which the t-distribution becomes a multivariate normal.

Returns:
Xnp.ndarray

X is an N-by-D matrix of random vectors drawn from the variational posterior.

Inp.ndarray

I is an N-by-1 array such that the i-th element of I indicates the index of the variational mixture component from which the i-th row of X has been generated.

save(self, file, overwrite=False)#

Save the VP to a file.

Parameters:
filepath-like

The file name or path to write to. Default file extension .pkl will be added if no extension is specified.

overwritebool

Whether to allow overwriting existing files. Default False.

Raises:
FileExistsError

If the file already exists and overwrite is False.

OSError

If the file cannot be opened for other reasons (e.g., the directory is not found, the disk is full, etc.).

get_bounds(X: ndarray, options, K: int = None)[source]#

Compute soft bounds for variational posterior parameters.

These bounds are used during the variational optimization in VBMC.

Parameters:
Xndarray, shape (N, D)

Training inputs.

optionsOptions

Program options.

Kint, optional

The number of mixture components. By default we use the number provided at class instantiation.

Returns:
theta_bnddict
A dictionary of soft bounds with the following elements:
lbnp.ndarray

Lower bounds.

ubnp.ndarray

Upper bounds.

tol_confloat

Fractional tolerance for constraint violation of variational parameters.

weight_thresholdfloat, optional

Threshold below which weights are penalized.

weight_penaltyfloat, optional

The penalty for weight below the threshold.

get_parameters(raw_flag=True)[source]#

Get variational posterior parameters as single array.

Return all the active VariationalPosterior parameters flattened as a 1D (numpy) array, possibly transformed.

Parameters:
raw_flagbool, optional

Specifies whether the sigma and lambda parameters are returned as raw (unconstrained) or not, by default True.

Returns:
thetanp.ndarray

The variational posterior parameters flattenend as a 1D array.

mode(orig_flag=True, n_opts: int | None = None)[source]#

Find the mode of the variational posterior.

Parameters:
orig_flagbool, optional

If True find the mode of the variational posterior in the original parameter space, otherwise in the transformed parameter space. By default True.

n_optsint, optional

Maximum number of optimization runs from different starting points to find the mode. By default n_opts is the square root of the number of mixture components K, that is \(n\_opts = \lceil \sqrt{K} \rceil\).

Returns
——-
mode: np.ndarray

The mode of the variational posterior.

Notes

Mode estimation (e.g., for the purpose of maximum-a-posteriori estimation) is not recommended with VBMC, since due to the underlying representation (mixture of Gaussians) the mode of the variational posterior is a brittle and potentially unreliable estimator of the mode of the target posterior, especially if it lies close to the boundaries of the space.

The mode is not invariant to nonlinear reparameterizations of the input space, so the mode in the original space and the mode in the transformed (unconstrained) space will generally be in different locations (even after applying the appropriate transformations).

set_parameters(theta: ndarray, raw_flag=True)[source]#

Set variational posterior parameters from a single array.

Takes as input a numpy array and assigns it to the variational posterior parameters.

Parameters:
thetanp.ndarray

The array with the parameters that should be assigned.

raw_flagbool, optional

Specifies whether the sigma and lambda parameters are passed as raw (unconstrained) or not, by default True.

Raises:
ValueError

Raised if sigma, lambda and weights are not positive and raw_flag = False.