VariationalPosterior
#
Note
The main output of a PyVBMC run is a VariationalPosterior
object vp
.
The variational posterior can be queried and manipulated via several methods, such as:
moments
: compute mean and covariance matrix of the variational posterior;pdf
: evaluate the variational posterior density at a point;plot
: plot the variational posterior as a corner plot (1D and 2D marginals);sample
: draw random samples from the variational posterior.
There are also methods to compare variational posteriors:
kl_div
: compute the Kullback-Leibler divergence between two posteriors;mtv
: compute the marginal total variation distance between two posteriors.
See below for the full VariationalPosterior
class methods and interface.
- class pyvbmc.variational_posterior.VariationalPosterior(D: int, K: int = 2, x0=None, parameter_transformer=None)[source]#
The variational posterior class used in PyVBMC.
The variational posterior represents the approximate posterior as returned by the Variational Bayesian Monte Carlo (VBMC) algorithm.
In VBMC, the variational posterior is a weighted mixture of multivariate normal distributions (see Notes below for details).
- Parameters:
- Dint
The number of dimensions (i.e., parameters) of the posterior.
- Kint, optional
The number of mixture components, default 2.
- x0np.ndarray, optional
The starting vector for the mixture components means. It can be a single array or multiple rows (up to K); missing rows are duplicated by making copies of x0, default
np.zeros
.- parameter_transformerParameterTransformer, optional
The
ParameterTransformer
object specifying the transformation of the input space that leads to the current representation used by the variational posterior, by default uses an identity transform.
- Attributes:
- wnp.ndarray
The weights of the VP mixture components, shape
(1, K)
.- etanp.ndarray
The unbounded (softmax) parametrization of the VP mixture components, shape
(1, K)
.- munp.ndarray
The means of the VP mixture components, shape
(D, K)
.- sigmanp.ndarray
The per-component scale of the VP mixture components. Shape
(1, K)
.- lambdnp.ndarray
The per-dimension scale of the VP mixture components. Shape
(D, 1)
.- optimize_weightsbool
Whether to optimize the weights.
- optimize_mubool
Whether to optimize the means.
- optimize_sigmabool
Whether to optimize
sigma
.- optimize_lambdbool
Whether to optimize
lambd
.- parameter_transformerParameterTransformer
The parameter transformer implementing transformations to/from unbounded space.
- boundsdict
A dictionary containing the soft bounds for each variable to be optimized.
- statsdict
A dictionary of statistics and other relevant info computed during optimization.
Notes
In VBMC, the variational posterior is defined as a mixture of multivariate normal distributions as follows:
\[q({\theta}) = \sum_{k = 1}^K w_k N(\theta, \mu_k, \sigma^2_k \Lambda)\]where \(w_k\) are the mixture weights, \(\mu_k\) the component means, \(\sigma_k\) scaling factors, and \(\Lambda\) is a diagonal matrix common to all components with elements \(\lambda^2_d\) on the diagonal, for \(1 \le d \le D\).
Note that \(q({\theta})\) is defined in an unconstrained space. Constrained variables in the posterior are mapped to a trasformed, unconstrained space via a nonlinear mapping (represented by a
ParameterTransformer
object). The transformation is handled automatically.In practice, you would almost never create a new
VariationalPosterior
object, simply use the variational posteriors returned byVBMC
.- kl_div(self, vp2: VariationalPosterior = None, samples: ndarray = None, N: int = 100000, gauss_flag: bool = False)#
Kullback-Leibler divergence between two variational posteriors.
Compute the forward and reverse Kullback-Leibler (KL) divergence between two posteriors. The other variational posterior can be specified as vp2 (an instance of the class
VariationalPosterior
) or with samples. One of the two must be specified.- Parameters:
- vp2VariationalPosterior, optional
The other
VariationalPosterior
, by default None.- samplesnp.ndarray, optional
An N-by-D matrix of samples from the other variational posterior, by default
None
.- Nint, optional
The number of random samples to estimate the KL divergence, by default
int(1e5)
.- gauss_flagbool, optional
If
True
, returns a “Gaussianized” KL-divergence, that is the KL divergence between two multivariate normal distributions with the same moments as the variational posteriors given as inputs. By defaultFalse
.
- Returns:
- kl_div: np.ndarray
A two-element vector containing the forward and reverse Kullback-Leibler divergence between the two posteriors.
- Raises:
- ValueError
Raised if neither vp2 nor samples are specified.
- ValueError
Raised if vp2 is not provided but gauss_flag =
False
.
Notes
Since the KL divergence is not symmetric, the method returns both the forward and the reverse KL divergence, that is KL(vp1 || vp2) and KL(vp2 || vp1).
- log_pdf(self, *args, **kwargs)#
log-probability density function of the variational posterior.
Compute the log density of the variational posterior at one or multiple input points. The parameters are the same as for
vp.pdf()
withlog_flag=True
. These parameters are described again here for reference.- Parameters:
- xnp.ndarray
x is a matrix of inputs to evaluate the pdf at. The rows of the N-by-D matrix x correspond to observations or points, and columns correspond to variables or coordinates. x is assumed to be in the original space by default.
- orig_flagbool, optional
Controls if the value of the posterior density should be evaluated in the original parameter space for orig_flag is
True
, or in the transformed space if orig_flag isFalse
, by defaultTrue
. Accordingly, x should be in the original space if orig_flag isTrue
and be in the transformed space if orig_flag is False.- grad_flagbool, optional
If
True
the gradient of the log-pdf is returned as a second output, by defaultFalse
.- dffloat, optional
Compute the log-pdf of a heavy-tailed version of the variational posterior, in which the multivariate normal components have been replaced by multivariate t-distributions with df degrees of freedom. The default is df =
np.inf
, limit in which the t-distribution becomes a multivariate normal.
- Returns:
- log_pdf: np.ndarray
The probability density of the variational posterior evaluated at each row of x.
- gradient: np.ndarray
If grad_flag is
True
, the function returns the gradient as well.
- Raises:
- NotImplementedError
Raised if df is non-zero and finite and grad_flag =
True
(Gradient of heavy-tailed pdf not supported yet).- NotImplementedError
Raised if orig_flag =
True
and grad_flag =True
(gradient computation in original space not supported yet).
- load(file)#
Load a VP from a file.
- Parameters:
- filepath-like
The file name or path to write to.
- Returns:
- vpVariationalPosterior
The loaded VP instance.
- Raises:
- OSError
If the file cannot be found, or cannot be opened for other reasons.
- moments(self, N: int = 1000000, orig_flag=True, cov_flag=False)#
Compute mean and covariance matrix of variational posterior.
In the original space, the moments are estimated via Monte Carlo sampling. If requested in the transformed (unconstrained) space used internally by VBMC, the moments can be computed analytically.
- Parameters:
- Nint, optional
Number of samples used to estimate the moments, by default
int(1e6)
.- orig_flagbool, optional
If
True
, compute moments in the original parameter space, otherwise in the transformed VBMC space. By defaultTrue
.- cov_flagbool, optional
If
True
, return the covariance matrix as a second return value, by defaultFalse
.
- Returns:
- mean: np.ndarray
The mean of the variational posterior.
- cov: np.ndarray
If cov_flag is
True
, returns the covariance matrix as well.
- mtv(self, vp2: VariationalPosterior = None, samples: ndarray = None, N: int = 100000)#
Marginal total variation distances between two variational posteriors.
Compute the total variation distance between the variational posterior and a second posterior, separately for each dimension (hence “marginal” total variation distance, MTV). The second posterior can be specified either as a
VariationalPosterior
or as a set of samples.- Parameters:
- vp2VariationalPosterior, optional
The other
VariationalPosterior
, by defaultNone
.- samplesnp.ndarray, optional
An N-by-D matrix of samples from the other variational posterior, by default
None
.- Nint, optional
The number of random draws to estimate the MTV, by default
int(1e5)
.
- Returns:
- mtv: np.ndarray
A D-element vector whose elements are the total variation distance between the marginal distributions of vp and vp1 or samples, for each coordinate dimension.
- Raises:
- ValueError
Raised if neither vp2 nor samples are specified.
Notes
The total variation distance between two densities \(p_1\) and \(p_2\) is:
\[TV(p_1, p_2) = \frac{1}{2} \int | p_1(x) - p_2(x) | dx.\]
- pdf(self, x: ndarray, orig_flag: bool = True, log_flag: bool = False, grad_flag: bool = False, df: float = inf)#
Probability density function of the variational posterior.
Compute the probability density function (pdf) of the variational posterior at one or multiple input points.
- Parameters:
- xnp.ndarray
x is a matrix of inputs to evaluate the pdf at. The rows of the N-by-D matrix x correspond to observations or points, and columns correspond to variables or coordinates. x is assumed to be in the original space by default.
- orig_flagbool, optional
Controls if the value of the posterior density should be evaluated in the original parameter space for orig_flag is
True
, or in the transformed space if orig_flag isFalse
, by defaultTrue
. Accordingly, x should be in the original space if orig_flag isTrue
and be in the transformed space if orig_flag is False.- log_flagbool, optional
If log_flag is
True
return the logarithm of the pdf, by defaultFalse
.- grad_flagbool, optional
If
True
the gradient of the pdf is returned as a second output, by defaultFalse
.- dffloat, optional
Compute the pdf of a heavy-tailed version of the variational posterior, in which the multivariate normal components have been replaced by multivariate t-distributions with df degrees of freedom. The default is df =
np.inf
, limit in which the t-distribution becomes a multivariate normal.
- Returns:
- pdf: np.ndarray
The probability density of the variational posterior evaluated at each row of x.
- gradient: np.ndarray
If grad_flag is
True
, the function returns the gradient as well.
- Raises:
- NotImplementedError
Raised if df is non-zero and finite and grad_flag =
True
(Gradient of heavy-tailed pdf not supported yet).- NotImplementedError
Raised if orig_flag =
True
and log_flag =True
and grad_flag =True
(gradient computation of the log-pdf in the original space is not supported yet).
- plot(self, n_samples: int = 100000, title: str = None, plot_data: bool = False, highlight_data: list = None, plot_vp_centres: bool = False, plot_style: dict = None, gp: GaussianProcess = None)#
Plot the variational posterior.
plot displays the variational posterior as a cornerplot showing the 1D and 2D marginals, estimated from samples. It uses the corner package.
plot also optionally displays the centres of the variational mixture components and the datapoints of the underlying Gaussian process (GP) used by
VBMC
. The plot can be enhanced by custom styles and specific datapoints of the GP can be highlighted.- Parameters:
- n_samplesint, optional
The number of posterior samples used to create the plot, by default
int(1e5)
.- titlestr, optional
The title of the plot, by default
None
.- plot_databool, optional
Whether to plot the datapoints of the GP, by default
False
.- highlight_datalist, optional
Indices of the GP datapoints that should be plotted in a different way than the other datapoints, by default
None
.- plot_vp_centresbool, optional
Whether to plot the centres of the vp components, by default
False
.- plot_styledict, optional
- A dictionary of plot styling options. The possible options are:
- cornerdict, optional
Styling options directly passed to the corner function. By default:
{"fig": plt.figure(figsize=(8, 8)), "labels": labels}
. See the documentation of corner.- datadict, optional
Styling options used to plot the GP data. By default:
{"s":15, "color":'blue', "facecolors": "none"}
.- highlight_datadict, optional
Styling options used to plot the highlighted GP data. By default:
{"s":15, "color":"orange"}
.- vp_centredict, optional
Styling options used to plot the vp centres. By default:
{"marker":"x", "color":"red"}
.
- gpgpyreg.GaussianProcess, optional
The GP to use for plotting actively sampled points.
- Returns:
- figmatplotlib.figure.Figure
The resulting
matplotlib
figure of the plot.
- sample(self, N: int, orig_flag: bool = True, balance_flag: bool = False, df: float = inf)#
Draw random samples from the variational posterior.
- Parameters:
- Nint
Number of samples to draw.
- orig_flagbool, optional
If orig_flag is
True
, the random vectors are returned in the original parameter space. IfFalse
, they are returned in the transformed, unconstrained space used internally by VBMC. By defaultTrue
.- balance_flagbool, optional
If balance_flag is
True
, the generating process is balanced such that the random samples come from each mixture component exactly proportionally (or as close as possible) to the variational mixture weights. IfFalse
, the generating mixture component for each sample is determined randomly, according to the mixture weights. By defaultFalse
.- dffloat, optional
Generate the samples from a heavy-tailed version of the variational posterior, in which the multivariate normal components have been replaced by multivariate t-distributions with df degrees of freedom. The default is
np.inf
, limit in which the t-distribution becomes a multivariate normal.
- Returns:
- Xnp.ndarray
X is an N-by-D matrix of random vectors drawn from the variational posterior.
- Inp.ndarray
I is an N-by-1 array such that the i-th element of I indicates the index of the variational mixture component from which the i-th row of X has been generated.
- save(self, file, overwrite=False)#
Save the VP to a file.
- Parameters:
- filepath-like
The file name or path to write to. Default file extension .pkl will be added if no extension is specified.
- overwritebool
Whether to allow overwriting existing files. Default False.
- Raises:
- FileExistsError
If the file already exists and
overwrite
is False.- OSError
If the file cannot be opened for other reasons (e.g., the directory is not found, the disk is full, etc.).
- get_bounds(X: ndarray, options, K: int = None)[source]#
Compute soft bounds for variational posterior parameters.
These bounds are used during the variational optimization in
VBMC
.- Parameters:
- Xndarray, shape (N, D)
Training inputs.
- optionsOptions
Program options.
- Kint, optional
The number of mixture components. By default we use the number provided at class instantiation.
- Returns:
- theta_bnddict
- A dictionary of soft bounds with the following elements:
- lbnp.ndarray
Lower bounds.
- ubnp.ndarray
Upper bounds.
- tol_confloat
Fractional tolerance for constraint violation of variational parameters.
- weight_thresholdfloat, optional
Threshold below which weights are penalized.
- weight_penaltyfloat, optional
The penalty for weight below the threshold.
- get_parameters(raw_flag=True)[source]#
Get variational posterior parameters as single array.
Return all the active
VariationalPosterior
parameters flattened as a 1D (numpy) array, possibly transformed.- Parameters:
- raw_flagbool, optional
Specifies whether the sigma and lambda parameters are returned as raw (unconstrained) or not, by default
True
.
- Returns:
- thetanp.ndarray
The variational posterior parameters flattenend as a 1D array.
- mode(orig_flag=True, n_opts: int | None = None)[source]#
Find the mode of the variational posterior.
- Parameters:
- orig_flagbool, optional
If
True
find the mode of the variational posterior in the original parameter space, otherwise in the transformed parameter space. By defaultTrue
.- n_optsint, optional
Maximum number of optimization runs from different starting points to find the mode. By default n_opts is the square root of the number of mixture components K, that is \(n\_opts = \lceil \sqrt{K} \rceil\).
- Returns
- ——-
- mode: np.ndarray
The mode of the variational posterior.
Notes
Mode estimation (e.g., for the purpose of maximum-a-posteriori estimation) is not recommended with VBMC, since due to the underlying representation (mixture of Gaussians) the mode of the variational posterior is a brittle and potentially unreliable estimator of the mode of the target posterior, especially if it lies close to the boundaries of the space.
The mode is not invariant to nonlinear reparameterizations of the input space, so the mode in the original space and the mode in the transformed (unconstrained) space will generally be in different locations (even after applying the appropriate transformations).
- set_parameters(theta: ndarray, raw_flag=True)[source]#
Set variational posterior parameters from a single array.
Takes as input a
numpy
array and assigns it to the variational posterior parameters.- Parameters:
- thetanp.ndarray
The array with the parameters that should be assigned.
- raw_flagbool, optional
Specifies whether the sigma and lambda parameters are passed as raw (unconstrained) or not, by default
True
.
- Raises:
- ValueError
Raised if sigma, lambda and weights are not positive and raw_flag =
False
.