PyVBMC Example 3: Output diagnostics and saving results#

In this notebook, we demonstrate extended usage of PyVBMC. We will take a brief look at PyVBMC’s diagnostic output, and show you how to save the results of optimization to disk.

This notebook is Part 3 of a series of notebooks in which we present various example usages for VBMC with the PyVBMC package. The code used in this example is available as a script here.

import numpy as np
import scipy.stats as scs
from scipy.optimize import minimize
from pyvbmc import VBMC
from pyvbmc.formatting import format_dict

1. Model definition and setup#

For demonstration purposes, we will run PyVBMC with a restricted budget of function evaluations, insufficient to achieve convergence. Then we will inspect the output diagnostics, and resume optimization.

We use a higher-dimensional analogue of the toy target function from Example 1: a broad Rosenbrock’s banana function in \(D = 4\).

D = 4  # A four-dimensional problem
prior_mu = np.zeros(D)
prior_var = 3 * np.ones(D)


def log_prior(theta):
    """Multivariate normal prior on theta."""
    cov = np.diag(prior_var)
    return scs.multivariate_normal(prior_mu, cov).logpdf(theta)

The likelihood function of your model will in general depend on the observed data. This data can be fixed as a global variable, as we did directly above for prior_mu and prior_var. It can also be defined by a default second argument: to PyVBMC there is no difference so long as the function can be called with only a single argument (the parameters theta):

def log_likelihood(theta, data=np.ones(D)):
    """D-dimensional Rosenbrock's banana function."""
    # In this simple demo the data just translates the parameters:
    theta = np.atleast_2d(theta)
    theta = theta + data

    x, y = theta[:, :-1], theta[:, 1:]
    return -np.sum((x**2 - y) ** 2 + (x - 1) ** 2 / 100, axis=1)


def log_joint(theta, data=np.ones(D)):
    """log-density of the joint distribution."""
    return log_likelihood(theta, data) + log_prior(theta)

LB = np.full(D, -np.inf)  # Lower bounds
UB = np.full(D, np.inf)  # Upper bounds
PLB = np.full(D, prior_mu - np.sqrt(prior_var))  # Plausible lower bounds
PUB = np.full(D, prior_mu + np.sqrt(prior_var))  # Plausible upper bounds

In a typical inference scenario, we recommend starting from a “good” point (i.e. one near the mode). A quick preliminary optimization is enough to find one, though a more extensive optimization would not hurt.

np.random.seed(41)
x0 = np.random.uniform(PLB, PUB)  # Random point inside plausible box
x0 = minimize(
    lambda t: -log_joint(t),
    x0,
    bounds=[
        (-np.inf, np.inf),
        (-np.inf, np.inf),
        (-np.inf, np.inf),
        (-np.inf, np.inf),
    ],
).x
np.random.seed(42)
# Limit number of function evaluations
options = {
    "max_fun_evals": 10 * D,
}
# We can specify either the log-joint, or the log-likelihood and log-prior.
# In other words, the following lines are equivalent:
vbmc = VBMC(
    log_likelihood,
    x0,
    LB,
    UB,
    PLB,
    PUB,
    options=options,
    log_prior=log_prior,
)
# vbmc = VBMC(
#     log_joint,
#     x0, LB, UB, PLB, PUB, options=options,
# )
Reshaping x0 to row vector.
Reshaping lower bounds to (1, 4).
Reshaping upper bounds to (1, 4).
Reshaping plausible lower bounds to (1, 4).
Reshaping plausible upper bounds to (1, 4).

(PyVBMC expects the bounds to be (1, D) row vectors, and the initial point(s) to be of shape (n, D), but it will accept and re-shape vectors of shape (D,) as well.)
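
If you prefer to avoid the reshaping messages, you can pass arrays with the expected shapes directly. Below is a minimal, optional sketch; the `_2d` variable names are just illustrative, and the values are identical to those above — only the shapes change:

# Pass bounds as (1, D) row vectors and the starting point as an (n, D) array,
# so PyVBMC does not need to reshape them.
x0_2d = np.atleast_2d(x0)                                # shape (1, D)
LB_2d, UB_2d = LB.reshape(1, -1), UB.reshape(1, -1)      # shape (1, D)
PLB_2d, PUB_2d = PLB.reshape(1, -1), PUB.reshape(1, -1)  # shape (1, D)
# vbmc = VBMC(log_joint, x0_2d, LB_2d, UB_2d, PLB_2d, PUB_2d, options=options)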

2. Running the model and checking convergence diagnostics#

Now we run PyVBMC with a very small budget of 40 function evaluations:

vp, results = vbmc.optimize()
Beginning variational optimization assuming EXACT observations of the log-joint.
 Iteration  f-count    Mean[ELBO]    Std[ELBO]    sKL-iter[q]   K[q]  Convergence  Action
     0         10          -3.64         1.20    230538.53        2        inf     start warm-up
     1         15          -3.40         2.05        15.56        2        inf     
     2         20          -3.42         2.08        14.12        2        242     
     3         25          35.04        70.57      2305.69        2   3.88e+04     
     4         30          39.27        71.14      4950.88        2   8.28e+04     trim data
     5         35          -3.39         1.05       227.02        2   3.93e+03     
     6         40           3.33        12.07        27.91        2        528     
   inf         40          -3.13         1.00         0.20       50   3.93e+03     finalize
Inference terminated: reached maximum number of function evaluations options.max_fun_evals.
Estimated ELBO: -3.130 +/-0.995.
Caution: Returned variational solution may have not converged.

PyVBMC is warning us that convergence is doubtful. We can look at the output for more information and diagnostics.

print(results["success_flag"])
False

False means that PyVBMC has not converged to a stable solution within the given number of function evaluations.

print(format_dict(results))
{
    'function': '<function VBMC._init_log_joint.<locals>.log_joint at 0x7f00ea0d79d0>',
    'problem_type': 'unconstrained',
    'iterations': 6,
    'func_count': 40,
    'best_iter': 5,
    'train_set_size': 31,
    'components': 50,
    'r_index': 3929.341498249543,
    'convergence_status': 'no',
    'overhead': nan,
    'rng_state': 'rng',
    'algorithm': 'Variational Bayesian Monte Carlo',
    'version': '0.1.0',
    'message': 'Inference terminated: reached maximum number of function evaluations options.max_fun_evals.',
    'elbo': -3.129758475814559,
    'elbo_sd': 0.995262203269921,
    'success_flag': False,
}

In the results dictionary:

  • the convergence_status field says ‘no’ (probable lack of convergence);

  • the reliability index r_index is about 3929, whereas it should be less than 1 for a stable solution. Our diagnostics tell us that this run has not converged, and suggest increasing the budget of function evaluations.

Note that convergence to a solution does not mean that it is a good solution. You should always check the returned variational posterior, and ideally compare the results of multiple runs of PyVBMC.
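
If you run PyVBMC as part of a larger pipeline, these diagnostics can also be checked programmatically. Below is a minimal sketch; the helper name check_convergence is ours, and the threshold of 1 for r_index is the rule of thumb mentioned above:

def check_convergence(results, r_index_threshold=1.0):
    """Return True if the run looks converged, based on the results dict."""
    converged = (
        results["success_flag"]
        and results["convergence_status"] == "probable"
        and results["r_index"] < r_index_threshold
    )
    if not converged:
        print(
            f"Possible lack of convergence (r_index = {results['r_index']:.2f}, "
            f"convergence_status = '{results['convergence_status']}'). "
            "Consider increasing the budget of function evaluations."
        )
    return converged


check_convergence(results)  # False for the run above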

3. Saving results#

We can also save the VBMC instance to disk and reload it later, in order to check the results and convergence diagnostics, sample from the posterior, or resume the optimization from a checkpoint, and so on. If you are only interested in the final (best) variational solution, as opposed to the full iteration history of the optimization, then you may wish to save only the final VariationalPosterior instead.

Note: Some complex attributes of the VBMC instance — such as the stored function(s) representing the log joint density — may not behave as expected when saved and loaded by different Python minor versions (e.g. 3.9 and 3.10), due to differing dependencies. The instance should still load, and its static data will remain, but if you plan to resume optimization as shown here, then we suggest you use the same version of Python to save and load the VBMC instance.

# Here we specify `overwrite=True`, since we don't care about overwriting our
# test file. By default, `overwrite=False` and PyVBMC will raise an error if
# the file already exists.
vbmc.save("vbmc_test_save.pkl", overwrite=True)
# We could also save just the final variational posterior:
# vbmc.vp.save("vp_test_save.pkl")
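
If you saved only the variational posterior, it can be restored later in a similar way. The sketch below assumes that your installed version of PyVBMC exposes a matching VariationalPosterior.load class method and that the class is importable from the top-level package; check the API reference for your version:

from pyvbmc import VariationalPosterior  # assumed import path

# Assumed counterpart to `vp.save(...)` above -- verify against your version:
# vp = VariationalPosterior.load("vp_test_save.pkl")
# samples, components = vp.sample(1000)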

4. Loading results and resuming the optimization process#

The VBMC.load(file) class method will load a previously-saved VBMC instance from the specified file. We can load the instance saved above and resume optimization with an increased maximum number of function evaluations. The default budget is \(50(D+2)\) evaluations, where \(D\) is the dimension of the parameter space (though this example will not require the full budget). We can change this or other options by passing the new_options keyword to VBMC.load(...). Here we increase max_fun_evals and resume the optimization:

new_options = {
    "max_fun_evals": 50 * (D + 2),
}
vbmc = VBMC.load(
    "vbmc_test_save.pkl",
    new_options=new_options,
    iteration=None,  # the default: start from the last stored iteration.
    set_random_state=False,  # the default: don't modify the random state
    # (can be set to True for reproducibility).
)
vp, results = vbmc.optimize()
Beginning variational optimization assuming EXACT observations of the log-joint.
Continuing optimization from previous state.
 Iteration  f-count    Mean[ELBO]    Std[ELBO]    sKL-iter[q]   K[q]  Convergence  Action
     7         45          -4.84         1.80         4.46        2        108     
     8         50          -4.87         0.33         7.65        2        129     end warm-up
     9         55          73.00        57.41        89.44        2   1.75e+03     
    10         60          -9.39         4.61        20.33        2        629     
    11         65          -4.70         0.12         2.32        3       54.7     
    12         70          -4.53         0.02         0.12        4       2.58     
    13         75          -4.42         0.01         0.04        5       1.13     
    14         75          -4.12         0.22         0.19        6       4.82     rotoscale
    15         80          -4.51         0.07         0.05        6       2.45     
    16         85          -4.42         0.02         0.02        6      0.667     
    17         90          -4.39         0.01         0.01        6      0.382     
    18         95          -4.38         0.01         0.01        9      0.142     
    19        100          -4.30         0.01         0.01       12      0.451     
    20        105          -4.25         0.01         0.01       15      0.297     rotoscale, undo rotoscale
    21        110          -4.21         0.00         0.01       18      0.271     
    22        115          -4.17         0.00         0.00       21      0.158     
    23        120          -4.19         0.00         0.00       23     0.0931     
    24        125          -4.18         0.00         0.00       23     0.0491     
    25        130          -4.17         0.00         0.00       24     0.0325     stable
   inf        130          -4.15         0.00         0.00       50     0.0325     finalize
Inference terminated: variational solution stable for options.tol_stable_count fcn evaluations.
Estimated ELBO: -4.149 +/-0.001.

print(format_dict(results))
{
    'function': '<function VBMC._init_log_joint.<locals>.log_joint at 0x7f00ea1301f0>',
    'problem_type': 'unconstrained',
    'iterations': 25,
    'func_count': 130,
    'best_iter': 25,
    'train_set_size': 121,
    'components': 50,
    'r_index': 0.03246322965421961,
    'convergence_status': 'probable',
    'overhead': nan,
    'rng_state': 'rng',
    'algorithm': 'Variational Bayesian Monte Carlo',
    'version': '0.1.0',
    'message': 'Inference terminated: variational solution stable for options.tol_stable_count fcn evaluations.',
    'elbo': -4.149058408953476,
    'elbo_sd': 0.001492050324205448,
    'success_flag': True,
}

With the default budget of function evaluations, we can see that the convergence_status is ‘probable’ and the r_index is much less than 1, suggesting that convergence has been achieved. We can save the result to file and load it later, e.g. to perform further validation or to sample from the variational posterior.

vbmc.save("vbmc_test_save.pkl", overwrite=True)
vbmc = VBMC.load("vbmc_test_save.pkl")

samples, components = vbmc.vp.sample(5)
# `samples` are samples drawn from the variational posterior.
# `components` are the indices of the mixture components each
#  sample was drawn from.
print(samples)
print(components)
[[-1.82750929 -0.25320367  0.51048786  1.41698176]
 [-2.38874097  0.03977422 -0.35191954  0.32260624]
 [-1.26096266 -0.58007695 -1.03945047 -0.0098995 ]
 [-1.88089158 -0.59198002 -0.72811124 -1.00054502]
 [-0.60285619 -0.07388834 -1.15702975 -0.25769202]]
[ 3 14 20  4  8]
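
Drawing a larger number of samples also lets us estimate summary statistics of the posterior. Below is a minimal sketch using plain NumPy; the sample size of 100000 is an arbitrary illustrative choice:

# Estimate the posterior mean and covariance from variational posterior samples.
big_samples, _ = vbmc.vp.sample(100_000)
post_mean = big_samples.mean(axis=0)           # shape (D,)
post_cov = np.cov(big_samples, rowvar=False)   # shape (D, D)
print("Posterior mean:", post_mean)
print("Posterior covariance:\n", post_cov)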

5. Conclusions#

In this notebook, we have given a brief overview of PyVBMC’s output diagnostics, and shown how to save and load results and resume optimization from a specific iteration.

In the next notebook, we will illustrate running PyVBMC multiple times in order to validate the results.

Acknowledgments#

Work on the PyVBMC package was funded by the Finnish Center for Artificial Intelligence FCAI.