PyBADS Example 1: Basic usage#
In this introductory example, we will show a simple usage of Bayesian Adaptive Direct Search (BADS) to perform optimization of a synthetic target function.
This notebook is Part 1 of a series of notebooks in which we present various example usages for BADS with the PyBADS package. The code used in this example is available as a script here.
import numpy as np
from pybads.bads import BADS
0. What is (Py)BADS?#
BADS is a fast hybrid Bayesian optimization algorithm designed to solve difficult optimization problems, in particular related to parameter estimation – aka model fitting – of computational models (e.g., via maximum-likelihood or maximum-a-posteriori estimation). PyBADS is its Python implementation.
BADS has been intensively tested for fitting a variety of computational models, and is currently used by many research groups around the world (see Google Scholar for many example applications). In our benchmark with real model-fitting problems, BADS performed on par or better than many other common and state-of-the-art optimizers, as shown in the original BADS paper.
BADS is particularly recommended when:
the objective function landscape is rough (nonsmooth), typically due to numerical approximations or noise;
the objective function is moderately expensive to compute (e.g., more than 0.1 second per function evaluation);
the gradient is unavailable;
the number of input parameters is up to about D = 20 or so.
BADS requires no specific tuning and runs off-the-shelf like other built-in Python optimizers (e.g., scipy.optimize.minimize).
Note: If you are interested in estimating posterior distributions (i.e., uncertainty and error bars) over model parameters, and not just point estimates, you might also want to check out Variational Bayesian Monte Carlo for Python (PyVBMC), a package for Bayesian posterior and model inference which can be used in synergy with PyBADS.
Optimization problem#
Formally, the goal of BADS is to minimize a target (or objective) function \(f(\mathbf{x}): \mathbb{R}^D \rightarrow \mathbb{R}\), for \(\mathbf{x} \in \mathbb{R}^D\), where \(D\) is the dimensionality of the problem, possibly subject to bound constraints specified by the arrays lower_bounds and upper_bounds, which can be set to infinite for unbounded parameters.
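Written out, this is the bound-constrained problem
\[
\mathbf{x}^\star = \arg\min_{\mathbf{x}} f(\mathbf{x}) \quad \text{subject to} \quad \text{lower\_bounds} \le \mathbf{x} \le \text{upper\_bounds},
\]
with the inequalities holding componentwise.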
1. Problem setup#
Here we show PyBADS at work on Rosenbrock's banana function in 2D as the target function.
We specify wide hard bounds and tighter plausible bounds that (hopefully) contain the solution.
Hard lower/upper bounds: lower_bounds, upper_bounds are the actual optimization bounds; PyBADS will not evaluate the target function outside these bounds, but might evaluate the target on the bounds. You can use -np.inf and np.inf for unbounded parameters (a short sketch of an unbounded setup follows this list).
Plausible lower/upper bounds: plausible_lower_bounds, plausible_upper_bounds represent our best guess at bounding the region where the solution might lie. The plausible bounds do not change the optimization problem, but help define the initial exploration and hyperparameters of PyBADS.
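As a side sketch, not used in this example, an unbounded setup could look as follows; the variable names here are purely illustrative, and the plausible bounds are kept finite since they guide the initial exploration:

# Illustrative only (not used below): infinite hard bounds for unbounded parameters,
# with finite plausible bounds to guide the initial exploration.
lb_unbounded = np.array([-np.inf, -np.inf])
ub_unbounded = np.array([np.inf, np.inf])
plb_finite = np.array([-5, -5])
pub_finite = np.array([5, 5])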
We set the starting point for the optimization to \(\mathbf{x}_0 = (0, 0)\).
def rosenbrocks_fcn(x):
    """Rosenbrock's 'banana' function in any dimension."""
    x_2d = np.atleast_2d(x)
    return np.sum(100 * (x_2d[:, 0:-1]**2 - x_2d[:, 1:])**2 + (x_2d[:, 0:-1] - 1)**2, axis=1)
target = rosenbrocks_fcn;
lower_bounds = np.array([-20, -20])
upper_bounds = np.array([20, 20])
plausible_lower_bounds = np.array([-5, -5])
plausible_upper_bounds = np.array([5, 5])
x0 = np.array([0, 0]); # Starting point
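As a quick check of the target function, not part of the original script, we can evaluate it at the starting point and at the known global minimum of the Rosenbrock function:

print(target(x0))                    # Rosenbrock value at x0 = (0, 0) is 1
print(target(np.array([1.0, 1.0])))  # at the global minimum (1, 1) the value is 0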
2. Initialize and run the optimization#
Then, we initialize a bads instance which takes care of the optimization. For now, we use default options.
To run the optimization, we simply call bads.optimize(), which returns an OptimizeResult object.
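As a side note, non-default options can be passed to the BADS constructor as a dictionary; the sketch below is illustrative only and is not used in this example, which runs with the defaults. It changes tol_fun, the termination tolerance on changes in the function value that also appears in the output message further down; the value shown is arbitrary.

# Illustrative only: the run in this example uses the default options.
bads_custom = BADS(
    target, x0,
    lower_bounds, upper_bounds,
    plausible_lower_bounds, plausible_upper_bounds,
    options={"tol_fun": 1e-5},  # tighter termination tolerance on the function value
)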
bads = BADS(target, x0, lower_bounds, upper_bounds, plausible_lower_bounds, plausible_upper_bounds)
optimize_result = bads.optimize()
Beginning optimization of a DETERMINISTIC objective function
Iteration    f-count         f(x)     MeshScale    Method                         Actions
        0          2            1     1                                           Uncertainty test
        0          6            1     1            Initial mesh                   Initial points
        0         10            1     0.5          Refine grid                    Train
        1         18            1     0.25         Refine grid                    Train
        2         20     0.608156     0.25         Successful search (ES-ell)
        2         21     0.408646     0.25         Successful search (ES-ell)
        2         23     0.303906     0.25         Incremental search (ES-wcm)
        2         25     0.212582     0.25         Incremental search (ES-ell)
        2         30     0.212582     0.125        Refine grid                    Train
        3         32     0.127714     0.125        Successful search (ES-wcm)
        3         33    0.0139198     0.125        Successful search (ES-wcm)
        3         34  0.000681342     0.125        Incremental search (ES-wcm)
        3         42  0.000681342     0.0625       Refine grid                    Train
        4         44  0.000120297     0.0625       Incremental search (ES-ell)
        4         50  0.000120297     0.03125      Refine grid                    Train
        5         58  0.000120297     0.015625     Refine grid                    Train
        6         66  0.000120297     0.00390625   Refine grid                    Train
        7         69  2.70979e-05     0.00390625   Incremental search (ES-wcm)
        7         74  2.70979e-05     0.000976562  Refine grid                    Train
Optimization terminated: change in the function value less than options['tol_fun'].
Function value at minimum: 2.7097907605695835e-05
3. Results and conclusions#
We now examine the result of the optimization stored in optimize_result. The most important keys are the location of the minimum, optimize_result['x'], and the value of the function at the minimum, optimize_result['fval'].
x_min = optimize_result['x']
fval = optimize_result['fval']
print(f"BADS minimum at: x_min = {x_min.flatten()}, fval = {fval:.4g}")
print(f"total f-count: {optimize_result['func_count']}, time: {round(optimize_result['total_time'], 2)} s")
BADS minimum at: x_min = [0.99709786 0.993772 ], fval = 2.71e-05
total f-count: 75, time: 1.13 s
For reference, the true minimum of the Rosenbrock function is at \(\textbf{x}^\star = [1, 1]\), where \(f^\star = 0\).
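To quantify this, as a check not in the original example, we can compare the returned solution with the known optimum:

# Per-coordinate distance from the true minimum at (1, 1); from the run above,
# the error is of order 1e-3 to 1e-2.
print(np.abs(x_min.flatten() - np.array([1.0, 1.0])))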
In conclusion, PyBADS found the solution with a fairly small number of function evaluations (f-count), which is particularly important if the target function is mildly-to-very expensive to compute, as in many computational models.
Note: PyBADS by default does not aim for extreme numerical precision of the target (e.g., beyond the 2nd or 3rd decimal place), since in most realistic model-fitting problems a higher resolution is not particularly useful, e.g., due to noise or variability in the data.