PyBADS Example 3: Noisy objective function#
In this example, we will show how to run PyBADS on a noisy target.
This notebook is Part 3 of a series of notebooks in which we present various example usages for BADS with the PyBADS package. The code used in this example is available as a script here.
import numpy as np
from pybads import BADS
0. Noisy optimization#
PyBADS is also able to optimize noisy objective functions. A noisy (or stochastic) objective function is an objective that returns different results if evaluated twice at the same point \(\mathbf{x}\). Conversely, a non-noisy objective function is known as noiseless or deterministic. For example, noisy objectives are common in model fitting when the model is evaluated through simulation (e.g., via sampling, also known as Monte Carlo methods).
For a noisy objective, PyBADS aims to minimize the expected value of \(f(\mathbf{x})\):

\[
\mathbf{x}^\star = \arg\min_{\mathbf{x} \in \mathcal{X} \subseteq \mathbb{R}^D} \mathbb{E}\left[f(\mathbf{x})\right],
\]

where the expectation is taken over the noise in the evaluation of \(f\).
1. Problem setup#
For this example, we take as target a quadratic function and we add i.i.d. Gaussian noise to it (noisy sphere). In a real case, the noise would arise from some stochastic process in the calculation of the target.
We also set here several options for the optimization:

- We tell `bads` that the target is noisy by activating the `uncertainty_handling` option. This is not strictly needed, as `bads` can automatically detect if a target is noisy, but it is good practice to specify it.
- We also limit the number of function evaluations with `max_fun_evals`, knowing that this is a simple example. Generally, `bads` will tend to run for longer on noisy problems to better explore the noisy landscape.
- Finally, we tell `bads` to re-evaluate the target at the returned solution with 100 samples via `noise_final_samples` (by default `noise_final_samples = 10`, but since our function is inexpensive we can afford more evaluations). These evaluations count towards the total budget of function evaluations `max_fun_evals`.
def noisy_sphere(x, sigma=1.0):
    """Simple quadratic function with added noise."""
    x_2d = np.atleast_2d(x)
    f = np.sum(x_2d**2, axis=1)
    noise = sigma * np.random.normal(size=x_2d.shape[0])
    return f + noise
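As a quick illustration (not part of the original script), we can check that the target is indeed stochastic: two evaluations at the same point return different values, while averaging many evaluations approaches the noiseless value. The test point `x_test` below is arbitrary, introduced just for this check.

x_test = np.array([1.0, 1.0])  # noiseless value is f(x_test) = 2
print(noisy_sphere(x_test), noisy_sphere(x_test))  # two calls, two different noisy values
print(np.mean([noisy_sphere(x_test) for _ in range(10000)]))  # sample mean, close to 2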
x0 = np.array([-3, -3])  # Starting point
lower_bounds = np.array([-5, -5])
upper_bounds = np.array([5, 5])
plausible_lower_bounds = np.array([-2, -2])
plausible_upper_bounds = np.array([2, 2])
options = {
"uncertainty_handling": True,
"max_fun_evals": 300,
"noise_final_samples": 100
}
2. Run the optimization#
We run `bads` with the user-defined `options`.
bads = BADS(
noisy_sphere, x0, lower_bounds, upper_bounds, plausible_lower_bounds, plausible_upper_bounds, options=options
)
optimize_result = bads.optimize()
Beginning optimization of a STOCHASTIC objective function
Iteration f-count E[f(x)] SD[f(x)] MeshScale Method Actions
0 1 17.0467 nan 1
0 33 -0.573104 nan 1 Initial mesh Initial points
0 37 -0.573104 1 0.5 Refine grid Train
1 45 -0.573104 1 0.25 Refine grid Train
2 46 -0.0875221 0.236985 0.25 Successful search (ES-wcm)
2 47 -0.133904 0.229856 0.25 Incremental search (ES-wcm)
2 51 -0.14124 0.205711 0.25 Incremental search (ES-ell)
2 53 -0.184658 0.196127 0.25 Incremental search (ES-ell)
2 57 -0.184658 0.196127 0.125 Refine grid Train
3 58 -0.113416 0.181841 0.125 Incremental search (ES-wcm)
3 59 -0.143424 0.17853 0.125 Incremental search (ES-wcm)
3 60 -0.153414 0.175473 0.125 Incremental search (ES-wcm)
3 65 -0.153414 0.175473 0.0625 Refine grid
4 73 -0.202447 0.162169 0.03125 Refine grid
5 77 -0.0930382 0.136617 0.03125 Successful search (ES-ell)
5 78 -0.0993084 0.139908 0.03125 Successful search (ES-ell)
5 79 -0.108523 0.138428 0.03125 Successful search (ES-ell)
5 80 -0.110481 0.137001 0.03125 Incremental search (ES-ell)
5 81 -0.116493 0.135622 0.03125 Successful search (ES-ell)
5 82 -0.119346 0.134247 0.03125 Incremental search (ES-wcm)
5 83 -0.126705 0.132937 0.03125 Successful search (ES-wcm)
5 93 -0.126705 0.132937 0.015625 Refine grid Train
6 94 -0.112421 0.115326 0.015625 Successful search (ES-ell)
6 95 -0.138253 0.114425 0.015625 Successful search (ES-wcm)
6 105 -0.138253 0.114425 0.0078125 Refine grid
7 106 -0.108284 0.103892 0.0078125 Incremental search (ES-wcm)
7 111 -0.110376 0.101127 0.015625 Successful poll Train
8 119 -0.0915521 0.0990537 0.00390625 Refine grid Train
9 127 -0.0813753 0.0944134 0.000976562 Refine grid Train
10 131 -0.037847 0.0890104 0.000976562 Successful search (ES-ell)
10 134 -0.0528433 0.089216 0.000976562 Successful search (ES-ell)
10 135 -0.0639539 0.0887966 0.000976562 Successful search (ES-ell)
10 143 -0.0639539 0.0887966 0.000488281 Refine grid
11 145 -0.0383948 0.084316 0.000488281 Successful search (ES-ell)
11 146 -0.0420733 0.0839543 0.000488281 Successful search (ES-ell)
11 147 -0.0479433 0.0835921 0.000488281 Successful search (ES-ell)
11 150 -0.0501619 0.0825213 0.000488281 Successful search (ES-wcm)
11 152 -0.0640664 0.0818386 0.000488281 Successful search (ES-wcm)
11 153 -0.0722405 0.0815143 0.000488281 Successful search (ES-wcm)
11 163 -0.0722405 0.0815143 0.000244141 Refine grid
Optimization terminated: change in the function value less than options['tol_fun'].
Estimated function value at minimum: -0.04111370566211175 ± 0.09385579705318894 (mean ± SEM from 100 samples)
3. Results and conclusions#
First, note that in this case `optimize_result['fval']` is the estimated function value at `optimize_result['x']`, obtained by taking the mean of `options['noise_final_samples']` target evaluations (`noise_final_samples = 10` by default, but here we used 100). The uncertainty of the value of the function at the returned solution, `optimize_result['fsd']`, is the standard error of the mean.

If needed, the final samples used to estimate `fval` and `fsd` can be found in `optimize_result['yval_vec']`.
x_min = optimize_result['x']
fval = optimize_result['fval']
fsd = optimize_result['fsd']
print(f"BADS minimum at: x_min = {x_min.flatten()}, fval (estimated) = {fval:.4g} +/- {fsd:.2g}")
print(f"total f-count: {optimize_result['func_count']}, time: {round(optimize_result['total_time'], 2)} s")
print(f"final evaluations (shape): {optimize_result['yval_vec'].shape}")
BADS minimum at: x_min = [-0.0268445 0.00389814], fval (estimated) = -0.04111 +/- 0.094
total f-count: 264, time: 5.38 s
final evaluations (shape): (100,)
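As a quick sanity check (a sketch, assuming `fval` and `fsd` are computed from `yval_vec` as described above; PyBADS may use a slightly different estimator internally), we can compare the reported values against the sample mean and standard error of the final evaluations:

yvals = optimize_result['yval_vec']
print(f"mean of final samples: {np.mean(yvals):.4g} (fval = {fval:.4g})")
print(f"SEM of final samples: {np.std(yvals, ddof=1) / np.sqrt(yvals.size):.4g} (fsd = {fsd:.4g})")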
We can also check the ground-truth value of the target function at the returned point once we remove the noise:
print(f"The true, noiseless value of f(x_min) is {noisy_sphere(x_min,sigma=0)[0]:.3g}.")
The true, noiseless value of f(x_min) is 0.000736.
Compare this to the true global minimum of the sphere function at \(\mathbf{x}^\star = [0,0]\), where \(f^\star = 0\).
Remarks#
- While PyBADS can handle noisy targets, it cannot handle arbitrarily large noise.
- PyBADS will work best if the standard deviation of the objective function \(\sigma\), when evaluated in the vicinity of the global solution, is small with respect to changes in the objective function itself (that is, there is a good signal-to-noise ratio). In many cases, \(\sigma \approx 1\) or less should work (this is the default assumption). If you approximately know the magnitude of the noise in the vicinity of the solution, you can help BADS by specifying it in advance (set `options["noise_size"] = sigma_est`, where `sigma_est` is your estimate of the standard deviation; see the sketch after this list).
- If the noise around the solution is too large, PyBADS will perform poorly. In that case, we recommend increasing the precision of your computation of the objective (e.g., by drawing more Monte Carlo samples) such that \(\sigma \approx 1\) or even lower, as needed by your problem. Note that the noise farther away from the solution can be larger, and this is usually okay.
- In this example, we assumed the amount of noise in each target evaluation is unknown. If instead you can estimate the magnitude of the noise for each evaluation, see the next example notebook.
- For more information on optimizing noisy objective functions, see the BADS wiki: https://github.com/acerbilab/bads/wiki#noisy-objective-function (this link points to the MATLAB wiki, but many of the questions and answers apply to PyBADS as well).
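For instance, here is a minimal sketch of the two suggestions above. The estimate `sigma_est` and the wrapper `noisy_sphere_avg` are hypothetical names introduced for illustration; `noise_size` is the PyBADS option mentioned in the remarks.

sigma_est = 0.5  # hypothetical estimate of the noise standard deviation near the solution
options["noise_size"] = sigma_est  # inform BADS of the expected noise magnitude

def noisy_sphere_avg(x, n_avg=4):
    """Average n_avg noisy evaluations, reducing the noise SD by a factor of sqrt(n_avg)."""
    return np.mean([noisy_sphere(x) for _ in range(n_avg)], axis=0)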