Gaussian processes#
gpyreg.gaussian_process
#
This module contains the GP
class, which is the main entry point for working with Gaussian processes. A class instance is initialized with specified dimension and covariance/mean/noise functions. Then, its various methods can be used to fit data, make predictions, plot the GP, etc.
GP
#
- class gpyreg.GP(D: int, covariance: object, mean: object, noise: object)[source]#
A single Gaussian Process (GP).
- Parameters:
- Dint
The dimension of the Gaussian Process.
- covarianceobject
The covariance function to use. This can be one of the objects from the following module:
gpyreg.covariance_functions
.- meanobject
The mean function to use. This can be one of the objects from the following module:
gpyreg.mean_functions
.- noiseobject
The noise function to use. This can be one of the objects from the following module:
gpyreg.noise_functions
.
- bounds_to_dict(lower_bounds: ndarray, upper_bounds: ndarray)[source]#
Convert the given hyperparameter lower and upper bounds to a dict.
- Parameters:
- lower_boundsndarray, shape (hyp_N,)
The lower bounds.
- upper_boundsndarray, shape (hyp_N,)
The upper bounds.
- Returns:
- bounds_dictdict
A dictionary of the current hyperparameter names and tuples of their lower and upper bounds.
- clean()[source]#
Clean auxiliary computational structures from the Gaussian Process, thus reducing memory usage. These can be reconstructed with a call to
update()
withcompute_posterior=True
.Furthermore, the temporary_data attribute is being cleared.
- fit(X: ndarray = None, y: ndarray = None, s2: ndarray = None, hyp0=None, options: dict = None)[source]#
Train the hyperparameters of the Gaussian Process.
- Parameters:
- Xndarray, shape (N, D), optional
Training points that will replace the current training points of the GP. If not given the current training points are used.
- yndarray, shape (N, 1), optional
Training targets that will replace the current training targets of the GP. If not given the current training targets are used.
- s2ndarray, shape (N, 1), optional
Noise variance at training points that will replace the current noise variances of the GP. If not given the current noise variances are used.
- optionsdict, optional
A dictionary of options for training. The possible options are:
- opts_Nint, defaults to 3
Number of hyperparameter optimization runs.
- init_Nint, defaults to 1024
Initial design size for hyperparameter optimization.
- df_baseint, defaults to 7
Default degrees of freedom for student’s t prior.
- n_samplesint, defaults to 10
Number of hyperparameters to sample.
- thinint, defaults to 5
Thinning parameter for slice sampling.
- burnint, defaults to
thin * n_samples
Burn parameter for slice sampling.
- lower_boundsstr or ndarray, defaults to “current”
User-provided lower bounds. Any values which are nan will be filled with the recommended bounds. “recommended” means use all recommended bounds. “current” means use current bounds.
- upper_boundsstr or ndarray, defaults to “current”
User-provided upper bounds. Any values which are nan will be filled with the recommended bounds. “recommended” means use all recommended bounds. “current” means use current bounds.
- init_method{‘sobol’, ‘rand’}, defaults to ‘sobol’
Specify whether to use Sobol or random sequences for the initial space-filling design.
- sampler_name{‘slicesample’}, defaults to ‘slicesample’
The name of the sampler to use. Currently only slice sampling is supported.
- tol_optfloat, defaults to 1e-5
Optimization tolerance for stopping.
- tol_opt_mcmcfloat, defaults to 1e-3
Preliminary optimization tolerance when doing MCMC.
- widthsndarray, shape (hyp_n,), optional
Default widths to use for sampling. If not provided appropriate ones will be computed.
- Returns:
- hypndarray, shape (hyp_samples, hyp_N)
The fitted hyperparameters.
- optimize_resultOptimizeResult
The optimization result represented as a
OptimizeResult
object. For more details seescipy.optimize.minimize()
.- sampling_resultdict
If sampling was performed this is a dictionary with info on the sampling run, and None otherwise.
- Raises:
- ValueError
Raised when the sampler_name is not slicesample.
- get_bounds()[source]#
Gets a dictionary with the current lower and upper bounds of the hyperparameters.
- Returns:
- bounds_dictdict
A dictionary of the current hyperparameter names and their bounds.
- get_hyperparameters(as_array: bool = False)[source]#
Get the current hyperparameters of the GP.
If hyperparameters have not been set yet, the result will be filled with
NaN
.- Parameters:
- as_arraybool, defaults to False
Whether to return the hyperparameters as an array of shape
(hyp_samples, hyp_N)
, or a list of dictionaries for each sample.
- Returns:
- hypobject
The hyperparameters in the form specified by
as_array
.
- get_priors()[source]#
Return the current hyperparameter priors as a dict.
- Returns:
- hyper_priorsdict
A dictionary of the current hyperparameter names and their priors.
- get_recommended_bounds(lower_bounds=None, upper_bounds=None)[source]#
Return the recommended hyperparameter lower and upper bounds as a dict.
- Parameters:
- lower_boundsndarray, optional
If present, override the recommended lower bounds with these values. Any nan values will be replaced by the corresponding recommended bounds. Defaults to all nan values.
- upper_boundsndarray, optional
If present, override the recommended upper bounds with these values. Any nan values will be replaced by the corresponding recommended bounds. Defaults to all nan values.
- Returns:
- bounds_dictdict
A dictionary of the hyperparameter names and tuples of lower upper bounds.
- Raises:
- ValueError
Raise when GP does not have X or y set yet, or when provided bounds are not one of “recommended”/None, “current”, or array_like.
- hyperparameters_from_dict(hyp_dict_list)[source]#
Convert a list of hyperparameter dictionaries to a hyperparameter array.
- Parameters:
- hyp_dict_listobject
A list of hyperparameter dictionaries with hyperparameter names and values. One can also pass just one dictionary instead of a list with one element.
- Returns:
- hyp_arrndarray, shape (hyp_samples, hyp_N)
The hyperparameter array where
hyp_samples
is the length of the listhyp_dict_list
.
- hyperparameters_to_dict(hyp_arr: ndarray)[source]#
Convert a hyperparameter array to a list which contains a dictionary with hyperparameter names and values for each hyperparameter sample.
- Parameters:
- hyp_arrndarray
An array of shape
(hyp_samples, hyp_N)
or shape(hyp_N,)
, which is interpreted as shape(1, hyp_N)
, containing hyperparameters.
- Returns:
- hyp_dictobject
A list which contains a dictonary with hyperparameter names and values for each sample.
- Raises:
- ValueError
Raised when the input hyperparameter array has the wrong shape.
- log_likelihood(hyp: object, compute_grad: bool = False)[source]#
Compute the (positive) log marginal likelihood of the GP for given hyperparameters.
- Parameters:
- hypobject
Either an 1D array or a dictionary of hyperparameters.
- compute_gradbool, defaults to False
Whether to compute the gradient with respect to hyperparameters.
- Returns:
- lZfloat
The positive log marginal likelihood.
- dlZndarray, shape (hyp_N,), optional
The gradient with respect to hyperparameters.
- log_posterior(hyp: object, compute_grad: bool = False)[source]#
Compute the (positive) log marginal likelihood of the GP with added log prior for given hyperparameters (that is, the unnormalized log posterior).
- Parameters:
- hypobject
Either an 1D array or a dictionary of hyperparameters.
- compute_gradbool, defaults to False
Whether to compute the gradient with respect to hyperparameters.
- Returns:
- lZ_plus_posteriorfloat
The positive log marginal likelihood with added log prior.
- dlZ_plus_d_posteriorndarray, shape (hyp_N,), optional
The gradient with respect to hyperparameters.
- Raises:
- LinAlgError
Raised when the Cholesky decomposition failed multiple times even by adding numerical stability values to the matrix.
- plot(x0: ndarray = None, lb: ndarray = None, ub: ndarray = None, delta_y: float = None, max_min_flag: bool = True)[source]#
Plot the Gaussian Process profile centered around a given point.
The plot is a D-by-D panel matrix, in which panels on the diagonal show the profile of the Gaussian Process prediction (mean and +/- 1 SD) by varying one dimension at a time, whereas off-diagonal panels show 2-D contour plots of the GP mean and standard deviation (respectively, above and below diagonal). In each panel, black lines indicate the location of the reference point.
- Parameters:
- x0ndarray, shape (D,), optional
The reference point.
- lbndarray, shape (D,), optional
Lower bounds for the plotting.
- ubndarray, shape (D,), optional
Upper bounds for the plotting.
- delta_yfloat, optional
Range of the plot such that the plotted predictive GP mean approximately brackets
[y0-delta_y, y0+delta_y]
wherey0
is the predictive GP mean atx0
. If lower or upper bounds are given this will do nothing.- max_min_flagbool, defaults to True
If set to
False
then the minimum, and if set toTrue
then the maximum of the GP training input is used as the reference point.
- predict(x_star: ndarray, y_star: ndarray = None, s2_star: ndarray = None, add_noise: bool = False, separate_samples: bool = False, return_lpd: bool = False)[source]#
Compute the GP posterior mean and noise variance at given points.
- Parameters:
- x_starndarray, shape (M, D)
The points we want to predict the values at.
- y_starndarray, shape (M, 1), optional
True values at the points.
- s2_starndarray, shape (M, 1), optional
Noise variance at the points.
- add_noisebool, defaults to
True
Whether to add noise to the prediction results.
- separate_samplesbool, defaults to
False
Whether to return the results separately for each hyperparameter sample or averaged.
- return_lpdbool, defaults to
False
Whether to return the log predictive density at the input points. If separate_samples is
False
, returns the lpd of the corresponding mean approximation.
- Returns:
- mundarray
Posterior mean at the requested points. If we requested separate samples the shape is
(M, sample_N)
while otherwise it is(M,)
. sample.- s2ndarray
Noise variance at each point. If we requested separate samples the shape is
(M, sample_N)
while otherwise it is(M,)
.
- predict_full(x_star: ndarray, y_star: ndarray = None, s2_star: ndarray = None, add_noise: bool = False)[source]#
Compute the GP posterior mean and full covariance matrix for each hyperparameter sample.
- Parameters:
- x_starndarray, shape (M, D)
The points we want to predict the values at.
- y_starndarray, shape (M, 1), optional
True values at the points.
- s2_starndarray, shape (M, 1), optional
Noise variance at the points.
- add_noisebool, defaults to True
Whether to add noise to the prediction results.
- Returns:
- mundarray, shape (M, sample_N)
Posterior mean at the requested points for each hyperparameter sample.
- covndarray, shape (M, M, sample_N)
Covariance matrix for each hyperparameter sample.
- quad(mu, sigma, compute_var: bool = False, separate_samples: bool = False)[source]#
Bayesian quadrature for a Gaussian Process.
Compute the integral of a function represented by a Gaussian Process with respect to a given Gaussian measure.
- Parameters:
- muarray_like
Either a array of shape
(N, D)
with each row containing the mean of a single Gaussian measure, or a single floating point number which is interpreted as an array of shape(1, D)
.- sigmaarray_like
Either a array of shape
(N, D)
with each row containing the variance of a single Gaussian measure, or a single floating point number which is interpreted as an array of shape(1, D)
.- compute_varbool, defaults to False
Whether to compute variance for each integral.
- separate_samplesbool, defaults to False
Whether to return the results separately for each hyperparameter sample, or averaged.
- Returns:
- Fndarray
The conputed integrals in an array with shape
(N, 1)
if samples are averaged and shape(N, hyp_samples)
if requested separately.- F_varndarray, optional
The computed variances of the integrals in an array with shape
(N, 1)
if samples are averaged and shape(N, hyp_samples)
if requested separately.
- Raises:
- ValueError
Raised when the method is called and the covariance of the GP is not squared exponential.
- random_function(X_star: ndarray, add_noise: bool = False)[source]#
Draw a random function from the Gaussian Process.
- Parameters:
- X_starndarray, shape (M, D)
The points at which to evaluate the drawn function.
- add_noisebool, defaults to False
Whether to add noise to the values of the drawn function.
- Returns:
- f_starndarray, shape (M, 1)
The values of the drawn function at the requested points.
- set_bounds(bounds: dict = None)[source]#
Set the hyperparameter lower and upper bounds.
- Parameters:
- boundsdict, optional
A dictionary of GP hyperparameter names and tuples of their lower and upper bounds. All hyperparameters need to appear in the dictionary. Use the value
None
to set the bounds of a specific hyperparameter to the default recommended values. Ifbounds=None
, all hyperparameter bounds will be set to their default recommended values.
- Raises:
- ValueError
Raised when bounds is missing the entry of an expected hyperparameter.
- set_hyperparameters(hyp_new: object, compute_posterior: bool = True)[source]#
Set new hyperparameters for the Gaussian Process.
- Parameters:
- hyp_newobject
The new hyperparameters. This can be an array of shape
(hyp_samples, hyp_N)
wherehyp_N
is the number of hyperparameters, andhyp_samples
is the amount of hyperparameter samples, a single dictionary with hyperparameter names and values, or a list of dictionaries. Passing a single dictionary or a list with one dictionary is equivalent.- compute_posteriorbool, defaults to True
Whether to compute the posterior for the new hyperparameters.
- Raises:
- ValueError
Raised when hyp_new is an array of the wrong shape.
- set_priors(priors: dict = None)[source]#
Set the hyperparameter priors.
- Parameters:
- priorsdict, optional
A dictionary of GP hyperparameter names and tuples of their priors. All hyperparameters need to appear in the dictionary. Use the value
None
to set no priors for a hyperparameter. Ifpriors=None
, all hyperparameter priors are removed.
- Raises:
- ValueError
Raised when
priors
is given, but missing the entry of an expected hyperparameter.- ValueError
Raised when
priors
is given, but a specified hyperparameter is unknown.
- update(X_new: ndarray = None, y_new: ndarray = None, s2_new: ndarray = None, hyp: ndarray = None, compute_posterior: bool = True)[source]#
Add new data to the Gaussian Process.
- Parameters:
- X_newndarray, shape (N, D), optional
New training inputs that will be added to the old training inputs.
- y_newnarray, shape (N, 1) optional
New training targets that will be added to the old training targets.
- s2_newndarray, shape (N, 1), optional
New input-dependent noise that will be added to the old training inputs.
- hypndarray, shape (hyp_N,), optional
New hyperparameters that will replace the old ones.
- compute_posteriorbool, defaults to True
Whether to compute the new posterior or not.
- Raises:
- LinAlgError
Raised when the Cholesky decomposition failed multiple times even by adding numerical stability values to the matrix.