Gaussian processes#

gpyreg.gaussian_process#

This module contains the GP class, which is the main entry point for working with Gaussian processes. A class instance is initialized with specified dimension and covariance/mean/noise functions. Then, its various methods can be used to fit data, make predictions, plot the GP, etc.

GP#

class gpyreg.GP(D: int, covariance: object, mean: object, noise: object)[source]#

A single Gaussian Process (GP).

Parameters:
Dint

The dimension of the Gaussian Process.

covarianceobject

The covariance function to use. This can be one of the objects from the following module: gpyreg.covariance_functions.

meanobject

The mean function to use. This can be one of the objects from the following module: gpyreg.mean_functions.

noiseobject

The noise function to use. This can be one of the objects from the following module: gpyreg.noise_functions.

bounds_to_dict(lower_bounds: ndarray, upper_bounds: ndarray)[source]#

Convert the given hyperparameter lower and upper bounds to a dict.

Parameters:
lower_boundsndarray, shape (hyp_N,)

The lower bounds.

upper_boundsndarray, shape (hyp_N,)

The upper bounds.

Returns:
bounds_dictdict

A dictionary of the current hyperparameter names and tuples of their lower and upper bounds.

clean()[source]#

Clean auxiliary computational structures from the Gaussian Process, thus reducing memory usage. These can be reconstructed with a call to update() with compute_posterior=True.

Furthermore, the temporary_data attribute is being cleared.

fit(X: ndarray = None, y: ndarray = None, s2: ndarray = None, hyp0=None, options: dict = None)[source]#

Train the hyperparameters of the Gaussian Process.

Parameters:
Xndarray, shape (N, D), optional

Training points that will replace the current training points of the GP. If not given the current training points are used.

yndarray, shape (N, 1), optional

Training targets that will replace the current training targets of the GP. If not given the current training targets are used.

s2ndarray, shape (N, 1), optional

Noise variance at training points that will replace the current noise variances of the GP. If not given the current noise variances are used.

optionsdict, optional

A dictionary of options for training. The possible options are:

opts_Nint, defaults to 3

Number of hyperparameter optimization runs.

init_Nint, defaults to 1024

Initial design size for hyperparameter optimization.

df_baseint, defaults to 7

Default degrees of freedom for student’s t prior.

n_samplesint, defaults to 10

Number of hyperparameters to sample.

thinint, defaults to 5

Thinning parameter for slice sampling.

burnint, defaults to thin * n_samples

Burn parameter for slice sampling.

lower_boundsstr or ndarray, defaults to “current”

User-provided lower bounds. Any values which are nan will be filled with the recommended bounds. “recommended” means use all recommended bounds. “current” means use current bounds.

upper_boundsstr or ndarray, defaults to “current”

User-provided upper bounds. Any values which are nan will be filled with the recommended bounds. “recommended” means use all recommended bounds. “current” means use current bounds.

init_method{‘sobol’, ‘rand’}, defaults to ‘sobol’

Specify whether to use Sobol or random sequences for the initial space-filling design.

sampler_name{‘slicesample’}, defaults to ‘slicesample’

The name of the sampler to use. Currently only slice sampling is supported.

tol_optfloat, defaults to 1e-5

Optimization tolerance for stopping.

tol_opt_mcmcfloat, defaults to 1e-3

Preliminary optimization tolerance when doing MCMC.

widthsndarray, shape (hyp_n,), optional

Default widths to use for sampling. If not provided appropriate ones will be computed.

Returns:
hypndarray, shape (hyp_samples, hyp_N)

The fitted hyperparameters.

optimize_resultOptimizeResult

The optimization result represented as a OptimizeResult object. For more details see scipy.optimize.minimize().

sampling_resultdict

If sampling was performed this is a dictionary with info on the sampling run, and None otherwise.

Raises:
ValueError

Raised when the sampler_name is not slicesample.

get_bounds()[source]#

Gets a dictionary with the current lower and upper bounds of the hyperparameters.

Returns:
bounds_dictdict

A dictionary of the current hyperparameter names and their bounds.

get_hyperparameters(as_array: bool = False)[source]#

Get the current hyperparameters of the GP.

If hyperparameters have not been set yet, the result will be filled with NaN.

Parameters:
as_arraybool, defaults to False

Whether to return the hyperparameters as an array of shape (hyp_samples, hyp_N), or a list of dictionaries for each sample.

Returns:
hypobject

The hyperparameters in the form specified by as_array.

get_priors()[source]#

Return the current hyperparameter priors as a dict.

Returns:
hyper_priorsdict

A dictionary of the current hyperparameter names and their priors.

Return the recommended hyperparameter lower and upper bounds as a dict.

Parameters:
lower_boundsndarray, optional

If present, override the recommended lower bounds with these values. Any nan values will be replaced by the corresponding recommended bounds. Defaults to all nan values.

upper_boundsndarray, optional

If present, override the recommended upper bounds with these values. Any nan values will be replaced by the corresponding recommended bounds. Defaults to all nan values.

Returns:
bounds_dictdict

A dictionary of the hyperparameter names and tuples of lower upper bounds.

Raises:
ValueError

Raise when GP does not have X or y set yet, or when provided bounds are not one of “recommended”/None, “current”, or array_like.

hyperparameters_from_dict(hyp_dict_list)[source]#

Convert a list of hyperparameter dictionaries to a hyperparameter array.

Parameters:
hyp_dict_listobject

A list of hyperparameter dictionaries with hyperparameter names and values. One can also pass just one dictionary instead of a list with one element.

Returns:
hyp_arrndarray, shape (hyp_samples, hyp_N)

The hyperparameter array where hyp_samples is the length of the list hyp_dict_list.

hyperparameters_to_dict(hyp_arr: ndarray)[source]#

Convert a hyperparameter array to a list which contains a dictionary with hyperparameter names and values for each hyperparameter sample.

Parameters:
hyp_arrndarray

An array of shape (hyp_samples, hyp_N) or shape (hyp_N,), which is interpreted as shape (1, hyp_N), containing hyperparameters.

Returns:
hyp_dictobject

A list which contains a dictonary with hyperparameter names and values for each sample.

Raises:
ValueError

Raised when the input hyperparameter array has the wrong shape.

log_likelihood(hyp: object, compute_grad: bool = False)[source]#

Compute the (positive) log marginal likelihood of the GP for given hyperparameters.

Parameters:
hypobject

Either an 1D array or a dictionary of hyperparameters.

compute_gradbool, defaults to False

Whether to compute the gradient with respect to hyperparameters.

Returns:
lZfloat

The positive log marginal likelihood.

dlZndarray, shape (hyp_N,), optional

The gradient with respect to hyperparameters.

log_posterior(hyp: object, compute_grad: bool = False)[source]#

Compute the (positive) log marginal likelihood of the GP with added log prior for given hyperparameters (that is, the unnormalized log posterior).

Parameters:
hypobject

Either an 1D array or a dictionary of hyperparameters.

compute_gradbool, defaults to False

Whether to compute the gradient with respect to hyperparameters.

Returns:
lZ_plus_posteriorfloat

The positive log marginal likelihood with added log prior.

dlZ_plus_d_posteriorndarray, shape (hyp_N,), optional

The gradient with respect to hyperparameters.

Raises:
LinAlgError

Raised when the Cholesky decomposition failed multiple times even by adding numerical stability values to the matrix.

plot(x0: ndarray = None, lb: ndarray = None, ub: ndarray = None, delta_y: float = None, max_min_flag: bool = True)[source]#

Plot the Gaussian Process profile centered around a given point.

The plot is a D-by-D panel matrix, in which panels on the diagonal show the profile of the Gaussian Process prediction (mean and +/- 1 SD) by varying one dimension at a time, whereas off-diagonal panels show 2-D contour plots of the GP mean and standard deviation (respectively, above and below diagonal). In each panel, black lines indicate the location of the reference point.

Parameters:
x0ndarray, shape (D,), optional

The reference point.

lbndarray, shape (D,), optional

Lower bounds for the plotting.

ubndarray, shape (D,), optional

Upper bounds for the plotting.

delta_yfloat, optional

Range of the plot such that the plotted predictive GP mean approximately brackets [y0-delta_y, y0+delta_y] where y0 is the predictive GP mean at x0. If lower or upper bounds are given this will do nothing.

max_min_flagbool, defaults to True

If set to False then the minimum, and if set to True then the maximum of the GP training input is used as the reference point.

predict(x_star: ndarray, y_star: ndarray = None, s2_star: ndarray = None, add_noise: bool = False, separate_samples: bool = False, return_lpd: bool = False)[source]#

Compute the GP posterior mean and noise variance at given points.

Parameters:
x_starndarray, shape (M, D)

The points we want to predict the values at.

y_starndarray, shape (M, 1), optional

True values at the points.

s2_starndarray, shape (M, 1), optional

Noise variance at the points.

add_noisebool, defaults to True

Whether to add noise to the prediction results.

separate_samplesbool, defaults to False

Whether to return the results separately for each hyperparameter sample or averaged.

return_lpdbool, defaults to False

Whether to return the log predictive density at the input points. If separate_samples is False, returns the lpd of the corresponding mean approximation.

Returns:
mundarray

Posterior mean at the requested points. If we requested separate samples the shape is (M, sample_N) while otherwise it is (M,). sample.

s2ndarray

Noise variance at each point. If we requested separate samples the shape is (M, sample_N) while otherwise it is (M,).

predict_full(x_star: ndarray, y_star: ndarray = None, s2_star: ndarray = None, add_noise: bool = False)[source]#

Compute the GP posterior mean and full covariance matrix for each hyperparameter sample.

Parameters:
x_starndarray, shape (M, D)

The points we want to predict the values at.

y_starndarray, shape (M, 1), optional

True values at the points.

s2_starndarray, shape (M, 1), optional

Noise variance at the points.

add_noisebool, defaults to True

Whether to add noise to the prediction results.

Returns:
mundarray, shape (M, sample_N)

Posterior mean at the requested points for each hyperparameter sample.

covndarray, shape (M, M, sample_N)

Covariance matrix for each hyperparameter sample.

quad(mu, sigma, compute_var: bool = False, separate_samples: bool = False)[source]#

Bayesian quadrature for a Gaussian Process.

Compute the integral of a function represented by a Gaussian Process with respect to a given Gaussian measure.

Parameters:
muarray_like

Either a array of shape (N, D) with each row containing the mean of a single Gaussian measure, or a single floating point number which is interpreted as an array of shape (1, D).

sigmaarray_like

Either a array of shape (N, D) with each row containing the variance of a single Gaussian measure, or a single floating point number which is interpreted as an array of shape (1, D).

compute_varbool, defaults to False

Whether to compute variance for each integral.

separate_samplesbool, defaults to False

Whether to return the results separately for each hyperparameter sample, or averaged.

Returns:
Fndarray

The conputed integrals in an array with shape (N, 1) if samples are averaged and shape (N, hyp_samples) if requested separately.

F_varndarray, optional

The computed variances of the integrals in an array with shape (N, 1) if samples are averaged and shape (N, hyp_samples) if requested separately.

Raises:
ValueError

Raised when the method is called and the covariance of the GP is not squared exponential.

random_function(X_star: ndarray, add_noise: bool = False)[source]#

Draw a random function from the Gaussian Process.

Parameters:
X_starndarray, shape (M, D)

The points at which to evaluate the drawn function.

add_noisebool, defaults to False

Whether to add noise to the values of the drawn function.

Returns:
f_starndarray, shape (M, 1)

The values of the drawn function at the requested points.

set_bounds(bounds: dict = None)[source]#

Set the hyperparameter lower and upper bounds.

Parameters:
boundsdict, optional

A dictionary of GP hyperparameter names and tuples of their lower and upper bounds. All hyperparameters need to appear in the dictionary. Use the value None to set the bounds of a specific hyperparameter to the default recommended values. If bounds=None, all hyperparameter bounds will be set to their default recommended values.

Raises:
ValueError

Raised when bounds is missing the entry of an expected hyperparameter.

set_hyperparameters(hyp_new: object, compute_posterior: bool = True)[source]#

Set new hyperparameters for the Gaussian Process.

Parameters:
hyp_newobject

The new hyperparameters. This can be an array of shape (hyp_samples, hyp_N) where hyp_N is the number of hyperparameters, and hyp_samples is the amount of hyperparameter samples, a single dictionary with hyperparameter names and values, or a list of dictionaries. Passing a single dictionary or a list with one dictionary is equivalent.

compute_posteriorbool, defaults to True

Whether to compute the posterior for the new hyperparameters.

Raises:
ValueError

Raised when hyp_new is an array of the wrong shape.

set_priors(priors: dict = None)[source]#

Set the hyperparameter priors.

Parameters:
priorsdict, optional

A dictionary of GP hyperparameter names and tuples of their priors. All hyperparameters need to appear in the dictionary. Use the value None to set no priors for a hyperparameter. If priors=None, all hyperparameter priors are removed.

Raises:
ValueError

Raised when priors is given, but missing the entry of an expected hyperparameter.

ValueError

Raised when priors is given, but a specified hyperparameter is unknown.

update(X_new: ndarray = None, y_new: ndarray = None, s2_new: ndarray = None, hyp: ndarray = None, compute_posterior: bool = True)[source]#

Add new data to the Gaussian Process.

Parameters:
X_newndarray, shape (N, D), optional

New training inputs that will be added to the old training inputs.

y_newnarray, shape (N, 1) optional

New training targets that will be added to the old training targets.

s2_newndarray, shape (N, 1), optional

New input-dependent noise that will be added to the old training inputs.

hypndarray, shape (hyp_N,), optional

New hyperparameters that will replace the old ones.

compute_posteriorbool, defaults to True

Whether to compute the new posterior or not.

Raises:
LinAlgError

Raised when the Cholesky decomposition failed multiple times even by adding numerical stability values to the matrix.

Posterior#

class gpyreg.gaussian_process.Posterior(hyp, alpha, sW, L, sn2_mult, Lchol)[source]#

This object represents the posterior.