Slice sampling

gpyreg.slice_sample

SliceSampler

class gpyreg.slice_sample.SliceSampler(log_f, x0: ndarray, widths=None, LB=None, UB=None, options: dict = None)

Class for drawing random samples from a target distribution with a given log probability density function using the coordinate-wise slice sampling method.
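To make the method concrete, here is a minimal sketch of coordinate-wise slice sampling with shrinkage and optional stepping out, following the general scheme in [1]. It is not the gpyreg implementation: the function name `slice_sample_sketch` and the `rng` argument are hypothetical, and bounds, width adaptation, burn-in, thinning and diagnostics are all left out.

```python
import math
import random


def slice_sample_sketch(log_f, x0, widths, n_samples, rng, step_out=False):
    """Coordinate-wise slice sampling with shrinkage (illustrative sketch only)."""
    x = list(x0)
    log_fx = log_f(x)
    samples = []
    for _ in range(n_samples):
        for d in range(len(x)):
            # Draw the slice level uniformly below the density at the current point.
            log_y = log_fx + math.log(1.0 - rng.random())
            # Place an interval of the typical width at random around x[d].
            left = x[d] - rng.random() * widths[d]
            right = left + widths[d]
            if step_out:
                # Widen the interval until both ends fall outside the slice.
                while log_f(x[:d] + [left] + x[d + 1:]) > log_y:
                    left -= widths[d]
                while log_f(x[:d] + [right] + x[d + 1:]) > log_y:
                    right += widths[d]
            # Shrinkage: propose inside the interval; on rejection, shrink it towards x[d].
            while True:
                proposal = x[:d] + [rng.uniform(left, right)] + x[d + 1:]
                log_fp = log_f(proposal)
                if log_fp > log_y:
                    x, log_fx = proposal, log_fp
                    break
                if proposal[d] < x[d]:
                    left = proposal[d]
                else:
                    right = proposal[d]
        samples.append(list(x))
    return samples
```

For example, `slice_sample_sketch(lambda x: -0.5 * sum(v * v for v in x), [0.0, 0.0], [1.0, 1.0], 1000, random.Random(0))` draws from an unnormalized 2D standard normal, which illustrates that only the log density up to a constant is needed.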

Parameters:
log_f : callable

The log pdf of the target distribution. It takes one argument with the same type and size as x0 and returns the log density of the target up to an additive constant (that is, the normalization constant of the pdf need not be known).

Note that log_f can return either a scalar (the value of the log probability density at x) or a row vector (the value of the log probability density at x for each data point; each column corresponds to a different data point). In the latter case, the total log pdf is obtained by summing the log pdf over the individual data points, and f_vals in the object returned by sample is a matrix (each row corresponds to a sampled point, each column to a different data point). Knowing the log pdf of the sampled points for each data point is useful for computing estimates of predictive error such as the widely applicable information criterion (WAIC); see [3].

x0 : ndarray, shape (D,)

Initial value of the random sample sequence. It must be within the domain of the target distribution. The number of independent variables is D.

widths : array_like, optional

Typical widths of the target distribution. Either a scalar or a 1D array. If it is a scalar, all dimensions are assumed to have the same typical width. If it is a 1D array, each element is the typical width of the marginal target distribution in that dimension. The default value of widths[i] is (UB[i]-LB[i])/2 if the i-th bounds are finite, or 10 otherwise. By default, the widths are adapted during the burn-in period, so the choice of typical widths is not crucial.

LB : array_like, optional

An array of lower bounds on the domain of the target density function, which is assumed to be zero outside the range LB <= x <= UB. If not given, no lower bounds are assumed. Set LB[i] = -inf if x[i] is unbounded below. If LB[i] == UB[i], the variable is assumed to be fixed in that dimension.

UB : array_like, optional

An array of upper bounds on the domain of the target density function, which is assumed to be zero outside the range LB <= x <= UB. If not given, no upper bounds are assumed. Set UB[i] = inf if x[i] is unbounded above. If LB[i] == UB[i], the variable is assumed to be fixed in that dimension.

options : dict, optional

A dictionary of sampler options. The possible options are:

step_out : bool, defaults to False

If True, performs the stepping-out action when the current window does not bracket the probability density. For details see [1].

display : {‘off’, ‘summary’, ‘full’}, defaults to ‘full’

Defines the level of display.

log_prior : callable, optional

Allows the user to specify a prior over x. The function log_prior takes one argument with the same type and size as x0 and returns the log prior density at x. The generated samples are then drawn from the distribution whose log density is log_f + log_prior.

adaptive : bool, defaults to True

Specifies whether to adapt widths at the end of the burn-in period based on the samples obtained so far. Disabling this works best if good estimates of the widths are already available.

diagnostics : bool, defaults to True

Specifies whether convergence diagnostics are performed at the end of the run. The diagnostic tests are from [4].
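The note on log_f above mentions that a per-data-point f_vals matrix can be used to compute WAIC. The following is a minimal sketch of the standard WAIC estimate on the deviance scale. The helper name `waic_from_pointwise` is hypothetical and not part of gpyreg, and it assumes that log_f returns the pointwise log-likelihood, with any prior supplied separately through log_prior so that it is excluded from f_vals.

```python
import math


def waic_from_pointwise(f_vals):
    """WAIC from an S x n table of pointwise log-likelihoods.

    Rows are sampled points, columns are data points. Needs S >= 2.
    """
    n_samples = len(f_vals)
    n_data = len(f_vals[0])
    lppd = 0.0    # log pointwise predictive density
    p_waic = 0.0  # effective number of parameters
    for i in range(n_data):
        column = [float(row[i]) for row in f_vals]
        # Log of the mean likelihood over samples, computed stably (log-sum-exp).
        peak = max(column)
        lppd += peak + math.log(sum(math.exp(v - peak) for v in column) / n_samples)
        # Sample variance of the log-likelihood over samples.
        mean = sum(column) / n_samples
        p_waic += sum((v - mean) ** 2 for v in column) / (n_samples - 1)
    return -2.0 * (lppd - p_waic)
```

The function only iterates over rows and entries, so it accepts a nested list as well as a 2D array.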

Raises:
ValueError

Raised when x0 is not a scalar or a 1D array.

ValueError

Raised when LB or UB are not None, scalars, or 1D arrays of the same size as x0.

ValueError

Raised when LB > UB.

ValueError

Raised when widths contains anything other than positive real numbers.

ValueError

Raised when the initial starting point x0 is outside the bounds (LB and UB).
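The constraints behind these errors, together with the documented default for widths, can be checked before constructing a sampler. The sketch below uses two hypothetical helpers, `default_widths` and `check_inputs`, which are not gpyreg functions. Skipping the width check for fixed dimensions (LB[i] == UB[i]) is an assumption of this sketch, not documented behavior.

```python
import math


def default_widths(LB, UB):
    """Documented default: half the bound range when both bounds are finite, else 10."""
    return [
        (ub - lb) / 2 if math.isfinite(lb) and math.isfinite(ub) else 10.0
        for lb, ub in zip(LB, UB)
    ]


def check_inputs(x0, LB, UB, widths):
    """Raise ValueError for the conditions listed above (hypothetical helper)."""
    if not (len(x0) == len(LB) == len(UB) == len(widths)):
        raise ValueError("x0, LB, UB and widths must have the same length")
    for x, lb, ub, w in zip(x0, LB, UB, widths):
        if lb > ub:
            raise ValueError("LB must not exceed UB")
        if not (lb <= x <= ub):
            raise ValueError("x0 must lie within [LB, UB]")
        # Assumption: a fixed dimension never moves, so its width is not checked here.
        if lb < ub and not (math.isfinite(w) and w > 0):
            raise ValueError("widths must be positive real numbers")
```

For instance, `default_widths([0.0, float("-inf")], [4.0, float("inf")])` gives a width of 2.0 for the bounded dimension and 10.0 for the unbounded one.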

Notes

Inspired by a MATLAB implementation of slice sampling by Iain Murray. See pseudo-code in [2].

References

[1] R. Neal (2003), Slice sampling, Annals of Statistics, 31(3), pp. 705-767.

[2] D. J. C. MacKay (2003), Information theory, inference and learning algorithms, Cambridge University Press, pp. 374-377.

[3] S. Watanabe (2010), Asymptotic equivalence of Bayes cross validation and widely applicable information criterion in singular learning theory, The Journal of Machine Learning Research, 11, pp. 3571-3594.

[4] A. Gelman, et al. (2013), Bayesian data analysis. Vol. 2. Boca Raton, FL, USA: Chapman & Hall/CRC.

sample(N: int, thin: int = 1, burn: int = None)

Samples an arbitrary number of points from the distribution.

Parameters:
N : int

The number of samples to return.

thin : int, optional

The thinning parameter: thin-1 out of every thin values in the generated sequence are omitted (after burn-in).

burn : int, optional

The number of initial points discarded before recording starts. On the first call to sample, the default value of burn is round(N/3) (that is, one third of the number of recorded samples); on later calls it is 0.
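The interaction of N, thin and burn can be sketched by listing which iterations of the underlying chain end up being recorded. The helper `recorded_iterations` and its `first_call` flag are hypothetical. Keeping the last of every thin iterations, rather than the first, is an assumption of this sketch and not a documented detail.

```python
def recorded_iterations(N, thin=1, burn=None, first_call=True):
    """0-based indices of the chain iterations kept in this sketch."""
    if burn is None:
        # Documented default: round(N/3) on the first call to sample, 0 afterwards.
        burn = round(N / 3) if first_call else 0
    # After burn-in, keep one iteration out of every `thin`.
    return [burn + thin * (k + 1) - 1 for k in range(N)]
```

With N=6 and thin=2 on a first call, the default burn-in is 2, so the chain runs for 2 + 6*2 = 14 iterations and the last recorded index is 13.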

Returns:
res : dict

The sampling result, a dictionary with the following keys:

samples : array_like

The actual sampled points.

f_vals : array_like

The sequence of values of the target log pdf at the sampled points. If a prior is specified in log_prior, then f_vals does NOT include the contribution of the prior.

exit_flag : {1, 0, -1, -2, -3}

Possible values and the corresponding exit conditions are:

1: Target number of recorded samples reached, with no explicit violation of convergence (this does not ensure convergence).

0: Target number of recorded samples reached, convergence status is unknown (no diagnostics have been run).

-1: No explicit violation of convergence detected, but the number of effective (independent) samples in the sampled sequence is much lower than the number of requested samples N for at least one dimension.

-2: Detected probable lack of convergence of the sampling procedure.

-3: Detected lack of convergence of the sampling procedure.

log_priors : array_like

The sequence of the values of the log prior at the sampled points.

R : array_like

Estimate of the potential scale reduction factor in each dimension.

eff_N : array_like

Estimate of the effective number of samples in each dimension.

Estimate of the effective number of samples in each dimension.
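As a rough illustration of what R measures, here is a minimal sketch of the potential scale reduction factor described in [4], computed for one dimension from several sequences of equal length. The helpers `potential_scale_reduction` and `split_halves` are hypothetical. Whether gpyreg splits a single run in this way, or uses a different variant of the statistic, is not stated in this documentation.

```python
import math
import statistics


def potential_scale_reduction(chains):
    """Potential scale reduction factor for one scalar quantity.

    `chains` holds m >= 2 sequences of equal length n >= 2.
    """
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    # W: average within-chain variance; B/n: variance of the chain means.
    W = statistics.fmean([statistics.variance(c) for c in chains])
    B_over_n = statistics.variance(chain_means)
    var_plus = (n - 1) / n * W + B_over_n
    return math.sqrt(var_plus / W)


def split_halves(sequence):
    """Split one sampled sequence into two halves of equal length."""
    half = len(sequence) // 2
    return [sequence[:half], sequence[half:2 * half]]
```

Values close to 1 suggest that the halves agree with each other, while values well above 1 indicate that the sequence has not yet mixed. For one dimension of a run, `potential_scale_reduction(split_halves(column))` would be the corresponding single-run estimate under these assumptions.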

Raises:
ValueError

Raised when thin is not a positive integer.

ValueError

Raised when burn is not an integer >= 0.

ValueError

Raised when the target log density at the initial starting point x0 does not evaluate to a real number (e.g. it is Inf or NaN).