Full DDEGP

class jetgp.full_ddegp.ddegp.ddegp(x_train, y_train, n_order, der_indices, rays, derivative_locations=None, normalize=True, sigma_data=None, kernel='SE', kernel_type='anisotropic', smoothness_parameter=None)

Bases: object

Directional Derivative-Enhanced Gaussian Process (dDEGP) model.

Supports multiple directional derivatives, hypercomplex representation, and automatic normalization. Includes methods for training, prediction, and uncertainty quantification using kernel methods.

Parameters:
  • x_train (ndarray) – Training input data of shape (n_samples, n_features).

  • y_train (list or ndarray) – Training targets or list of directional derivatives.

  • n_order (int) – Maximum derivative order.

  • der_indices (list of lists) – Derivative multi-indices corresponding to each derivative term.

  • rays (ndarray) – Array of shape (d, n_rays), where each column is a direction vector. Important: rays defines the OTI space dimension used internally (n_rays = rays.shape[1]). Every direction you may ever want to predict — including directions for which no training data exists — must appear as a column here. A direction absent from rays cannot be requested via derivs_to_predict at prediction time.

  • derivative_locations (list of lists, optional) – For each derivative in der_indices, the indices of the training points at which that derivative is observed.

  • normalize (bool, default=True) – Whether to normalize inputs and outputs.

  • sigma_data (float or array-like, optional) – Observation noise standard deviation or diagonal noise values.

  • kernel (str, default='SE') – Kernel type (‘SE’, ‘RQ’, ‘Matern’, etc.).

  • kernel_type (str, default='anisotropic') – Kernel anisotropy (‘anisotropic’ or ‘isotropic’).

  • smoothness_parameter (float, optional) – Smoothness parameter for Matern kernel.
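
As a minimal sketch of the layout expected for rays, the snippet below builds three directions in a 2-D input space and stores them as columns, giving shape (d, n_rays) = (2, 3). The angles, the test function, and all variable names are illustrative assumptions. Nothing here calls JetGP, and normalizing the columns to unit length is an assumed convention that the description above does not confirm.

```python
import numpy as np

# Illustrative directions in a 2-D input space (d = 2), given as angles.
angles = np.deg2rad([0.0, 45.0, 90.0])

# Stack the directions as COLUMNS, so rays has shape (d, n_rays).
rays = np.vstack([np.cos(angles), np.sin(angles)])

# Normalize each column to unit length (assumed convention).
rays = rays / np.linalg.norm(rays, axis=0, keepdims=True)

d, n_rays = rays.shape


def grad_f(x):
    """Gradient of the toy function f(x) = x0**2 + 3*x1."""
    return np.array([2.0 * x[0], 3.0])


# A directional derivative along a ray is the gradient projected on that ray.
x = np.array([1.0, 2.0])
directional_derivs = grad_f(x) @ rays  # shape (n_rays,)
```

Values such as directional_derivs are the kind of data a directional-derivative entry of y_train would hold, one value per training point that carries that derivative.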

optimize_hyperparameters(*args, **kwargs)

Run the optimizer to find the best kernel hyperparameters and return the optimized hyperparameter vector.

predict(X_test, params, calc_cov=False, return_deriv=False, derivs_to_predict=None)

Predict posterior mean and optional variance at test points.

Parameters:
  • X_test (ndarray) – Test input points of shape (n_test, n_features).

  • params (ndarray) – Log-scaled kernel hyperparameters.

  • calc_cov (bool, default=False) – Whether to compute predictive variance.

  • return_deriv (bool, default=False) – Whether to return derivative predictions.

  • derivs_to_predict (list, optional) –

    Specific derivatives to predict. Can include derivatives not present in the training set — the cross-covariance K_* is constructed from kernel derivatives and does not require the requested derivative to have been observed during training. Each entry must be a valid derivative spec within n_rays and n_order. If None, defaults to all derivatives used in training.

    DDEGP-specific constraint: each index must reference a ray that exists in the rays array passed at construction. For example, [[4, 1]] requires rays to have at least 4 columns. Unlike DEGP — where the OTI space always spans the fixed coordinate axes — the DDEGP OTI space is spanned by the columns of rays, so any direction not included there is inaccessible at prediction time.

Returns:

  • f_mean (ndarray) – Predictive mean vector.

  • f_var (ndarray, optional) – Predictive variance vector (only if calc_cov=True).
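
The constraint on derivs_to_predict can be checked before calling predict. The helper below is a hypothetical pre-flight check, not part of JetGP. It assumes each requested derivative is a list of [ray, order] pairs with 1-based ray numbers, as in the [[4, 1]] example, and that the orders within one entry must sum to at most n_order.

```python
def check_derivs_to_predict(derivs_to_predict, n_rays, n_order):
    """Hypothetical validator for derivative requests (not a JetGP function).

    Assumes each entry is a list of [ray, order] pairs with 1-based rays.
    """
    for spec in derivs_to_predict:
        total_order = 0
        for ray, order in spec:
            if not 1 <= ray <= n_rays:
                raise ValueError(
                    f"ray {ray} is not a column of rays (n_rays={n_rays})"
                )
            if order < 1:
                raise ValueError(f"derivative order must be >= 1, got {order}")
            total_order += order
        if total_order > n_order:
            raise ValueError(
                f"total order {total_order} exceeds n_order={n_order}"
            )
    return True


# With three rays and n_order=2, these requests are within range.
ok = check_derivs_to_predict([[[1, 1]], [[3, 2]]], n_rays=3, n_order=2)
```

Requesting [[4, 1]] with the same three-column rays would raise ValueError in this sketch, which mirrors the rule that a direction absent from rays is inaccessible at prediction time.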

jetgp.full_ddegp.ddegp_utils.deriv_map(nbases, order)

Creates a mapping from (order, index_within_order) to a single flattened index for all derivative components.

Parameters:
  • nbases (int) – Number of base dimensions.

  • order (int) – Maximum derivative order.

Returns:

map_deriv – Mapping where map_deriv[order][idx] gives the flattened index.

Return type:

list of lists
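
To illustrate the kind of mapping described, here is a minimal sketch that assigns consecutive flat indices to derivative components grouped by order. The function name deriv_map_sketch is hypothetical, and the ordering (order 0 first, starting at flat index 0) is an assumption; the actual layout used by deriv_map may differ. The count of components of order k in nbases variables is the number of degree-k monomials, C(nbases + k - 1, k).

```python
from math import comb


def deriv_map_sketch(nbases, order):
    """Hypothetical stand-in: mapping[k][idx] -> flat index (order 0 first)."""
    mapping = []
    flat = 0
    for k in range(order + 1):
        # Number of distinct derivative components of total order k.
        n_terms = comb(nbases + k - 1, k)
        mapping.append(list(range(flat, flat + n_terms)))
        flat += n_terms
    return mapping


# Two bases, up to second order: 1 + 2 + 3 = 6 components in total.
m = deriv_map_sketch(2, 2)
```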

jetgp.full_ddegp.ddegp_utils.differences_by_dim_func(X1, X2, rays, n_order, oti_module, return_deriv=True, index=-1)

Compute dimension-wise pairwise differences between X1 and X2, including hypercomplex perturbations in the directions specified by rays.

This optimized version pre-calculates the perturbation and uses a single efficient loop for subtraction, avoiding broadcasting issues with OTI arrays.

Parameters:
  • X1 (ndarray of shape (n1, d)) – First set of input points with n1 samples in d dimensions.

  • X2 (ndarray of shape (n2, d)) – Second set of input points with n2 samples in d dimensions.

  • rays (ndarray of shape (d, n_rays)) – Directional vectors for derivative computation.

  • n_order (int) – The base order used to construct hypercomplex units. When return_deriv=True, uses order 2*n_order. When return_deriv=False, uses order n_order.

  • oti_module (module) – The PyOTI static module (e.g., pyoti.static.onumm4n2).

  • return_deriv (bool, optional (default=True)) – If True, use order 2*n_order for hypercomplex units (needed for derivative-derivative blocks in training kernel). If False, use order n_order (sufficient for prediction without derivative outputs).

  • index (int, optional) – Currently unused. Reserved for future enhancements.

Returns:

differences_by_dim – A list where each element is an array of shape (n1, n2), containing the differences between corresponding dimensions of X1 and X2, augmented with directional hypercomplex perturbations.

Return type:

list of length d

Notes

  • The function leverages hypercomplex arithmetic from the pyOTI library.

  • The directional perturbation is computed as: perts = rays @ e_bases where e_bases are the hypercomplex units for each ray direction.

  • This routine is typically used in the construction of directional derivative kernels for Gaussian processes.
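
The structure of the result can be sketched with plain floats in place of OTI numbers. The snippet below is an illustration under that assumption, not the library routine: it computes only the real part of each per-dimension difference and shows which ray coefficients the perturbation perts = rays @ e_bases attaches to each dimension at first order.

```python
import numpy as np

X1 = np.array([[1.0, 2.0], [3.0, 4.0]])  # shape (n1, d)
X2 = np.array([[1.5, 2.5], [3.5, 4.5]])  # shape (n2, d)
rays = np.array([[1.0, 0.0, 1.0],
                 [0.0, 1.0, 1.0]])       # shape (d, n_rays)

d = X1.shape[1]

# Real part: one (n1, n2) array of pairwise differences per input dimension.
real_diffs = [X1[:, k][:, None] - X2[:, k][None, :] for k in range(d)]

# First-order part: dimension k carries the perturbation
# sum_j rays[k, j] * e_j, so row k of rays holds its coefficients.
first_order_coeffs = [rays[k, :] for k in range(d)]
```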

Example

>>> import numpy as np
>>> from jetgp.full_ddegp.ddegp_utils import differences_by_dim_func
>>> X1 = np.array([[1.0, 2.0], [3.0, 4.0]])
>>> X2 = np.array([[1.5, 2.5], [3.5, 4.5]])
>>> rays = np.eye(2)  # Standard basis directions
>>> n_order = 1
>>> oti_module = get_oti_module(2, 1)  # dim=2, n_order=1
>>> diffs = differences_by_dim_func(X1, X2, rays, n_order, oti_module)
>>> len(diffs)
2
>>> diffs[0].shape
(2, 2)

jetgp.full_ddegp.ddegp_utils.extract_and_assign(content_full, row_indices, col_indices, K, row_start, col_start, sign)

Extract submatrix and assign directly to K with sign multiplication. Combines extraction and assignment in one pass for better performance.

Parameters:
  • content_full (ndarray of shape (n_rows_full, n_cols_full)) – Source matrix.

  • row_indices (ndarray of int64) – Row indices to extract.

  • col_indices (ndarray of int64) – Column indices to extract.

  • K (ndarray) – Target matrix to fill.

  • row_start (int) – Starting row index in K.

  • col_start (int) – Starting column index in K.

  • sign (float) – Sign multiplier (+1.0 or -1.0).
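
The documented behavior can be expressed as a plain NumPy reference. The function below is a hypothetical helper written for illustration; it uses np.ix_ for clarity, whereas the library routine is described as a single-pass replacement for that operation.

```python
import numpy as np


def extract_and_assign_reference(content_full, row_indices, col_indices,
                                 K, row_start, col_start, sign):
    """Plain-NumPy reference for the documented behavior (hypothetical)."""
    n_r, n_c = len(row_indices), len(col_indices)
    # Pick the requested rows and columns of the source matrix.
    block = content_full[np.ix_(row_indices, col_indices)]
    # Write the signed block into K, starting at (row_start, col_start).
    K[row_start:row_start + n_r, col_start:col_start + n_c] = sign * block


content_full = np.arange(16, dtype=float).reshape(4, 4)
K = np.zeros((3, 3))
extract_and_assign_reference(
    content_full, np.array([0, 2]), np.array([1, 3]), K, 1, 0, -1.0
)
```

K is modified in place, so a caller can fill one large preallocated matrix block by block.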

jetgp.full_ddegp.ddegp_utils.extract_cols(content_full, col_indices, n_rows)

Extract columns from content_full at specified indices.

Parameters:
  • content_full (ndarray of shape (n_rows, n_cols_full)) – Source matrix.

  • col_indices (ndarray of int64) – Column indices to extract.

  • n_rows (int) – Number of rows.

Returns:

result – Extracted columns.

Return type:

ndarray of shape (n_rows, len(col_indices))

jetgp.full_ddegp.ddegp_utils.extract_cols_and_assign(content_full, col_indices, K, row_start, col_start, n_rows, sign)

Extract columns and assign directly to K with sign multiplication.

Parameters:
  • content_full (ndarray of shape (n_rows, n_cols_full)) – Source matrix.

  • col_indices (ndarray of int64) – Column indices to extract.

  • K (ndarray) – Target matrix to fill.

  • row_start (int) – Starting row index in K.

  • col_start (int) – Starting column index in K.

  • n_rows (int) – Number of rows to copy.

  • sign (float) – Sign multiplier (+1.0 or -1.0).

jetgp.full_ddegp.ddegp_utils.extract_rows(content_full, row_indices, n_cols)

Extract rows from content_full at specified indices.

Parameters:
  • content_full (ndarray of shape (n_rows_full, n_cols)) – Source matrix.

  • row_indices (ndarray of int64) – Row indices to extract.

  • n_cols (int) – Number of columns.

Returns:

result – Extracted rows.

Return type:

ndarray of shape (len(row_indices), n_cols)

jetgp.full_ddegp.ddegp_utils.extract_rows_and_assign(content_full, row_indices, K, row_start, col_start, n_cols, sign)

Extract rows and assign directly to K with sign multiplication.

Parameters:
  • content_full (ndarray of shape (n_rows_full, n_cols)) – Source matrix.

  • row_indices (ndarray of int64) – Row indices to extract.

  • K (ndarray) – Target matrix to fill.

  • row_start (int) – Starting row index in K.

  • col_start (int) – Starting column index in K.

  • n_cols (int) – Number of columns to copy.

  • sign (float) – Sign multiplier (+1.0 or -1.0).

jetgp.full_ddegp.ddegp_utils.extract_submatrix(content_full, row_indices, col_indices)

Extract submatrix from content_full at specified row and column indices. Replaces the expensive np.ix_ operation.

Parameters:
  • content_full (ndarray of shape (n_rows_full, n_cols_full)) – Source matrix.

  • row_indices (ndarray of int64) – Row indices to extract.

  • col_indices (ndarray of int64) – Column indices to extract.

Returns:

result – Extracted submatrix.

Return type:

ndarray of shape (len(row_indices), len(col_indices))

jetgp.full_ddegp.ddegp_utils.precompute_kernel_plan(n_order, n_bases, der_indices, powers, index)

Precompute all structural information needed by rbf_kernel so it can be reused across repeated calls with different phi_exp values.

Returns a dict containing flat indices, signs, index arrays, precomputed offsets/sizes, and mult_dir results for the dd block.

jetgp.full_ddegp.ddegp_utils.rbf_kernel(phi, phi_exp, n_order, n_bases, der_indices, powers, index=-1)

Assembles the full DDEGP covariance matrix from a precomputed derivative array, filling the matrix block by block.

Supports both uniform blocks (all derivatives at all points) and non-contiguous indices (different derivatives at different subsets of points).

This version uses Numba-accelerated functions for efficient matrix slicing, replacing expensive np.ix_ operations.

Parameters:
  • phi (OTI array) – Base kernel matrix from kernel_func(differences, length_scales).

  • phi_exp (ndarray) – Expanded derivative array from phi.get_all_derivs().

  • n_order (int) – Maximum derivative order considered.

  • n_bases (int) – Number of OTI bases, which for DDEGP is the number of rays (n_rays) rather than the input dimension d.

  • der_indices (list of lists) – Multi-index derivative structures for each derivative component.

  • powers (list of int) – Powers of (-1) applied to each term (for symmetry or sign conventions).

  • index (list of lists or int, optional (default=-1)) – Training point indices for each derivative type. If an empty list is given, all derivative types are assumed to apply to all training points. Otherwise, each inner list gives the training point indices that carry the corresponding derivative type, which allows non-contiguous indices and variable block sizes.

Returns:

K – Full kernel matrix with function values and derivative blocks.

Return type:

ndarray
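
The block layout of such a matrix can be sketched for the simplest case: a 1-D squared-exponential kernel with function values and first derivatives at every training point. This is an illustration with analytic derivatives and NumPy only, not the OTI-based assembly used by rbf_kernel. It also shows where a sign factor of the form (-1)**p comes from, although the exact sign convention behind powers is not specified above.

```python
import numpy as np


def se_kernel_with_derivatives(x, length_scale=1.0):
    """1-D SE kernel with value and first-derivative blocks (illustrative)."""
    r = x[:, None] - x[None, :]                  # r[i, j] = x_i - x_j
    k = np.exp(-0.5 * r**2 / length_scale**2)    # cov(f(x_i), f(x_j))
    k_fd = (r / length_scale**2) * k             # cov(f(x_i), f'(x_j))
    k_df = -k_fd                                 # cov(f'(x_i), f(x_j)), sign (-1)**1
    k_dd = (1.0 / length_scale**2 - r**2 / length_scale**4) * k  # cov(f', f')

    n = x.size
    K = np.empty((2 * n, 2 * n))
    K[:n, :n] = k       # function-function block
    K[:n, n:] = k_fd    # function-derivative block
    K[n:, :n] = k_df    # derivative-function block
    K[n:, n:] = k_dd    # derivative-derivative block
    return K


x_train = np.array([0.0, 0.5, 1.2])
K = se_kernel_with_derivatives(x_train)
```

The derivative-function block is the transpose of the function-derivative block, so the assembled matrix is symmetric, as a covariance matrix must be.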

jetgp.full_ddegp.ddegp_utils.rbf_kernel_fast(phi_exp_3d, plan, out=None)

Fast kernel assembly using a precomputed plan and fused numba kernel.

Parameters:
  • phi_exp_3d (ndarray of shape (n_derivs, n_rows_func, n_cols_func)) – Pre-reshaped expanded derivative array.

  • plan (dict) – Precomputed plan from precompute_kernel_plan().

  • out (ndarray, optional) – Pre-allocated output array. If None, a new array is allocated.

Returns:

K – Full kernel matrix.

Return type:

ndarray

jetgp.full_ddegp.ddegp_utils.rbf_kernel_predictions(phi, phi_exp, n_order, n_bases, der_indices, powers, return_deriv, index=-1, common_derivs=None, calc_cov=False, powers_predict=None)

Constructs the RBF kernel matrix for predictions with directional derivative entries.

This handles the asymmetric case where:

  • Rows: test points (predictions).

  • Columns: training points (with derivative structure from index).

This version uses Numba-accelerated functions for efficient matrix slicing.

Parameters:
  • phi (OTI array) – Base kernel matrix between test and training points.

  • phi_exp (ndarray) – Expanded derivative array from phi.get_all_derivs().

  • n_order (int) – Maximum derivative order.

  • n_bases (int) – Number of OTI bases, which for DDEGP is the number of rays (n_rays) rather than the input dimension d.

  • der_indices (list) – Derivative specifications for training data.

  • powers (list of int) – Sign powers for each derivative type.

  • return_deriv (bool) – If True, predict derivatives at ALL test points.

  • index (list of lists or int, optional (default=-1)) – Training point indices for each derivative type.

  • common_derivs (list, optional) – Common derivative indices to predict (intersection of training and requested).

  • calc_cov (bool, optional (default=False)) – If True, the matrix is being built for a covariance computation, so all indices are used for the rows.

  • powers_predict (list of int, optional) – Sign powers for prediction derivatives.

Returns:

K – Prediction kernel matrix.

Return type:

ndarray
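
How a rows-are-test, columns-are-training matrix is used can be sketched with a function-value-only GP in one dimension. The kernel helper se, the jitter value, and the data are illustrative assumptions; the library version additionally carries derivative columns and rows.

```python
import numpy as np


def se(a, b, length_scale=1.0):
    """Squared-exponential kernel between two 1-D point sets."""
    r = a[:, None] - b[None, :]
    return np.exp(-0.5 * r**2 / length_scale**2)


x_train = np.array([0.0, 0.5, 1.0, 1.5])
y_train = np.sin(x_train)
x_test = np.array([0.25, 0.75, 1.25])

# Training covariance with a small jitter for numerical stability.
K = se(x_train, x_train) + 1e-10 * np.eye(x_train.size)

# Cross-covariance: rows are test points, columns are training points.
K_star = se(x_test, x_train)

# Posterior mean: K_star @ K^{-1} y, computed with a linear solve.
f_mean = K_star @ np.linalg.solve(K, y_train)
```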

jetgp.full_ddegp.ddegp_utils.transform_der_indices(der_indices, der_map)

Transforms a list of user-facing derivative specifications into the internal (order, index) format and the final flattened index.

Parameters:
  • der_indices (list) – User-facing derivative specifications.

  • der_map (list of lists) – Derivative mapping from deriv_map().

Returns:

  • deriv_ind_transf (list) – Flattened indices for each derivative.

  • deriv_ind_order (list) – (index, order) tuples for each derivative.

class jetgp.full_ddegp.optimizer.Optimizer(model)

Bases: object

Optimizer class to perform hyperparameter tuning for derivative-enhanced Gaussian Process models by minimizing the negative log marginal likelihood (NLL).

Parameters:

model (object) – An instance of a model (e.g., ddegp) containing the necessary training data and kernel configuration.

negative_log_marginal_likelihood(x0)

Compute the negative log marginal likelihood (NLL) of the model.

NLL = 0.5 * y^T K^-1 y + 0.5 * log|K| + 0.5 * N * log(2π)

Parameters:

x0 (ndarray) – Vector of log-scaled hyperparameters (length scales and noise).

Returns:

Value of the negative log marginal likelihood.

Return type:

float
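
The formula above is commonly evaluated through a Cholesky factorization rather than an explicit inverse. The sketch below shows that standard approach with NumPy for a given K and y. It is not taken from the JetGP source, and the function name nll_cholesky is hypothetical.

```python
import numpy as np


def nll_cholesky(K, y):
    """NLL = 0.5*y^T K^-1 y + 0.5*log|K| + 0.5*N*log(2*pi) via Cholesky."""
    L = np.linalg.cholesky(K)  # K = L @ L.T with L lower triangular
    # alpha = K^{-1} y from two solves. np.linalg.solve is a general solver;
    # scipy.linalg.solve_triangular would exploit the triangular structure.
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    # log|K| = 2 * sum(log(diag(L)))
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    n = y.size
    return 0.5 * y @ alpha + 0.5 * log_det + 0.5 * n * np.log(2.0 * np.pi)


K = np.array([[2.0, 0.5],
              [0.5, 1.0]])
y = np.array([1.0, -1.0])
nll = nll_cholesky(K, y)
```

Taking the log-determinant from the diagonal of L avoids forming the determinant of K, which can overflow or underflow for large matrices.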

nll_and_grad(x0)

Compute the NLL and its gradient in a single pass, sharing a single Cholesky factorization between them.

nll_grad(x0)

Analytic gradient of the NLL w.r.t. log10-scaled hyperparameters.
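
Differentiating with respect to a log10-scaled parameter adds a chain-rule factor. Assuming a hyperparameter is stored as x = log10(theta), the gradient is dNLL/dx = dNLL/dtheta * theta * ln(10). The sketch below checks this against a central finite difference, using a toy one-parameter function as a stand-in for the NLL; the functions are illustrative and not part of JetGP.

```python
import math


def nll_theta(theta):
    """Toy stand-in for an NLL depending on one positive hyperparameter."""
    return 0.5 / theta + 0.5 * math.log(theta)


def dnll_dtheta(theta):
    """Analytic derivative of the toy function w.r.t. theta."""
    return -0.5 / theta**2 + 0.5 / theta


def dnll_dx(x):
    """Gradient w.r.t. x = log10(theta) via the chain rule."""
    theta = 10.0 ** x
    return dnll_dtheta(theta) * theta * math.log(10.0)


# Central finite difference in x as an independent check.
x0 = 0.3
h = 1e-6
fd = (nll_theta(10.0 ** (x0 + h)) - nll_theta(10.0 ** (x0 - h))) / (2.0 * h)
analytic = dnll_dx(x0)
```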

nll_wrapper(x0)

Wrapper function to compute NLL for optimizer.

Parameters:

x0 (ndarray) – Hyperparameter vector.

Returns:

NLL evaluated at x0.

Return type:

float

optimize_hyperparameters(optimizer='pso', **kwargs)

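Optimize the DDEGP model hyperparameters by minimizing the NLL. The default optimizer is Particle Swarm Optimization (optimizer='pso'); the PSO settings listed below are supplied as keyword arguments.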

Parameters:
  • n_restart_optimizer (int, default=20) – Maximum number of iterations for PSO.

  • swarm_size (int, default=20) – Number of particles in the swarm.

  • verbose (bool, default=True) – Controls verbosity of PSO output.

Returns:

best_x – The optimal set of hyperparameters found.

Return type:

ndarray