eulerpi.core.kde module

This module provides functions to handle the Kernel Density Estimation (KDE) in EPI.

It is used in the EPI algorithm (see eulerpi.core.transformations.evaluate_density) to evaluate the density of the transformed data distribution at the simulation results.

calc_kernel_width(data: Array) → Array

Computes the width of the kernels used for density estimation of the data according to Silverman's rule.

Parameters: data (jnp.ndarray) – data for the model: 2D array with shape (#Samples, #MeasurementDimensions)
Returns: kernel width for each data dimension, shape: (#MeasurementDimensions,)
Return type: jnp.ndarray

Note

Make sure to always use 2D arrays as data, especially when the data dimension is only one.

The data object should be shaped (#Samples, 1) and not (#Samples,) in this case.
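The shape requirement in the note can be met by adding a trailing axis to one-dimensional data. The sketch below uses plain NumPy for illustration (an assumption made for portability; jax.numpy arrays accept the same indexing), and the variable names are hypothetical.

```python
import numpy as np

# five scalar measurements, stored as a 1D array of shape (5,)
measurements = np.array([0.1, 0.4, 0.35, 0.8, 0.6])

# add a trailing axis so the array has shape (#Samples, 1)
data = measurements[:, None]

print(measurements.shape)  # (5,)
print(data.shape)          # (5, 1)
```

The equivalent call measurements.reshape(-1, 1) produces the same 2D layout.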

Examples:

import jax.numpy as jnp
from eulerpi.core.kde import calc_kernel_width

# create 4 data points of dimension 2 and store them in a 2D jax.numpy array
data = jnp.array([[0,0], [0,2], [1,0], [1,2]])

scales = calc_kernel_width(data)
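For orientation, a common multivariate form of Silverman's rule scales the sample standard deviation of each dimension by (n (d + 2) / 4)^(-1 / (d + 4)), where n is the number of samples and d the number of dimensions. The NumPy sketch below implements this textbook form. Whether calc_kernel_width uses exactly this variant (for instance the same ddof) is an assumption, and silverman_width is a hypothetical name, not part of eulerpi.

```python
import numpy as np

def silverman_width(data: np.ndarray) -> np.ndarray:
    """Textbook multivariate Silverman rule: one width per data dimension."""
    n_samples, n_dims = data.shape  # unpacking fails unless data is 2D
    # sample standard deviation per dimension (ddof=1 is an assumption)
    stdevs = np.std(data, axis=0, ddof=1)
    # scaling factor that shrinks with more samples
    factor = (n_samples * (n_dims + 2) / 4.0) ** (-1.0 / (n_dims + 4))
    return stdevs * factor

# same four 2D data points as in the example above
data = np.array([[0, 0], [0, 2], [1, 0], [1, 2]], dtype=float)
widths = silverman_width(data)
print(widths.shape)  # (2,)
```

Because the second dimension of this data set has twice the spread of the first, the second width comes out twice as large as the first.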

eval_kde_cauchy(data: Array, sim_res: Array, scales: Array) → float64 | Array

Evaluates a Cauchy Kernel Density estimator at one or several simulation results. Assumes that each data point is a potentially high-dimensional sample from a joint data distribution. This is the case, for example, for time-series data, where each evaluation time is one dimension of the data point. In the following formula, x denotes the evaluation points (sim_res) and y the data.

\[density_{i} = \frac{1}{samples} \sum_{s=1}^{samples} \prod_{d=1}^{dims} \frac{1}{\left(\left(\frac{x_{i,d} - y_{s,d}}{scales_d}\right)^2 + 1\right) \; \pi \; scales_d}\]

Parameters:

data (jnp.ndarray) – data for the model: 2D array with shape (#Samples, #MeasurementDimensions)
sim_res (jnp.ndarray) – evaluation coordinates array of shape (#nEvals, #MeasurementDimensions) or (#MeasurementDimensions,)
scales (jnp.ndarray) – one scale for each dimension

Returns:

estimated kernel density evaluated at the simulation result(s), shape: (#nEvals,) or ()

Return type:

Union[jnp.double, jnp.ndarray]
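The estimator can be sketched directly from its definition: a product of one-dimensional Cauchy kernels over the dimensions, averaged over the data samples. The version below uses plain NumPy and the standard Cauchy density 1 / (π s (1 + u²)) as the kernel. The name cauchy_kde is hypothetical, and the code is an illustrative reimplementation under these assumptions, not the eulerpi source.

```python
import numpy as np

def cauchy_kde(data, sim_res, scales):
    """Product-Cauchy KDE; returns one density per evaluation point."""
    data = np.asarray(data, dtype=float)      # (#Samples, #Dims)
    points = np.atleast_2d(np.asarray(sim_res, dtype=float))  # (#nEvals, #Dims)
    scales = np.asarray(scales, dtype=float)  # (#Dims,)
    # scaled differences, shape (#nEvals, #Samples, #Dims)
    u = (points[:, None, :] - data[None, :, :]) / scales
    # one-dimensional Cauchy kernel for every (point, sample, dimension)
    kernels = 1.0 / ((u**2 + 1.0) * np.pi * scales)
    # product over dimensions, then mean over samples
    return kernels.prod(axis=2).mean(axis=1)

data = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
density = cauchy_kde(data, np.array([[0.5, 0.5]]), np.array([1.0, 1.0]))
print(density.shape)  # (1,)
```

Unlike the documented function, this sketch always returns an array of shape (#nEvals,), even when sim_res is a single point of shape (#MeasurementDimensions,).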

eval_kde_gauss(data: Array, sim_res: Array, scales: Array) → float64 | Array

Evaluates a Gaussian Kernel Density estimator at one or several simulation results. Assumes that each data point is a potentially high-dimensional sample from a joint data distribution. This is the case, for example, for time-series data, where each evaluation time is one dimension of the data point. While it is possible to define different standard deviations for different measurement dimensions, it is not yet possible to define covariances.

Parameters:

data (jnp.ndarray) – data for the model: 2D array with shape (#Samples, #MeasurementDimensions)
sim_res (jnp.ndarray) – evaluation coordinates array of shape (#nEvals, #MeasurementDimensions) or (#MeasurementDimensions,)
scales (jnp.ndarray) – one scale for each dimension

Returns:

estimated kernel density evaluated at the simulation result(s), shape: (#nEvals,) or ()

Return type:

Union[jnp.double, jnp.ndarray]

Note

Make sure to always use 2D arrays as data, especially when the data dimension is only one.

The data object should be shaped (#Samples, 1) and not (#Samples,) in this case.

Examples:

import jax.numpy as jnp
from eulerpi.core.kde import eval_kde_gauss

# create 4 data points of dimension 2 and store them in a 2D jax.numpy array
data = jnp.array([[0,0], [0,1], [1,0], [1,1]])

# we intend to evaluate the kernel density estimator at the point (0.5, 0.5)
evaluation_coordinates = jnp.array([[0.5, 0.5]])

# the dimension-specific kernel bandwidths are set to 1
scales = jnp.array([1,1])

kde_res = eval_kde_gauss(data, evaluation_coordinates, scales)
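As a cross-check for the example above, the estimator can be reproduced by hand. The NumPy sketch below assumes that the estimator is a product of independent one-dimensional normal kernels with standard deviation scales_d, averaged over the data samples. This matches the statement that covariances are not supported, but it is an assumption about the exact formula. The name gauss_kde is hypothetical.

```python
import numpy as np

def gauss_kde(data, sim_res, scales):
    """Product-Gaussian KDE; returns one density per evaluation point."""
    data = np.asarray(data, dtype=float)      # (#Samples, #Dims)
    points = np.atleast_2d(np.asarray(sim_res, dtype=float))  # (#nEvals, #Dims)
    scales = np.asarray(scales, dtype=float)  # (#Dims,)
    # scaled differences, shape (#nEvals, #Samples, #Dims)
    u = (points[:, None, :] - data[None, :, :]) / scales
    # one-dimensional normal density for every (point, sample, dimension)
    kernels = np.exp(-0.5 * u**2) / (np.sqrt(2.0 * np.pi) * scales)
    # product over dimensions, then mean over samples
    return kernels.prod(axis=2).mean(axis=1)

data = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
kde_sketch = gauss_kde(data, np.array([[0.5, 0.5]]), np.array([1.0, 1.0]))
print(kde_sketch.shape)  # (1,)
```

For this symmetric configuration every data point contributes the same term, so under the stated assumptions the density at (0.5, 0.5) equals exp(-0.25) / (2π).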