eulerpi.core.sampling module

Module implementing inference with Markov chain Monte Carlo (MCMC) sampling.

Inference with a sampling-based approach approximates the (joint/marginal) parameter distribution(s) by computing a parameter Markov chain with multiple walkers in parallel. This module is currently based on the emcee package.

Note

The functions in this module are mainly intended for internal use and are accessed by the inference function. Read the documentation of inference_mcmc to learn more about the available options for MCMC-based inference.
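For orientation, the emcee mechanics this module builds on look roughly as follows. This is a minimal, self-contained sketch using emcee's public API directly; the actual module additionally handles model evaluation, data transformation, and result management.

    import emcee
    import numpy as np

    # Toy log-density: a standard normal in two dimensions.
    def log_prob(theta):
        return -0.5 * np.sum(theta**2)

    ndim, num_walkers, num_steps = 2, 10, 2500

    # Each walker is one Markov chain; emcee advances the whole ensemble together.
    initial_positions = np.random.randn(num_walkers, ndim)
    sampler = emcee.EnsembleSampler(num_walkers, ndim, log_prob)
    sampler.run_mcmc(initial_positions, num_steps)

    # Flatten the (num_steps, num_walkers, ndim) chain into a single sample array.
    samples = sampler.get_chain(flat=True)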

calc_walker_acceptance(model: Model, slice: ndarray, num_walkers: int, num_burn_in_samples: int, result_manager: ResultManager)[source]

Calculate the acceptance ratio for each individual walker of the emcee chain. This is especially useful for finding “zombie” walkers that never move.

Parameters:
  • model (Model) – The model for which the acceptance ratio should be calculated

  • slice (np.ndarray) – slice for which the acceptance ratio should be calculated

  • num_walkers (int) – number of walkers in the emcee chain

  • num_burn_in_samples (int) – Number of samples that will be discarded (burned) per chain (i.e. per walker). Only used for MCMC inference.

  • result_manager (ResultManager) – ResultManager to load the results from

Returns:

Array with the acceptance ratio for each walker

Return type:

np.ndarray
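The idea can be illustrated directly on a raw chain: a rejected proposal leaves a walker exactly where it was, so the per-walker acceptance ratio is the fraction of steps on which the walker moved. The sketch below is illustrative, not the module's actual implementation; emcee itself exposes the same quantity as sampler.acceptance_fraction.

    import numpy as np

    def walker_acceptance(chain: np.ndarray) -> np.ndarray:
        # chain has shape (num_steps, num_walkers, num_params),
        # as returned by emcee's get_chain().
        moved = np.any(chain[1:] != chain[:-1], axis=2)
        return moved.mean(axis=0)  # one acceptance ratio per walker

    # Walkers with a ratio near 0 are "zombies" and bias the sample.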

inference_mcmc(model: Model, data: ndarray, data_transformation: DataTransformation, result_manager: ResultManager, slices: list[ndarray], num_processes: int, num_runs: int = 1, num_walkers: int = 10, num_steps: int = 2500, num_burn_in_samples: int | None = None, thinning_factor: int | None = None, get_walker_acceptance: bool = False) → Tuple[Dict[str, ndarray], Dict[str, ndarray], Dict[str, ndarray], ResultManager][source]

This function runs MCMC sampling for the given model and data.

Parameters:
  • model (Model) – The model describing the mapping from parameters to data.

  • data (np.ndarray) – The data to be used for the inference.

  • data_transformation (DataTransformation) – The data transformation used to normalize the data.

  • result_manager (ResultManager) – The result manager to be used for the inference.

  • slices (list[np.ndarray]) – A list of parameter slices to be used for the inference.

  • num_processes (int) – The number of processes to be used for the inference.

  • num_runs (int, optional) – The number of runs to be used for the inference. For each run except the first, all walkers continue from the end position of the previous run; this parameter does not affect the number of Markov chains, only how often results for each chain are saved. Defaults to 1.

  • num_walkers (int, optional) – The number of walkers to be used for the inference. Corresponds to the number of Markov chains. Defaults to 10.

  • num_steps (int, optional) – The number of steps each walker takes per run. Defaults to 2500.

  • num_burn_in_samples (int, optional) – Number of samples to be discarded as burn-in. Defaults to None, which means a burn-in of 10% of the total number of samples.

  • thinning_factor (int, optional) – Thinning factor for the samples. Defaults to None, which means no thinning.

  • get_walker_acceptance (bool, optional) – If True, the acceptance rate of the walkers is calculated and printed. Defaults to False.

Returns:

The parameter samples, the corresponding simulation results, the corresponding density evaluations for each slice and the result manager used for the inference.

Return type:

Tuple[Dict[str, np.ndarray], Dict[str, np.ndarray], Dict[str, np.ndarray], ResultManager]
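To make the burn-in and thinning defaults concrete, here is their combined effect on a mock chain. The 10% figure is the documented default from above; the slicing is only an illustration of the arithmetic, not the module's actual indexing.

    import numpy as np

    num_runs, num_walkers, num_steps = 1, 10, 2500
    total_per_walker = num_runs * num_steps

    # Default burn-in (num_burn_in_samples=None): 10% of the samples per chain.
    num_burn_in_samples = total_per_walker // 10   # 250
    thinning_factor = 5                            # None would mean no thinning

    chain = np.zeros((total_per_walker, num_walkers, 3))  # mock (steps, walkers, params)
    kept = chain[num_burn_in_samples::thinning_factor]
    print(kept.shape)  # (450, 10, 3): (2500 - 250) / 5 samples remain per walker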

run_emcee_once(model: Model, data: ndarray, data_transformation: DataTransformation, data_stdevs: ndarray, slice: ndarray, initial_walker_positions: ndarray, num_walkers: int, num_steps: int, num_processes: int) → ndarray[source]

Run the emcee ensemble sampler once.

Parameters:
  • model (Model) – The model which will be sampled

  • data (np.ndarray) – The data used for the inference

  • data_transformation (DataTransformation) – The data transformation used to normalize the data.

  • data_stdevs (np.ndarray) – kernel width for the data

  • slice (np.ndarray) – slice of the parameter space which will be sampled

  • initial_walker_positions (np.ndarray) – initial parameter values for the walkers

  • num_walkers (int) – number of walkers (ensemble members) in the emcee sampler

  • num_steps (int) – number of steps each walker takes before the sub-run is stored

  • num_processes (int) – number of parallel processes

Returns:

samples from the transformed parameter density

Return type:

np.ndarray
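A single sub-run of this kind corresponds to the following emcee pattern with a process pool. This is a sketch of the general technique under the assumption of a top-level log-density function, not the module's actual code.

    from multiprocessing import Pool

    import emcee
    import numpy as np

    def log_prob(theta):
        # Placeholder for the transformed parameter density of the model.
        return -0.5 * np.sum(theta**2)

    num_walkers, num_steps, num_processes = 10, 500, 4
    initial_walker_positions = np.random.randn(num_walkers, 2)

    if __name__ == "__main__":
        with Pool(num_processes) as pool:
            sampler = emcee.EnsembleSampler(
                num_walkers, initial_walker_positions.shape[1], log_prob, pool=pool
            )
            sampler.run_mcmc(initial_walker_positions, num_steps)
        samples = sampler.get_chain()  # shape (num_steps, num_walkers, ndim)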

run_emcee_sampling(model: Model, data: ndarray, data_transformation: DataTransformation, slice: ndarray, result_manager: ResultManager, num_processes: int, num_runs: int, num_walkers: int, num_steps: int, num_burn_in_samples: int, thinning_factor: int) → Tuple[ndarray, ndarray, ndarray][source]

Create a representative sample from the transformed parameter density using the emcee ensemble sampler.

Initial values are not stored in the chain, and each file contains num_steps blocks of size num_walkers.

Parameters:
  • model (Model) – The model which will be sampled

  • data (np.ndarray) – The data used for the inference

  • data_transformation (DataTransformation) – The data transformation used to normalize the data.

  • slice (np.ndarray) – slice of the parameter space which will be sampled

  • result_manager (ResultManager) – ResultManager which will store the results

  • num_processes (int) – number of parallel processes.

  • num_runs (int) – number of stored sub-runs.

  • num_walkers (int) – number of walkers (ensemble members) in the emcee sampler.

  • num_steps (int) – number of steps each walker takes before the sub-run is stored.

  • num_burn_in_samples (int) – Number of samples that will be discarded (burned) per chain (i.e. per walker). Only used for MCMC inference.

  • thinning_factor (int) – thinning factor for the samples.

Returns:

Array with all parameter samples, array with the corresponding simulation results, and array with the corresponding density evaluations

Return type:

Tuple[np.ndarray, np.ndarray, np.ndarray]
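The storage layout described above (per sub-run, num_steps blocks of num_walkers samples each) can be pictured as follows. The reshaping is illustrative only; the actual file handling lives in the ResultManager.

    import numpy as np

    num_runs, num_walkers, num_steps, num_params = 2, 4, 3, 2

    # One flat array of num_steps * num_walkers samples per stored sub-run.
    sub_runs = [np.zeros((num_steps * num_walkers, num_params))
                for _ in range(num_runs)]

    # Concatenating the sub-runs recovers the full flat chain ...
    flat_chain = np.concatenate(sub_runs, axis=0)

    # ... and reshaping separates steps from walkers again.
    per_walker = flat_chain.reshape(num_runs * num_steps, num_walkers, num_params)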