eulerpi.core.sampling module
Module implementing the inference with Markov Chain Monte Carlo (MCMC) sampling.
The sampling-based inference approach approximates the (joint/marginal) parameter distribution(s) by computing a Markov chain over the parameter space using multiple walkers in parallel. This module is currently based on the emcee package.
Note
The functions in this module are mainly intended for internal use and are accessed by the inference function.
Read the documentation of inference_mcmc to learn more about the available options for MCMC-based inference.
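For orientation, the following is a minimal sketch of the sampling pattern this module builds on: an emcee EnsembleSampler advancing several walkers in parallel over a log-density. The log_density function and all dimensions here are illustrative assumptions, not eulerpi's actual implementation:

    import emcee
    import numpy as np

    # Hypothetical target: log-density of a standard normal in 2 dimensions.
    def log_density(theta: np.ndarray) -> float:
        return -0.5 * np.sum(theta**2)

    num_walkers, num_dims, num_steps = 10, 2, 2500
    initial_positions = 0.1 * np.random.randn(num_walkers, num_dims)

    sampler = emcee.EnsembleSampler(num_walkers, num_dims, log_density)
    sampler.run_mcmc(initial_positions, num_steps)
    chain = sampler.get_chain()  # shape: (num_steps, num_walkers, num_dims)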
- calc_walker_acceptance(model: BaseModel, slice: ndarray, num_walkers: int, num_burn_in_samples: int, result_manager: ResultManager)[source]
Calculate the acceptance ratio for each individual walker of the emcee chain. This is especially useful for finding “zombie” walkers that never move.
- Parameters:
model (BaseModel) – The model for which the acceptance ratio should be calculated
slice (np.ndarray) – The parameter slice for which the acceptance ratio should be calculated
num_walkers (int) – The number of walkers in the emcee chain
num_burn_in_samples (int) – The number of samples that will be discarded (burned) per chain (i.e. per walker). Only used for MCMC-based inference.
result_manager (ResultManager) – ResultManager to load the results from
- Returns:
Array with the acceptance ratio for each walker
- Return type:
np.ndarray
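As a rough sketch of the underlying idea (not the actual implementation, which loads the chain through the ResultManager), the per-walker acceptance ratio can be estimated from a stored chain by counting the steps in which a walker actually moved; a “zombie” walker yields a ratio near zero. The helper name and chain layout below are assumptions:

    import numpy as np

    def walker_acceptance_from_chain(chain: np.ndarray) -> np.ndarray:
        # Hypothetical helper; expects a chain of shape
        # (num_steps, num_walkers, num_dims).
        # A proposal counts as accepted whenever a walker's position
        # changed between consecutive steps.
        moved = np.any(chain[1:] != chain[:-1], axis=-1)
        return moved.mean(axis=0)  # acceptance ratio per walker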
- inference_mcmc(model: BaseModel, data: ndarray, data_transformation: DataTransformation, result_manager: ResultManager, slices: list[ndarray], num_processes: int, num_runs: int = 1, num_walkers: int = 10, num_steps: int = 2500, num_burn_in_samples: int | None = None, thinning_factor: int | None = None, get_walker_acceptance: bool = False) Tuple[Dict[str, ndarray], Dict[str, ndarray], Dict[str, ndarray], ResultManager] [source]
This function runs MCMC sampling for the given model and data.
- Parameters:
model (BaseModel) – The model describing the mapping from parameters to data.
data (np.ndarray) – The data to be used for the inference.
data_transformation (DataTransformation) – The data transformation used to normalize the data.
result_manager (ResultManager) – The result manager to be used for the inference.
slices (list[np.ndarray]) – A list of slices to be used for the inference.
num_processes (int) – The number of processes to be used for the inference.
num_runs (int, optional) – The number of runs to be used for the inference. For each run except the first, all walkers continue from the end position of the previous run; this parameter does not affect the number of Markov chains, only how often results for each chain are saved. Defaults to 1.
num_walkers (int, optional) – The number of walkers to be used for the inference. Corresponds to the number of Markov chains. Defaults to 10.
num_steps (int, optional) – The number of steps to be used for the inference. Defaults to 2500.
num_burn_in_samples (int, optional) – The number of samples to be discarded as burn-in. Defaults to None, which means a burn-in of 10% of the total number of samples.
thinning_factor (int, optional) – The thinning factor for the samples. Defaults to None, which means no thinning.
get_walker_acceptance (bool, optional) – If True, the acceptance rate of the walkers is calculated and printed. Defaults to False.
- Returns:
The parameter samples, the corresponding simulation results, and the corresponding density evaluations for each slice, together with the result manager used for the inference.
- Return type:
Tuple[Dict[str, np.ndarray], Dict[str, np.ndarray], Dict[str, np.ndarray], ResultManager]
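A usage sketch of inference_mcmc follows; MyModel, my_data, my_transformation, and my_result_manager are placeholders for application-specific objects, and the param_dim attribute is assumed to expose the model's parameter dimension:

    import numpy as np
    from eulerpi.core.sampling import inference_mcmc

    model = MyModel()  # placeholder for a BaseModel subclass
    slices = [np.arange(model.param_dim)]  # one slice covering all parameters

    params, sim_results, densities, result_manager = inference_mcmc(
        model=model,
        data=my_data,
        data_transformation=my_transformation,
        result_manager=my_result_manager,
        slices=slices,
        num_processes=4,
        num_walkers=20,
        num_steps=1000,
    )

    # Results are returned as dictionaries keyed per slice.
    for slice_name, samples in params.items():
        print(slice_name, samples.shape)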
- run_emcee_once(model: BaseModel, data: ndarray, data_transformation: DataTransformation, data_stdevs: ndarray, slice: ndarray, initial_walker_positions: ndarray, num_walkers: int, num_steps: int, num_processes: int) ndarray [source]
Run the emcee ensemble sampler once.
- Parameters:
model (BaseModel) – The model which will be sampled
data (np.ndarray) – The data used for the inference
data_transformation (DataTransformation) – The data transformation used to normalize the data.
data_stdevs (np.ndarray) – The kernel widths (standard deviations) used for the data
slice (np.ndarray) – The slice of the parameter space which will be sampled
initial_walker_positions (np.ndarray) – The initial parameter values for the walkers
num_walkers (int) – The number of walkers in the ensemble sampler
num_steps (int) – The number of steps each walker performs before the sub-run is stored
num_processes (int) – The number of parallel processes
- Returns:
Samples from the transformed parameter density
- Return type:
np.ndarray
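A simplified stand-in for what a single sub-run looks like with emcee, assuming a log_density callable built from the model, data, transformation, and kernel widths; the function name and signature are illustrative, not eulerpi's exact internals:

    import emcee
    import numpy as np
    from multiprocessing import Pool

    def run_ensemble_once(log_density, initial_walker_positions: np.ndarray,
                          num_walkers: int, num_steps: int,
                          num_processes: int) -> np.ndarray:
        # One emcee sub-run with a process pool for parallel
        # density evaluations.
        num_dims = initial_walker_positions.shape[1]
        with Pool(processes=num_processes) as pool:
            sampler = emcee.EnsembleSampler(
                num_walkers, num_dims, log_density, pool=pool
            )
            sampler.run_mcmc(initial_walker_positions, num_steps)
        return sampler.get_chain()  # (num_steps, num_walkers, num_dims)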
- run_emcee_sampling(model: BaseModel, data: ndarray, data_transformation: DataTransformation, slice: ndarray, result_manager: ResultManager, num_processes: int, num_runs: int, num_walkers: int, num_steps: int, num_burn_in_samples: int, thinning_factor: int) Tuple[ndarray, ndarray, ndarray] [source]
- Create a representative sample from the transformed parameter density using the emcee ensemble sampler.
Initial values are not stored in the chain; each file contains num_steps blocks of size num_walkers.
- Parameters:
model (BaseModel) – The model which will be sampled
data (np.ndarray) – The data used for the inference
data_transformation (DataTransformation) – The data transformation used to normalize the data.
slice (np.ndarray) – slice of the parameter space which will be sampled
result_manager (ResultManager) – ResultManager which will store the results
num_processes (int) – The number of parallel processes.
num_runs (int) – The number of stored sub-runs.
num_walkers (int) – The number of walkers in the ensemble sampler.
num_steps (int) – The number of steps each walker performs before the sub-run is stored.
num_burn_in_samples (int) – The number of samples that will be discarded (burned) per chain (i.e. per walker). Only used for MCMC-based inference.
thinning_factor (int) – The thinning factor for the samples.
- Returns:
Array with all parameter samples, array with all corresponding simulation results, and array with the corresponding log-probability evaluations
- Return type:
Tuple[np.ndarray, np.ndarray, np.ndarray]
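The overall driver can be pictured as follows, reusing the run_ensemble_once sketch from above: sub-runs are chained so that each starts where the walkers previously stopped, and burn-in and thinning are applied per walker along the step axis. This is an assumption-laden sketch, not the stored-file logic of the ResultManager:

    import numpy as np

    def run_ensemble_sampling(log_density, initial_walker_positions,
                              num_runs, num_walkers, num_steps,
                              num_processes, num_burn_in_samples,
                              thinning_factor):
        current_positions = initial_walker_positions
        sub_run_chains = []
        for _ in range(num_runs):
            chain = run_ensemble_once(log_density, current_positions,
                                      num_walkers, num_steps, num_processes)
            current_positions = chain[-1]  # walkers continue where they stopped
            sub_run_chains.append(chain)

        # (num_runs * num_steps, num_walkers, num_dims)
        full_chain = np.concatenate(sub_run_chains, axis=0)
        # Burn-in and thinning act per walker along the step axis.
        kept = full_chain[num_burn_in_samples::thinning_factor]
        # Flatten into blocks of size num_walkers, matching the file layout.
        return kept.reshape(-1, kept.shape[-1])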