pymcmcstat package¶
pymcmcstat.MCMC module¶
Created on Wed Jan 17, 2018
@author: prmiles
Description: This module is intended to be the main class from which to run these Markov Chain Monte Carlo type simulations. The user will create an MCMC object, initialize data, simulation options, model settings and parameters.
-
class
pymcmcstat.MCMC.
MCMC
(rngseed=None, seterr={'over': 'ignore', 'under': 'ignore'})[source]¶ Bases:
object
Markov Chain Monte Carlo (MCMC) simulation object.
This class type is the primary feature of this Python package. All simulations are run through this class type, and for the most part the user will interact with an object of this type. The class initialization provides the option for setting the random seed, which can be very useful for testing the functionality of the code. It was found that setting the random seed at object initialization was the simplest interface.
- Args:
rngseed (
float
): Seed for numpy’s random number generator.
- Attributes:
data (
DataStructure
): MCMC data structure.simulation_options (
SimulationOptions
): MCMC simulation options.model_settings (
ModelSettings
): MCMC model settings.parameters (
ModelParameters
): MCMC model parameters.
-
display_current_mcmc_settings
()[source]¶ Display model settings, simulation options, and current covariance values.
Example display:
model settings: sos_function = <function test_ssfun at 0x1c13c5d400> model_function = None sigma2 = [1.] N = [100.] N0 = [0.] S20 = [1.] nsos = 1 nbatch = 1 simulation options: nsimu = 5000 adaptint = 100 ntry = 2 method = dram printint = 100 lastadapt = 5000 drscale = [5. 4. 3.] qcov = None covariance: qcov = [[0.01 0. ] [0. 0.0625]] R = [[0.1 0. ] [0. 0.25]] RDR = [array([[0.1 , 0. ], [0. , 0.25]]), array([[0.02, 0. ], [0. , 0.05]])] invR = [array([[10., 0.], [ 0., 4.]]), array([[50., 0.], [ 0., 20.]])] last_index_since_adaptation = 0 covchain = None
-
run_simulation
(use_previous_results=False)[source]¶ Run MCMC Simulation
Note
This is the method called by the user to run the simulation. The user must specify a data structure, setup simulation options, and define the model settings and parameters before calling this method.
- Args:
use_previous_results (
bool
): Flag to indicate whether simulation is being restarted.
-
pymcmcstat.MCMC.
print_rejection_statistics
(rejected, isimu, iiadapt, verbosity)[source]¶ Print Rejection Statistics.
Threshold for printing is verbosity greater than or equal to 2. If the rejection counters are as follows:
total: 144
in_adaptation_interval: 92
outside_bounds: 0
Then we would expect the following display at the 200th simulation with an adaptation interval of 100.
i:200 (72.0,92.0,0.0)
pymcmcstat.ParallelMCMC module¶
Created on Tue May 1 15:58:22 2018
@author: prmiles
-
class
pymcmcstat.ParallelMCMC.
ParallelMCMC
[source]¶ Bases:
object
Run Parallel MCMC Simulations.
- Attributes:
-
display_individual_chain_statistics
()[source]¶ Display chain statistics for different chains in parallel simulation.
Example display:
**************************************** Displaying results for chain 0 Files: <parallel_dir>/chain_0 --------------------- name : mean std MC_err tau geweke m : 1.9869 0.1295 0.0107 320.0997 0.9259 b : 3.0076 0.2489 0.0132 138.1260 0.9413 --------------------- **************************************** Displaying results for chain 1 Files: <parallel_dir>/chain_1 --------------------- name : mean std MC_err tau geweke m : 1.8945 0.4324 0.0982 2002.6361 0.3116 b : 3.2240 1.0484 0.2166 1734.0201 0.4161 ---------------------
-
run_parallel_simulation
()[source]¶ Run MCMC simulations in parallel.
The code is run in parallel by using
Pool
. While running, you can expect a display similar toProcessing: <parallel_dir>/chain_1 Processing: <parallel_dir>/chain_0 Processing: <parallel_dir>/chain_2
The simulation is complete when you see the run time displayed.
Parallel simulation run time: 16.15234899520874 sec
-
pymcmcstat.ParallelMCMC.
assign_number_of_cores
(num_cores=1)[source]¶ Assign number of cores to use in parallel process
-
pymcmcstat.ParallelMCMC.
check_directory
(directory)[source]¶ Check and make sure directory exists
- Args:
directory (
str
): Folder/directory path name.
-
pymcmcstat.ParallelMCMC.
check_for_restart_file
(json_restart_file, chain_dir)[source]¶ Check if restart directory was specified.
-
pymcmcstat.ParallelMCMC.
check_initial_values
(initial_values, num_chain, npar, low_lim, upp_lim)[source]¶ Check if initial values satisfy requirements.
- Args:
- Returns:
-
pymcmcstat.ParallelMCMC.
check_options_output
(options)[source]¶ Check output settings defined in options
- Args:
options (
SimulationOptions
): MCMC simulation options.
- Returns:
options (
SimulationOptions
): MCMC simulation options with at least binary save flag set to True.
-
pymcmcstat.ParallelMCMC.
check_shape_of_users_initial_values
(initial_values, num_chain, npar)[source]¶ Check shape of users initial values
- Args:
- Returns:
num_chain (
int
): Number of sampling chains to be generated - equal to number of rows in initial values array.initial_values
-
pymcmcstat.ParallelMCMC.
check_users_initial_values_wrt_limits
(initial_values, low_lim, upp_lim)[source]¶ Check users initial values wrt parameter limits
- Args:
- Returns:
initial_values (
ndarray
): Array of initial parameter values - expect [num_chain,npar]
- Raises:
SystemExit if initial values are outside parameter bounds.
-
pymcmcstat.ParallelMCMC.
generate_initial_values
(num_chain, npar, low_lim, upp_lim)[source]¶ Generate initial values by sampling from uniform distribution between limits
-
pymcmcstat.ParallelMCMC.
get_parameter_features
(parameters)[source]¶ Get features of model parameters.
-
pymcmcstat.ParallelMCMC.
load_parallel_simulation_results
(savedir, extension='h5', chainfile='chainfile', sschainfile='sschainfile', s2chainfile='s2chainfile', covchainfile='covchainfile')[source]¶ Load results from parallel simulation directory json files.
Lists in json files are converted to numpy arrays.
-
pymcmcstat.ParallelMCMC.
run_serial_simulation
(mcstat)[source]¶ Run serial MCMC simulation
- Args:
mcstat (
MCMC.MCMC
): MCMC object.
- Returns:
results (
dict
): Results dictionary for serial simulation.
-
pymcmcstat.ParallelMCMC.
unpack_mcmc_set
(mcset)[source]¶ Unpack attributes of MCMC object.
- Args:
mcset (
MCMC
): MCMC object.
- Returns:
data (
DataStructure
): MCMC data structure.options (
SimulationOptions
): MCMC simulation options.model (
ModelSettings
): MCMC model settings.parameters (
ModelParameters
): MCMC model parameters.
pymcmcstat.propagation module¶
Created on Wed Nov 8 12:00:11 2017
@author: prmiles
-
pymcmcstat.propagation.
calculate_intervals
(chain, results, data, model, s2chain=None, nsample=500, waitbar=True, sstype=0)[source]¶ Calculate distribution of model response to form propagation intervals
Samples values from chain, performs forward model evaluation, and tabulates credible and prediction intervals (if obs. error var. included).
- Args:
chain (
ndarray
): Parameter chains, expect shape=(nsimu, npar).results (
dict
): Results dictionary generated by pymcmcstat.data (
DataStructure
): Datamodel: User defined function. Note, if your model outputs multiple quantities of interest (QoI) at the same time in a multi-dimensional array, then make sure it is returned as a (N, p) array where N is the number of evaluation points and p is the number of QoI.
- Kwargs:
- Returns:
dict
with two elements: 1) credible and 2) prediction
-
pymcmcstat.propagation.
define_sample_points
(nsample, nsimu)[source]¶ Define indices to sample from posteriors.
-
pymcmcstat.propagation.
generate_quantiles
(x, p=array([0.25, 0.5, 0.75]))[source]¶ Calculate empirical quantiles.
-
pymcmcstat.propagation.
observation_sample
(s2, y, sstype)[source]¶ Calculate model response with observation errors.
-
pymcmcstat.propagation.
plot_3d_intervals
(intervals, time, ydata=None, xdata=None, limits=[95], adddata=False, addlegend=True, addmodel=True, figsize=None, model_display={}, data_display={}, interval_display={}, addcredible=True, addprediction=True, fig=None, legloc='upper left', ciset=None, piset=None, return_settings=False)[source]¶ Plot propagation intervals in 3-D
This routine takes the model distributions generated using the
calculate_intervals()
method and then plots specific quantiles. The user can plot just the intervals, or also include the median model response and/or observations. Specific settings for credible intervals are controlled by defining the ciset dictionary. Likewise, for prediction intervals, settings are defined using piset.- The setting options available for each interval are as follows:
limits: This should be a list of numbers between 0 and 100, e.g., limits=[50, 90] will result in 50% and 90% intervals.
cmap: The program is designed to “try” to choose colors that are visually distinct. The user can specify the colormap to choose from.
colors: The user can specify the color they would like for each interval in a list, e.g., [‘r’, ‘g’, ‘b’]. This list should have the same number of elements as limits or the code will revert back to its default behavior.
- Args:
intervals (
dict
): Interval dictionary generated usingcalculate_intervals()
method.time (
ndarray
): Independent variable, i.e., x- and y-axes of plot. Note, it must be a 2-D array with shape=(N, 2), where N is the number of evaluation points.
- Kwargs:
ydata (
ndarray
or None): Observations, expect 1-D array if defined.xdata (
ndarray
or None): Independent values corresponding to observations. This is required if the observations do not align with your times of generating the model response.limits (
list
): Quantile limits that correspond to percentage size of desired intervals. Note, this is the default limits, but specific limits can be defined using the ciset and piset dictionaries.adddata (
bool
): Flag to include dataaddmodel (
bool
): Flag to include median model responseaddlegend (
bool
): Flag to include legendaddcredible (
bool
): Flag to include credible intervalsaddprediction (
bool
): Flag to include prediction intervalsmodel_display (
dict
): Display settings for median model responsedata_display (
dict
): Display settings for datainterval_display (
dict
): General display settings for intervals.fig: Handle of previously created figure object
figsize (
tuple
): (width, height) in incheslegloc (
str
): Legend location - matplotlib help for details.ciset (
dict
): Settings for credible intervalspiset (
dict
): Settings for prediction intervalsreturn_settings (
bool
): Flag to return ciset and piset along with fig and ax.
- Returns:
- (
tuple
) with elements Figure handle
Axes handle
Dictionary with ciset and piset inside (only outputted if return_settings=True)
- (
-
pymcmcstat.propagation.
plot_intervals
(intervals, time, ydata=None, xdata=None, limits=[95], adddata=None, addmodel=True, addlegend=True, addcredible=True, addprediction=True, data_display={}, model_display={}, interval_display={}, fig=None, figsize=None, legloc='upper left', ciset=None, piset=None, return_settings=False)[source]¶ Plot propagation intervals in 2-D
This routine takes the model distributions generated using the
calculate_intervals()
method and then plots specific quantiles. The user can plot just the intervals, or also include the median model response and/or observations. Specific settings for credible intervals are controlled by defining the ciset dictionary. Likewise, for prediction intervals, settings are defined using piset.- The setting options available for each interval are as follows:
limits: This should be a list of numbers between 0 and 100, e.g., limits=[50, 90] will result in 50% and 90% intervals.
cmap: The program is designed to “try” to choose colors that are visually distinct. The user can specify the colormap to choose from.
colors: The user can specify the color they would like for each interval in a list, e.g., [‘r’, ‘g’, ‘b’]. This list should have the same number of elements as limits or the code will revert back to its default behavior.
- Args:
intervals (
dict
): Interval dictionary generated usingcalculate_intervals()
method.time (
ndarray
): Independent variable, i.e., x-axis of plot
- Kwargs:
ydata (
ndarray
or None): Observations, expect 1-D array if defined.xdata (
ndarray
or None): Independent values corresponding to observations. This is required if the observations do not align with your times of generating the model response.limits (
list
): Quantile limits that correspond to percentage size of desired intervals. Note, this is the default limits, but specific limits can be defined using the ciset and piset dictionaries.adddata (
bool
): Flag to include dataaddmodel (
bool
): Flag to include median model responseaddlegend (
bool
): Flag to include legendaddcredible (
bool
): Flag to include credible intervalsaddprediction (
bool
): Flag to include prediction intervalsmodel_display (
dict
): Display settings for median model responsedata_display (
dict
): Display settings for datainterval_display (
dict
): General display settings for intervals.fig: Handle of previously created figure object
figsize (
tuple
): (width, height) in incheslegloc (
str
): Legend location - matplotlib help for details.ciset (
dict
): Settings for credible intervalspiset (
dict
): Settings for prediction intervalsreturn_settings (
bool
): Flag to return ciset and piset along with fig and ax.
- Returns:
- (
tuple
) with elements Figure handle
Axes handle
Dictionary with ciset and piset inside (only outputted if return_settings=True)
- (
-
pymcmcstat.propagation.
setup_display_settings
(interval_display, model_display, data_display)[source]¶ Compare user defined display settings with defaults and merge.
- Args:
- Returns:
-
pymcmcstat.propagation.
setup_interval_colors
(iset, inttype='CI')[source]¶ Setup colors for empirical intervals
This routine attempts to distribute the color of the UQ intervals based on a normalize color map. Or, it will assign user-defined colors; however, this only happens if the correct number of colors are specified.