pymcmcstat¶
The pymcmcstat package is a Python program for running Markov Chain Monte Carlo (MCMC) simulations. Included in this package is the ability to use different Metropolis based sampling techniques:
Metropolis-Hastings (MH): Primary sampling method.
Adaptive-Metropolis (AM): Adapts covariance matrix at specified intervals.
Delayed-Rejection (DR): Delays rejection by sampling from a narrower distribution. Capable of n-stage delayed rejection.
Delayed Rejection Adaptive Metropolis (DRAM): DR + AM
This package is an adaptation of the MATLAB toolbox mcmcstat. The user interface is designed to be as similar to the MATLAB version as possible, but this implementation has taken advantage of certain data structure concepts more amenable to Python.
Note, advanced plotting routines are available in the mcmcplot package. Many plotting features are directly available within pymcmcstat, but some user’s may find mcmcplot useful.
Installation¶
This code can be found on the Github project page. This package is available on the PyPI distribution site and the latest version can be installed via
pip install pymcmcstat
The master branch on Github typically matches the latest version on the PyPI distribution site. To install the master branch directly from Github,
pip install git+https://github.com/prmiles/pymcmcstat.git
You can also clone the repository and run python setup.py install
.
Getting Started¶
Contributors¶
See the GitHub contributor page
Citing pymcmcstat¶
Miles, (2019). pymcmcstat: A Python Package for Bayesian Inference Using Delayed Rejection Adaptive Metropolis. Journal of Open Source Software, 4(38), 1417, https://doi.org/10.21105/joss.01417
Also, please cite the appropriate Zenodo archive for the version of pymcmcstat that you are using.
Feedback¶
Sponsor¶
This work was sponsored in part by the NNSA Office of Defense Nuclear Nonproliferation R&D through the Consortium for Nonproliferation Enabling Capabilities.
Contents:¶
pymcmcstat package¶
pymcmcstat.MCMC module¶
Created on Wed Jan 17, 2018
@author: prmiles
Description: This module is intended to be the main class from which to run these Markov Chain Monte Carlo type simulations. The user will create an MCMC object, initialize data, simulation options, model settings and parameters.
-
class
pymcmcstat.MCMC.
MCMC
(rngseed=None, seterr={'over': 'ignore', 'under': 'ignore'})[source]¶ Bases:
object
Markov Chain Monte Carlo (MCMC) simulation object.
This class type is the primary feature of this Python package. All simulations are run through this class type, and for the most part the user will interact with an object of this type. The class initialization provides the option for setting the random seed, which can be very useful for testing the functionality of the code. It was found that setting the random seed at object initialization was the simplest interface.
- Args:
rngseed (
float
): Seed for numpy’s random number generator.
- Attributes:
data (
DataStructure
): MCMC data structure.simulation_options (
SimulationOptions
): MCMC simulation options.model_settings (
ModelSettings
): MCMC model settings.parameters (
ModelParameters
): MCMC model parameters.
-
display_current_mcmc_settings
()[source]¶ Display model settings, simulation options, and current covariance values.
Example display:
model settings: sos_function = <function test_ssfun at 0x1c13c5d400> model_function = None sigma2 = [1.] N = [100.] N0 = [0.] S20 = [1.] nsos = 1 nbatch = 1 simulation options: nsimu = 5000 adaptint = 100 ntry = 2 method = dram printint = 100 lastadapt = 5000 drscale = [5. 4. 3.] qcov = None covariance: qcov = [[0.01 0. ] [0. 0.0625]] R = [[0.1 0. ] [0. 0.25]] RDR = [array([[0.1 , 0. ], [0. , 0.25]]), array([[0.02, 0. ], [0. , 0.05]])] invR = [array([[10., 0.], [ 0., 4.]]), array([[50., 0.], [ 0., 20.]])] last_index_since_adaptation = 0 covchain = None
-
run_simulation
(use_previous_results=False)[source]¶ Run MCMC Simulation
Note
This is the method called by the user to run the simulation. The user must specify a data structure, setup simulation options, and define the model settings and parameters before calling this method.
- Args:
use_previous_results (
bool
): Flag to indicate whether simulation is being restarted.
-
pymcmcstat.MCMC.
print_rejection_statistics
(rejected, isimu, iiadapt, verbosity)[source]¶ Print Rejection Statistics.
Threshold for printing is verbosity greater than or equal to 2. If the rejection counters are as follows:
total: 144
in_adaptation_interval: 92
outside_bounds: 0
Then we would expect the following display at the 200th simulation with an adaptation interval of 100.
i:200 (72.0,92.0,0.0)
pymcmcstat.ParallelMCMC module¶
Created on Tue May 1 15:58:22 2018
@author: prmiles
-
class
pymcmcstat.ParallelMCMC.
ParallelMCMC
[source]¶ Bases:
object
Run Parallel MCMC Simulations.
- Attributes:
-
display_individual_chain_statistics
()[source]¶ Display chain statistics for different chains in parallel simulation.
Example display:
**************************************** Displaying results for chain 0 Files: <parallel_dir>/chain_0 --------------------- name : mean std MC_err tau geweke m : 1.9869 0.1295 0.0107 320.0997 0.9259 b : 3.0076 0.2489 0.0132 138.1260 0.9413 --------------------- **************************************** Displaying results for chain 1 Files: <parallel_dir>/chain_1 --------------------- name : mean std MC_err tau geweke m : 1.8945 0.4324 0.0982 2002.6361 0.3116 b : 3.2240 1.0484 0.2166 1734.0201 0.4161 ---------------------
-
run_parallel_simulation
()[source]¶ Run MCMC simulations in parallel.
The code is run in parallel by using
Pool
. While running, you can expect a display similar toProcessing: <parallel_dir>/chain_1 Processing: <parallel_dir>/chain_0 Processing: <parallel_dir>/chain_2
The simulation is complete when you see the run time displayed.
Parallel simulation run time: 16.15234899520874 sec
-
pymcmcstat.ParallelMCMC.
assign_number_of_cores
(num_cores=1)[source]¶ Assign number of cores to use in parallel process
-
pymcmcstat.ParallelMCMC.
check_directory
(directory)[source]¶ Check and make sure directory exists
- Args:
directory (
str
): Folder/directory path name.
-
pymcmcstat.ParallelMCMC.
check_for_restart_file
(json_restart_file, chain_dir)[source]¶ Check if restart directory was specified.
-
pymcmcstat.ParallelMCMC.
check_initial_values
(initial_values, num_chain, npar, low_lim, upp_lim)[source]¶ Check if initial values satisfy requirements.
- Args:
- Returns:
-
pymcmcstat.ParallelMCMC.
check_options_output
(options)[source]¶ Check output settings defined in options
- Args:
options (
SimulationOptions
): MCMC simulation options.
- Returns:
options (
SimulationOptions
): MCMC simulation options with at least binary save flag set to True.
-
pymcmcstat.ParallelMCMC.
check_shape_of_users_initial_values
(initial_values, num_chain, npar)[source]¶ Check shape of users initial values
- Args:
- Returns:
num_chain (
int
): Number of sampling chains to be generated - equal to number of rows in initial values array.initial_values
-
pymcmcstat.ParallelMCMC.
check_users_initial_values_wrt_limits
(initial_values, low_lim, upp_lim)[source]¶ Check users initial values wrt parameter limits
- Args:
- Returns:
initial_values (
ndarray
): Array of initial parameter values - expect [num_chain,npar]
- Raises:
SystemExit if initial values are outside parameter bounds.
-
pymcmcstat.ParallelMCMC.
generate_initial_values
(num_chain, npar, low_lim, upp_lim)[source]¶ Generate initial values by sampling from uniform distribution between limits
-
pymcmcstat.ParallelMCMC.
get_parameter_features
(parameters)[source]¶ Get features of model parameters.
-
pymcmcstat.ParallelMCMC.
load_parallel_simulation_results
(savedir, extension='h5', chainfile='chainfile', sschainfile='sschainfile', s2chainfile='s2chainfile', covchainfile='covchainfile')[source]¶ Load results from parallel simulation directory json files.
Lists in json files are converted to numpy arrays.
-
pymcmcstat.ParallelMCMC.
run_serial_simulation
(mcstat)[source]¶ Run serial MCMC simulation
- Args:
mcstat (
MCMC.MCMC
): MCMC object.
- Returns:
results (
dict
): Results dictionary for serial simulation.
-
pymcmcstat.ParallelMCMC.
unpack_mcmc_set
(mcset)[source]¶ Unpack attributes of MCMC object.
- Args:
mcset (
MCMC
): MCMC object.
- Returns:
data (
DataStructure
): MCMC data structure.options (
SimulationOptions
): MCMC simulation options.model (
ModelSettings
): MCMC model settings.parameters (
ModelParameters
): MCMC model parameters.
pymcmcstat.propagation module¶
Created on Wed Nov 8 12:00:11 2017
@author: prmiles
-
pymcmcstat.propagation.
calculate_intervals
(chain, results, data, model, s2chain=None, nsample=500, waitbar=True, sstype=0)[source]¶ Calculate distribution of model response to form propagation intervals
Samples values from chain, performs forward model evaluation, and tabulates credible and prediction intervals (if obs. error var. included).
- Args:
chain (
ndarray
): Parameter chains, expect shape=(nsimu, npar).results (
dict
): Results dictionary generated by pymcmcstat.data (
DataStructure
): Datamodel: User defined function. Note, if your model outputs multiple quantities of interest (QoI) at the same time in a multi-dimensional array, then make sure it is returned as a (N, p) array where N is the number of evaluation points and p is the number of QoI.
- Kwargs:
- Returns:
dict
with two elements: 1) credible and 2) prediction
-
pymcmcstat.propagation.
define_sample_points
(nsample, nsimu)[source]¶ Define indices to sample from posteriors.
-
pymcmcstat.propagation.
generate_quantiles
(x, p=array([0.25, 0.5, 0.75]))[source]¶ Calculate empirical quantiles.
-
pymcmcstat.propagation.
observation_sample
(s2, y, sstype)[source]¶ Calculate model response with observation errors.
-
pymcmcstat.propagation.
plot_3d_intervals
(intervals, time, ydata=None, xdata=None, limits=[95], adddata=False, addlegend=True, addmodel=True, figsize=None, model_display={}, data_display={}, interval_display={}, addcredible=True, addprediction=True, fig=None, legloc='upper left', ciset=None, piset=None, return_settings=False)[source]¶ Plot propagation intervals in 3-D
This routine takes the model distributions generated using the
calculate_intervals()
method and then plots specific quantiles. The user can plot just the intervals, or also include the median model response and/or observations. Specific settings for credible intervals are controlled by defining the ciset dictionary. Likewise, for prediction intervals, settings are defined using piset.- The setting options available for each interval are as follows:
limits: This should be a list of numbers between 0 and 100, e.g., limits=[50, 90] will result in 50% and 90% intervals.
cmap: The program is designed to “try” to choose colors that are visually distinct. The user can specify the colormap to choose from.
colors: The user can specify the color they would like for each interval in a list, e.g., [‘r’, ‘g’, ‘b’]. This list should have the same number of elements as limits or the code will revert back to its default behavior.
- Args:
intervals (
dict
): Interval dictionary generated usingcalculate_intervals()
method.time (
ndarray
): Independent variable, i.e., x- and y-axes of plot. Note, it must be a 2-D array with shape=(N, 2), where N is the number of evaluation points.
- Kwargs:
ydata (
ndarray
or None): Observations, expect 1-D array if defined.xdata (
ndarray
or None): Independent values corresponding to observations. This is required if the observations do not align with your times of generating the model response.limits (
list
): Quantile limits that correspond to percentage size of desired intervals. Note, this is the default limits, but specific limits can be defined using the ciset and piset dictionaries.adddata (
bool
): Flag to include dataaddmodel (
bool
): Flag to include median model responseaddlegend (
bool
): Flag to include legendaddcredible (
bool
): Flag to include credible intervalsaddprediction (
bool
): Flag to include prediction intervalsmodel_display (
dict
): Display settings for median model responsedata_display (
dict
): Display settings for datainterval_display (
dict
): General display settings for intervals.fig: Handle of previously created figure object
figsize (
tuple
): (width, height) in incheslegloc (
str
): Legend location - matplotlib help for details.ciset (
dict
): Settings for credible intervalspiset (
dict
): Settings for prediction intervalsreturn_settings (
bool
): Flag to return ciset and piset along with fig and ax.
- Returns:
- (
tuple
) with elements Figure handle
Axes handle
Dictionary with ciset and piset inside (only outputted if return_settings=True)
- (
-
pymcmcstat.propagation.
plot_intervals
(intervals, time, ydata=None, xdata=None, limits=[95], adddata=None, addmodel=True, addlegend=True, addcredible=True, addprediction=True, data_display={}, model_display={}, interval_display={}, fig=None, figsize=None, legloc='upper left', ciset=None, piset=None, return_settings=False)[source]¶ Plot propagation intervals in 2-D
This routine takes the model distributions generated using the
calculate_intervals()
method and then plots specific quantiles. The user can plot just the intervals, or also include the median model response and/or observations. Specific settings for credible intervals are controlled by defining the ciset dictionary. Likewise, for prediction intervals, settings are defined using piset.- The setting options available for each interval are as follows:
limits: This should be a list of numbers between 0 and 100, e.g., limits=[50, 90] will result in 50% and 90% intervals.
cmap: The program is designed to “try” to choose colors that are visually distinct. The user can specify the colormap to choose from.
colors: The user can specify the color they would like for each interval in a list, e.g., [‘r’, ‘g’, ‘b’]. This list should have the same number of elements as limits or the code will revert back to its default behavior.
- Args:
intervals (
dict
): Interval dictionary generated usingcalculate_intervals()
method.time (
ndarray
): Independent variable, i.e., x-axis of plot
- Kwargs:
ydata (
ndarray
or None): Observations, expect 1-D array if defined.xdata (
ndarray
or None): Independent values corresponding to observations. This is required if the observations do not align with your times of generating the model response.limits (
list
): Quantile limits that correspond to percentage size of desired intervals. Note, this is the default limits, but specific limits can be defined using the ciset and piset dictionaries.adddata (
bool
): Flag to include dataaddmodel (
bool
): Flag to include median model responseaddlegend (
bool
): Flag to include legendaddcredible (
bool
): Flag to include credible intervalsaddprediction (
bool
): Flag to include prediction intervalsmodel_display (
dict
): Display settings for median model responsedata_display (
dict
): Display settings for datainterval_display (
dict
): General display settings for intervals.fig: Handle of previously created figure object
figsize (
tuple
): (width, height) in incheslegloc (
str
): Legend location - matplotlib help for details.ciset (
dict
): Settings for credible intervalspiset (
dict
): Settings for prediction intervalsreturn_settings (
bool
): Flag to return ciset and piset along with fig and ax.
- Returns:
- (
tuple
) with elements Figure handle
Axes handle
Dictionary with ciset and piset inside (only outputted if return_settings=True)
- (
-
pymcmcstat.propagation.
setup_display_settings
(interval_display, model_display, data_display)[source]¶ Compare user defined display settings with defaults and merge.
- Args:
- Returns:
-
pymcmcstat.propagation.
setup_interval_colors
(iset, inttype='CI')[source]¶ Setup colors for empirical intervals
This routine attempts to distribute the color of the UQ intervals based on a normalize color map. Or, it will assign user-defined colors; however, this only happens if the correct number of colors are specified.
Subpackages¶
pymcmcstat.chain package¶
pymcmcstat.chain.ChainProcessing module¶
Created on Tue May 1 09:12:06 2018
@author: prmiles
-
pymcmcstat.chain.ChainProcessing.
check_parallel_directory_contents
(parallel_dir, cf_orig)[source]¶ Check that items in directory are subdirectories with name “chain_#”
-
pymcmcstat.chain.ChainProcessing.
generate_chain_list
(pres, burnin_percentage=50)[source]¶ Generate list of chains.
-
pymcmcstat.chain.ChainProcessing.
generate_combined_chain_with_index
(pres, burnin_percentage=50)[source]¶ Generate combined chain with index.
-
pymcmcstat.chain.ChainProcessing.
load_json_object
(filename)[source]¶ Load object stored in json file.
Note
Filename should include extension.
-
pymcmcstat.chain.ChainProcessing.
load_parallel_simulation_results
(savedir, extension='h5', chainfile='chainfile', sschainfile='sschainfile', s2chainfile='s2chainfile', covchainfile='covchainfile')[source]¶ Load results from parallel simulation directory json files.
Lists in json files are converted to numpy arrays.
-
pymcmcstat.chain.ChainProcessing.
load_serial_simulation_results
(savedir, json_file=None, extension='h5', chainfile='chainfile', sschainfile='sschainfile', s2chainfile='s2chainfile', covchainfile='covchainfile')[source]¶ Load results from serial simulation directory json files.
Lists in json files are converted to numpy arrays.
-
pymcmcstat.chain.ChainProcessing.
print_log_files
(savedir)[source]¶ Print log files to screen.
- Args:
savedir (
str
): Directory where log files are saved.
The output display will include a date/time stamp, as well as indices of the chain that were saved during that export sequence.
Example display:
-------------------------- Display log file: <savedir>/binlogfile.txt 2018-05-03 14:15:54 0 999 2018-05-03 14:15:54 1000 1999 2018-05-03 14:15:55 2000 2999 2018-05-03 14:15:55 3000 3999 2018-05-03 14:15:55 4000 4999 --------------------------
-
pymcmcstat.chain.ChainProcessing.
read_in_bin_file
(filename)[source]¶ Read in information from file containing binary data.
If file exists, it will read in the array elements. Otherwise, it will return and empty list.
-
pymcmcstat.chain.ChainProcessing.
read_in_parallel_json_results_files
(parallel_dir)[source]¶ Read in json results files from directory containing results from parallel MCMC simulation.
- Args:
parallel_dir (
str
): Directory where parallel log files are saved.
-
pymcmcstat.chain.ChainProcessing.
read_in_parallel_savedir_files
(parallel_dir, extension='h5', chainfile='chainfile', sschainfile='sschainfile', s2chainfile='s2chainfile', covchainfile='covchainfile')[source]¶ Read in log files from directory containing results from parallel MCMC simulation.
-
pymcmcstat.chain.ChainProcessing.
read_in_savedir_files
(savedir, extension='h5', chainfile='chainfile', sschainfile='sschainfile', s2chainfile='s2chainfile', covchainfile='covchainfile')[source]¶ Read in log files from directory.
pymcmcstat.chain.ChainStatistics module¶
Created on Thu Apr 26 10:23:51 2018
@author: prmiles
-
pymcmcstat.chain.ChainStatistics.
batch_mean_standard_deviation
(chain, b=None)[source]¶ Standard deviation calculated from batch means
-
pymcmcstat.chain.ChainStatistics.
calculate_psrf
(x, nsimu, nchains)[source]¶ Calculate Potential Scale Reduction Factor (PSRF)
Performs analysis of variances for set of chains corresponding to a single parameter. This code follows the MATLAB implementation found here:
-
pymcmcstat.chain.ChainStatistics.
chainstats
(chain=None, results=None, returnstats=False, display_details=False)[source]¶ Calculate chain statistics.
-
pymcmcstat.chain.ChainStatistics.
display_gelman_rubin
(psrf)[source]¶ Display results of Gelman-Rubin diagnostic
- Args:
psrf (
dict
): Results from GR diagnostic
-
pymcmcstat.chain.ChainStatistics.
gelman_rubin
(chains, names=None, results=None, display=True)[source]¶ Gelman-Rubin diagnostic for multiple chains [GR+92], [BG98].
This diagnostic technique compares the variance within a single change to the variance between multiple chains. This process serves as a method for testing whether or not the chain has converged. If the chain has converged, we would expect the variance within and the variance between to be equal. This diagnostic tool pairs well with the ParallelMCMC module, which generates a set of distinct chains that have all been initialized at different points within the parameter space.
- Args:
- Returns:
(
dict
): Keywords of the dictionary correspond to the parameter names. Each keyword corresponds to a dictionary outputted fromcalculate_psrf()
.
-
pymcmcstat.chain.ChainStatistics.
get_parameter_names
(nparam, results)[source]¶ Get parameter names from results dictionary.
If no results found, then default names are generated. If some results are found, then an extended set is generated to complete the list requirement. Uses the functions:
generate_default_names()
andextend_names_to_match_nparam()
-
pymcmcstat.chain.ChainStatistics.
geweke
(chain, a=0.1, b=0.5)[source]¶ Geweke’s MCMC convergence diagnostic
Test for equality of the means of the first a% (default 10%) and last b% (50%) of a Markov chain - see [BR98].
- Args:
- Returns:
Note
The percentage of the chain should be given as a decimal between zero and one. So, for the first 10% of the chain, define
a = 0.1
. Likewise, for the last 50% of the chain, defineb = 0.5
.
-
pymcmcstat.chain.ChainStatistics.
integrated_autocorrelation_time
(chain)[source]¶ Estimates the integrated autocorrelation time using Sokal’s adaptive truncated periodogram estimator.
-
pymcmcstat.chain.ChainStatistics.
power_spectral_density_using_hanning_window
(x, nfft=None, nw=None)[source]¶ Power spectral density using Hanning window.
-
pymcmcstat.chain.ChainStatistics.
print_chain_acceptance_info
(chain, results=None)[source]¶ Print chain acceptance rate(s)
If results structure is provided, it will try to print acceptance rates with respect to delayed rejection as well.
Example display (if results dictionary provided):
------------------------------ Acceptance rate information --------------- Results dictionary: Stage 1: 3.32% Stage 2: 22.60% Net : 25.92% -> 1296/5000 --------------- Chain provided: Net : 32.10% -> 963/3000 --------------- Note, the net acceptance rate from the results dictionary may be different if you only provided a subset of the chain, e.g., removed the first part for burnin-in. ------------------------------
-
pymcmcstat.chain.ChainStatistics.
print_chain_statistics
(names, meanii, stdii, mcerr, tau, p)[source]¶ Print chain statistics to terminal window.
- Args:
Example display:
--------------------- name : mean std MC_err tau geweke $p_{0}$ : 1.9680 0.0319 0.0013 36.3279 0.9979 $p_{1}$ : 3.0818 0.0803 0.0035 37.1669 0.9961 ---------------------
pymcmcstat.plotting package¶
pymcmcstat.plotting.MCMCPlotting module¶
Created on Wed Jan 31 12:54:16 2018
@author: prmiles
-
class
pymcmcstat.plotting.MCMCPlotting.
Plot
[source]¶ Bases:
object
Plotting routines for analyzing sampling chains from MCMC process.
-
pymcmcstat.plotting.MCMCPlotting.
plot_chain_metrics
(chain, name=None, figsizeinches=None)[source]¶ Plot chain metrics for individual chain
Scatter plot of chain
Histogram of chain
-
pymcmcstat.plotting.MCMCPlotting.
plot_chain_panel
(chains, names=None, figsizeinches=None, skip=1, maxpoints=500)[source]¶ Plot sampling chain for each parameter
- Args:
chains (
ndarray
): Sampling chain for each parameternames (
list
): List of strings - name of each parameterfigsizeinches (
list
): Specify figure size in inches [Width, Height]skip (
int
): Indicates step size to be used when plotting elements from the chainmaxpoints (
int
): Max number of display points - keeps scatter plot from becoming overcrowded
-
pymcmcstat.plotting.MCMCPlotting.
plot_density_panel
(chains, names=None, hist_on=False, figsizeinches=None, return_kde=False)[source]¶ Plot marginal posterior densities
-
pymcmcstat.plotting.MCMCPlotting.
plot_histogram_panel
(chains, names=None, figsizeinches=None)[source]¶ Plot histogram from each parameter’s sampling history
-
pymcmcstat.plotting.MCMCPlotting.
plot_pairwise_correlation_panel
(chains, names=None, figsizeinches=None, skip=1, maxpoints=500)[source]¶ Plot pairwise correlation for each parameter
- Args:
chains (
ndarray
): Sampling chain for each parameternames (
list
): List of strings - name of each parameterfigsizeinches (
list
): Specify figure size in inches [Width, Height]skip (
int
): Indicates step size to be used when plotting elements from the chainmaxpoints (py:class:int): Maximum allowable number of points in plot.
pymcmcstat.plotting.PredictionIntervals module¶
Created on Wed Nov 8 12:00:11 2017
@author: prmiles
-
class
pymcmcstat.plotting.PredictionIntervals.
PredictionIntervals
[source]¶ Bases:
object
Prediction/Credible interval methods.
- Attributes:
-
generate_prediction_intervals
(sstype=None, nsample=500, calc_pred_int=True, waitbar=False)[source]¶ Generate prediction/credible interval.
-
plot_prediction_intervals
(plot_pred_int=True, adddata=False, addlegend=True, figsizeinches=None, model_display={}, data_display={}, interval_display={})[source]¶ Plot prediction/credible intervals.
- Args:
plot_pred_int (
bool
): Flag to include PI on plot.adddata (
bool
): Flag to include data on plot.addlegend (
bool
): Flag to include legend on plot.figsizeinches (
list
): Specify figure size in inches [Width, Height].model_display (
dict
): Model display settings.data_display (
dict
): Data display settings.interval_display (
dict
): Interval display settings.
- Available display options (defaults in parantheses):
model_display: linestyle (
'-'
), marker (''
), color ('r'
), linewidth (2
), markersize (5
), label (model
), alpha (1.0
)data_display: linestyle (
''
), marker ('.'
), color ('b'
), linewidth (1
), markersize (5
), label (data
), alpha (1.0
)data_display: linestyle (
':'
), linewidth (1
), alpha (1.0
), edgecolor ('k'
)
-
setup_prediction_interval_calculation
(results, data, modelfunction, burnin=0)[source]¶ Setup calculation for prediction interval generation
- Args:
results (
ResultsStructure
): MCMC results structuredata (
DataStructure
): MCMC data structuremodelfunction: Model function handle
pymcmcstat.plotting.utilities module¶
Created on Mon May 14 06:24:12 2018
@author: prmiles
-
pymcmcstat.plotting.utilities.
append_to_nrow_ncol_based_on_shape
(sh, nrow, ncol)[source]¶ Append to list based on shape of array
-
pymcmcstat.plotting.utilities.
check_settings
(default_settings, user_settings=None)[source]¶ Check user settings with default.
Recursively checks elements of user settings against the defaults and updates settings as it goes. If a user setting does not exist in the default, then the user setting is added to the settings. If the setting is defined in both the user and default settings, then the user setting overrides the default. Otherwise, the default settings persist.
-
pymcmcstat.plotting.utilities.
check_symmetric
(a, tol=1e-08)[source]¶ Check if array is symmetric by comparing with transpose.
-
pymcmcstat.plotting.utilities.
convert_flag_to_boolean
(flag)[source]¶ Convert flag to boolean for backwards compatibility.
-
pymcmcstat.plotting.utilities.
empirical_quantiles
(x, p=array([0.25, 0.5, 0.75]))[source]¶ Calculate empirical quantiles.
-
pymcmcstat.plotting.utilities.
extend_names_to_match_nparam
(names, nparam)[source]¶ Append names to list using default convention until length of names matches number of parameters.
For example, if names = [‘name_1’, ‘name_2’] and nparam = 4, then two additional names will be appended to the names list. E.g.,:
names = ['name_1', 'name_2', 'p_{2}', 'p_{3}']
-
pymcmcstat.plotting.utilities.
gaussian_density_function
(x, mu=0, sigma2=1)[source]¶ Standard normal/Gaussian density function.
-
pymcmcstat.plotting.utilities.
generate_default_names
(nparam)[source]¶ Generate generic parameter name set.
For example, if nparam = 4, then the generated names are:
names = ['p_{0}', 'p_{1}', 'p_{2}', 'p_{3}']
-
pymcmcstat.plotting.utilities.
generate_ellipse
(mu, cmat, ndp=100)[source]¶ Generates points for a probability contour ellipse
-
pymcmcstat.plotting.utilities.
generate_names
(nparam, names)[source]¶ Generate parameter name set.
For example, if nparam = 4, then the generated names are:
names = ['p_{0}', 'p_{1}', 'p_{2}', 'p_{3}']
-
pymcmcstat.plotting.utilities.
generate_subplot_grid
(nparam=2)[source]¶ Generate subplot grid.
For example, if nparam = 2, then the subplot will have 2 rows and 1 column.
-
pymcmcstat.plotting.utilities.
is_semi_pos_def_chol
(x)[source]¶ Check if matrix is semi positive definite by calculating Cholesky decomposition.
- Args:
x (
ndarray
): Matrix to check
- Returns:
If matrix is not semi positive definite return
False, None
If matrix is semi positive definite return
True
and the Upper triangular form of the Cholesky decomposition matrix.
-
pymcmcstat.plotting.utilities.
make_x_grid
(x, npts=100)[source]¶ Generate x grid based on extrema.
1. If len(x) > 200, then generates grid based on difference between the max and min values in the array.
2. Otherwise, the grid is defined with respect to the array mean plus or minus four standard deviations.
-
pymcmcstat.plotting.utilities.
set_local_parameters
(ii, local)[source]¶ Set local parameters based on tests.
- Test 1
local == 0
- Test 2
local == ii
-
pymcmcstat.plotting.utilities.
setup_plot_features
(nparam, names, figsizeinches)[source]¶ Setup plot features.
- Args:
- Returns:
-
pymcmcstat.plotting.utilities.
setup_subsample
(skip, maxpoints, nsimu)[source]¶ Setup subsampling from posterior.
When plotting the sampling chain, it is often beneficial to subsample in order to avoid to dense of plots. This routine determines the appropriate step size based on the size of the chain (nsimu) and maximum points allowed to plot (maxpoints). The function checks if the size of the chain exceeds the maximum number of points allowed in the plot. If yes, skip is defined such that every the max number of points are used and sampled evenly from the start to end of the chain. Otherwise the value of skip is return as defined by the user. A subsample index is then generated based on the value of skip and the number of simulations.
pymcmcstat.procedures package¶
pymcmcstat.procedures.CovarianceProcedures module¶
Created on Thu Jan 18 07:55:46 2018
Description: Support methods for initializing and updating the covariance matrix. Additional routines associated with Cholesky Decomposition.
@author: prmiles
-
class
pymcmcstat.procedures.CovarianceProcedures.
CovarianceProcedures
[source]¶ Bases:
object
Covariance matrix variables and methods.
-
display_covariance_settings
(print_these=None)[source]¶ Display subset of the covariance settings.
- Args:
print_these (
list
): List of strings corresponding to keywords. Default below.
print_these = ['qcov', 'R', 'RDR', 'invR', 'last_index_since_adaptation', 'covchain']
-
setup_covariance_matrix
(qcov, thetasig, value)[source]¶ Initialize covariance matrix.
If no proposal covariance matrix is provided, then the default is generated by squaring 5% of the initial value. This yields a diagonal covariance matrix.
If the initial value was one, this would lead to zero variance. In those instances the variance is set equal to
qcov[qcov==0] = 1.0
.
-
pymcmcstat.procedures.ErrorVarianceEstimator module¶
Created on Thu Jan 18 13:12:50 2018
@author: prmiles
-
class
pymcmcstat.procedures.ErrorVarianceEstimator.
ErrorVarianceEstimator
[source]¶ Bases:
object
Estimate observation errors.
- Attributes:
-
gammar
(m, n, a, b=1)[source]¶ Random deviates from gamma distribution.
- Returns a m x n matrix of random deviates from the Gamma
distribution with shape parameter A and scale parameter B:
-
gammar_mt
(m, n, a, b=1)[source]¶ Wrapper routine for calculating random deviates from gamma distribution using method of Marsaglia and Tsang (2000) [MT00].
-
update_error_variance
(sos, model)[source]¶ Update observation error variance.
Strategy: Treat error variance as parameter to be sampled.
Definition: The property that the prior and posterior distributions have the same parametric form is termed conjugacy.
Starting from the likelihood function, it can be shown
where and are shape and scaling parameters, is the number of observations, and is the sum-of-squares error. For more details regarding the interpretation of and , please refer to [Smi14] page 163.
Note
The variables and correspond to
N0
andS20
in theModelSettings
class, respectively.- Args:
sos (
ndarray
): Return argument from evaluation of sum-of-squares function.model (
ModelSettings
): MCMC model settings.
pymcmcstat.procedures.PriorFunction module¶
Created on Thu Jan 18 09:10:21 2018
Description: Prior function
@author: prmiles
pymcmcstat.procedures.SumOfSquares module¶
Created on Wed Jan 17 16:21:48 2018
@author: prmiles
-
class
pymcmcstat.procedures.SumOfSquares.
SumOfSquares
(model, data, parameters)[source]¶ Bases:
object
Sum-of-squares function evaluation.
Description: Sum-of-squares (sos) class intended for used in MCMC simulator. Each instance will contain the sos function. If the user did not specify a sos-function, then the user supplied model function will be used in the default mcmc sos-function.
- Attributes:
-
classmethod
mcmc_sos_function
(theta, data, nbatch, model_function)[source]¶ Default sum-of-squares function.
Note
This method requires specifying a model function instead of a sum of squares function. Not recommended for most applications.
Basic formulation:
where is the weight of a particular data set, and is the sum-of-squares error for the i-th data set.
pymcmcstat.samplers package¶
pymcmcstat.samplers.Adaptation module¶
Created on Thu Jan 18 11:14:11 2018
@author: prmiles
-
class
pymcmcstat.samplers.Adaptation.
Adaptation
[source]¶ Bases:
object
Adaptive Metropolis (AM) algorithm based on [HST+01].
- Attributes:
-
run_adaptation
(covariance, options, isimu, iiadapt, rejected, chain, chainind, u, npar, alpha)[source]¶ Run adaptation step
- Args:
covariance (
CovarianceProcedures
): Covariance methods and variablesoptions (
SimulationOptions
): Options for MCMC simulationisimu (
int
): Simulation counteriiadapt (
int
): Adaptation counterrejected (
dict
): Rejection counterchain (
ndarray
): Sampling chainchainind (
ind
): Relative point in chainu (
ndarray
): Latest random sample pointsnpar (
int
): Number of parameters being sampledalpha (
float
): Latest Likelihood evaluation
- Returns:
covariance (
CovarianceProcedures
): Updated covariance object
-
pymcmcstat.samplers.Adaptation.
adjust_cov_matrix
(upcov, R, npar, qcov_adjust, qcov_scale, rejected, iiadapt, verbosity)[source]¶ Adjust covariance matrix if found to be singular.
- Args:
upcov (
ndarray
): Parameter covariance matrix.R (
ndarray
): Cholesky decomposition of covariance matrix.npar (
int
): Number of parameters.qcov_adjust (
float
): Covariance adjustment factor.qcov_scale (
float
): Scale parameterrejected (
dict
): Rejection counters.iiadapt (
int
): Adaptation counter.verbosity (
int
): Verbosity of display output.
- Returns:
R (
ndarray
): Cholesky decomposition of covariance matrix.
-
pymcmcstat.samplers.Adaptation.
below_burnin_threshold
(rejected, iiadapt, R, burnin_scale, verbosity)[source]¶ Update Cholesky Matrix using below burnin thershold
-
pymcmcstat.samplers.Adaptation.
check_for_singular_cov_matrix
(upcov, R, npar, qcov_adjust, qcov_scale, rejected, iiadapt, verbosity)[source]¶ Check if singular covariance matrix
- Args:
upcov (
ndarray
): Parameter covariance matrix.R (
ndarray
): Cholesky decomposition of covariance matrix.npar (
int
): Number of parameters.qcov_adjust (
float
): Covariance adjustment factor.qcov_scale (
float
): Scale parameterrejected (
dict
): Rejection counters.iiadapt (
int
): Adaptation counter.verbosity (
int
): Verbosity of display output.
- Returns:
R (
ndarray
): Adjusted Cholesky decomposition of covariance matrix.
-
pymcmcstat.samplers.Adaptation.
initialize_covariance_mean_sum
(x, w)[source]¶ Initialize covariance chain, local mean, local sum
-
pymcmcstat.samplers.Adaptation.
is_semi_pos_def_chol
(x)[source]¶ Check if matrix is semi-positive definite using Cholesky Decomposition
-
pymcmcstat.samplers.Adaptation.
scale_cholesky_decomposition
(Ra, qcov_scale)[source]¶ Scale Cholesky decomposition
-
pymcmcstat.samplers.Adaptation.
setup_cholupdate
(R, oldwsum, wsum, oldmean, xi)[source]¶ Setup input arguments to the Cholesky update function
- Args:
- Returns:
-
pymcmcstat.samplers.Adaptation.
setup_w_R
(w, oldR, n)[source]¶ Setup weights and Cholesky matrix
- Args:
- Returns:
-
pymcmcstat.samplers.Adaptation.
unpack_covariance_settings
(covariance)[source]¶ Unpack covariance settings
- Args:
covariance (
CovarianceProcedures
): Covariance methods and variables
- Returns:
last_index_since_adaptation (
int
): Last index since adaptation occured.R (
ndarray
): Cholesky decomposition of covariance matrix.oldcovchain (
ndarray
): Covariance matrix history.oldmeanchain (
ndarray
): Current mean chain values.oldwsum (
ndarray
): Weightsno_adapt_index (
numpy.ndarray
): Indices of parameters not being adapted.qcov_scale (
float
): Scale parameterqcov (
ndarray
): Covariance matrix
-
pymcmcstat.samplers.Adaptation.
unpack_simulation_options
(options)[source]¶ Unpack simulation options
- Args:
options (
SimulationOptions
): Options for MCMC simulation
- Returns:
burnintime (
int
):burnin_scale (
float
): Scale for burnin.ntry (
int
): Number of tries to take before rejection. Default is method dependent.drscale (
ndarray
): Reduced scale for sampling in DR algorithm. Default is [5,4,3].alphatarget (
float
): Acceptance ratio target.etaparam (
float
):qcov_adjust (
float
): Adjustment scale for covariance matrix.doram (
int
): Flag to perform'ram'
algorithm (Obsolete).verbosity (
int
): Verbosity of display output.
-
pymcmcstat.samplers.Adaptation.
update_cov_from_covchain
(covchain, qcov, no_adapt_index)[source]¶ Update covariance matrix from covariance matrix chain
- Args:
covchain (
ndarray
): Covariance matrix history.qcov (
ndarray
): Parameter covariance matrixno_adapt_index (
numpy.ndarray
): Indices of parameters not being adapted.
- Returns:
upcov (
ndarray
): Updated covariance matrix
-
pymcmcstat.samplers.Adaptation.
update_cov_via_ram
(u, isimu, etaparam, npar, alphatarget, alpha, R)[source]¶ Update covariance matrix via RAM
- Args:
- Returns:
upcov (
ndarray
): Updated parameter covariance matrix.
-
pymcmcstat.samplers.Adaptation.
update_covariance_mean_sum
(x, w, oldcov, oldmean, oldwsum, oldR=None)[source]¶ Update covariance chain, local mean, local sum
- Args:
- Returns:
pymcmcstat.samplers.DelayedRejection module¶
Created on Thu Jan 18 10:42:07 2018
@author: prmiles
-
class
pymcmcstat.samplers.DelayedRejection.
DelayedRejection
[source]¶ Bases:
object
Delayed Rejection (DR) algorithm based on [HLMS06].
-
classmethod
initialize_next_metropolis_step
(npar, old_theta, sigma2, RDR)[source]¶ Take metropolis step according to DR
- Args:
- Returns:
next_set (
ParameterSet
): New proposal setu (
numpy.ndarray
): Numbers sampled from standard normal distributions (u.shape = (1,npar)
)
-
run_delayed_rejection
(old_set, new_set, RDR, ntry, parameters, invR, sosobj, priorobj, custom=None)[source]¶ Perform delayed rejection step - occurs in standard metropolis is not accepted.
- Args:
old_set (
ParameterSet
): Features ofnew_set (
ParameterSet
): Features ofRDR (
ndarray
): Cholesky decomposition of parameter covariance matrix for DR stepsntry (
int
): Number of DR steps to perform until rejectionparameters (
ModelParameters
): Model parametersinvR (
ndarray
): Inverse Cholesky decomposition matrixsosobj (
SumOfSquares
): Sum-of-Squares functionpriorobj (
PriorFunction
): Prior function
- Returns:
accept (
int
): 0 - reject, 1 - acceptout_set (
ParameterSet
): If accept == 1, then latest DR set; Else,outbound (
int
): 1 - rejected due to sampling outside of parameter bounds
-
classmethod
-
pymcmcstat.samplers.DelayedRejection.
extract_state_elements
(iq, stage, trypath)[source]¶ Extract elements from tried paths.
-
pymcmcstat.samplers.DelayedRejection.
log_posterior_ratio
(x1, x2)[source]¶ Calculate the logarithm of the posterior ratio.
- Args:
x1 (
ParameterSet
): Old set -x2 (
ParameterSet
): New set -
- Returns:
zq (
float
): Logarithm of posterior ratio.
-
pymcmcstat.samplers.DelayedRejection.
nth_stage_log_proposal_ratio
(iq, trypath, invR)[source]¶ Gaussian nth stage log proposal ratio.
Logarithm of
-
pymcmcstat.samplers.DelayedRejection.
update_set_based_on_acceptance
(accept, old_set, next_set)[source]¶ Define output set based on acceptance
- Args:
accept (
int
): 0 - reject, 1 - acceptold_set (
ParameterSet
): Features ofnext_set (
ParameterSet
): New proposal set
- Returns:
out_set (
ParameterSet
): If accept == 1, then latest DR set; Else,
pymcmcstat.samplers.Metropolis module¶
Created on Thu Jan 18 10:30:29 2018
@author: prmiles
-
class
pymcmcstat.samplers.Metropolis.
Metropolis
[source]¶ Bases:
object
Pseudo-Algorithm:
Sample
Construct candidate
Compute
Compute
- If
Set
- Else
Set
- If
- Attributes:
acceptance_test()
-
classmethod
calculate_posterior_ratio
(ss1, ss2, sigma2, newprior, oldprior)[source]¶ Calculate acceptance ratio
where the Gaussian likelihood function is
and Gaussian prior function is
For the Gaussian likelihood and prior, this yields the acceptance ratio
For more details regarding the prior function, please refer to the
PriorFunction
class.Note
The default behavior of the package is to use Gaussian likelihood and prior functions (as of v1.8.0). Future releases will expand the functionality to allow for alternative likelihood and prior definitions.
- Args:
- Returns:
alpha (
float
): Result of likelihood function
-
run_metropolis_step
(old_set, parameters, R, prior_object, sos_object, custom=None)[source]¶ Run Metropolis step.
- Args:
old_set (
ParameterSet
): Features ofparameters (
ModelParameters
): Model parametersR (
ndarray
): Cholesky decomposition of parameter covariance matrixpriorobj (
PriorFunction
): Prior functionsosobj (
SumOfSquares
): Sum-of-Squares function
- Returns:
accept (
int
): 0 - reject, 1 - acceptnewset (
ParameterSet
): Features ofoutbound (
int
): 1 - rejected due to sampling outside of parameter boundsnpar_sample_from_normal (
ndarray
): Latet random sample points
pymcmcstat.samplers.SamplingMethods module¶
Created on Thu Jan 18 10:11:40 2018
@author: prmiles
pymcmcstat.samplers.utilities module¶
Created on Wed Jun 27 12:07:11 2018
Utility functions used by different samplers
@author: prmiles
-
pymcmcstat.samplers.utilities.
calculate_log_posterior_ratio
(loglikestar, loglike, logpriorstar, logprior)[source]¶ Calculate log posterior ratio:
For more details regarding the prior and likelihood functions, please refer to the
PriorFunction
andLikelihoodFunction
class, respectively.Note
The default behavior of the package is to use Gaussian likelihood and prior functions (as of v1.8.0). Future releases will expand the functionality to allow for alternative likelihood and prior definitions.
-
pymcmcstat.samplers.utilities.
is_sample_outside_bounds
(theta, lower_limits, upper_limits)[source]¶ Check whether proposal value is outside parameter limits
-
pymcmcstat.samplers.utilities.
log_posterior_ratio_acceptance_test
(alpha)[source]¶ Run log posterior ratio acceptance test
-
pymcmcstat.samplers.utilities.
posterior_ratio_acceptance_test
(alpha)[source]¶ Run posterior ratio acceptance test
-
pymcmcstat.samplers.utilities.
sample_candidate_from_gaussian_proposal
(npar, oldpar, R)[source]¶ Sample candidate from Gaussian proposal distribution
-
pymcmcstat.samplers.utilities.
set_outside_bounds
(next_set)[source]¶ Assign set features based on being outside bounds
- Args:
next_set (
ParameterSet
):
- Returns:
next_set (
ParameterSet
): with updated featuresoutbound (
bool
): True
pymcmcstat.settings package¶
pymcmcstat.settings.DataStructure module¶
Created on Wed Jan 17 09:03:37 2018
@author: prmiles
-
class
pymcmcstat.settings.DataStructure.
DataStructure
[source]¶ Bases:
object
Structure for storing data in MCMC object. The following random data sets will be referenced in examples for the different class methods:
x1 = np.random.random_sample(size = (5, 1)) y1 = np.random.random_sample(size = (5, 2)) x2 = np.random.random_sample(size = (10, 1)) y2 = np.random.random_sample(size = (10, 3))
- Attributes:
-
add_data_set
(x, y, n=None, weight=1, user_defined_object=0)[source]¶ Add data set to MCMC object.
This method must be called first before using any of the other methods within
DataStructure
.mcstat = MCMC() mcstat.data.add_data_set(x = x1, y = y1) mcstat.data.add_data_set(x = x2, y = y2)
This yields the following variables in the data structure.
- xdata (
list
): List of numpy arrays xdata[0] = x1, xdata[0].shape = (5,1)
xdata[1] = x2, xdata[1].shape = (10,1)
- xdata (
- ydata (
list
): List of numpy arrays ydata[0] = y1, ydata[0].shape = (5,2)
ydata[1] = y2, ydata[1].shape = (10,3)
- ydata (
n (
list
): List of integers.n = [5, 10]
shape (
list
): List of y.shape.shape = [(5,2),(10,3)]
weight (
list
): List of weights.weight = [1, 1]
user_defined_object (
list
): List of objects.user_defined_object = [0,0]
- Args:
x (
ndarray
): Independent data. Recommend input as column vectors.y (
ndarray
): Dependent data. Recommend input as column vectors.n (
list
): List of integers denoting number of data points.weight (
list
): Weight of each data set.user_defined_object (User Defined): Any object can be stored in this variable.
Note
In general, it is recommended that user’s format their data as a column vector. So, if you have nds independent data points, x and y should be [nds,1] or [nds,] numpy arrays. Note if a list is sent, the code will convert it to a numpy array.
-
get_number_of_batches
()[source]¶ Get number of batches in data structure. Essentially, each time you call the
add_data_set()
method you are adding another batch. It is also equivalent to say the number of batches is equal to the length of the list ydata. For example,nb = mcstat.data.get_number_of_batches()
should return
nb = 2
becauselen(mcstat.data.ydata) = 2
.- Returns:
nbatch (
int
): Number of batches.
-
get_number_of_data_sets
()[source]¶ Get number of data sets in data structure. A data set is strictly speaking defined as the total number of columns in each element of the ydata list. For example,
nds = mcstat.data.get_number_of_data_sets()
should return
nds = 2 + 3 = 5
because the number of columns in y1 is 2 and the number of columns in y2 is 3.- Returns:
Number of columns in ydata (
int
)
-
get_number_of_observations
()[source]¶ Get number of observations in data structure. An observation is essentially the total number of rows from each element of the ydata list. For example,
nobs = mcstat.data.get_number_of_observations()
should return
nobs = 5 + 10 = 15
because the number of rows in y1 is 5 and the number of rows in y2 is 10.- Returns:
Number of rows in ydata (
ndarray
)
pymcmcstat.settings.ModelParameters module¶
Created on Wed Jan 17 09:13:03 2018
@author: prmiles
-
class
pymcmcstat.settings.ModelParameters.
ModelParameters
[source]¶ Bases:
object
MCMC Model Parameters.
Example:
mcstat = MCMC() mcstat.parameters.add_model_parameter(name = 'm', theta0 = 1., minimum = -10, maximum = 10) mcstat.parameters.add_model_parameter(name = 'b', theta0 = -5., minimum = -10, maximum = 100) mcstat.parameters.display_model_parameter_settings()
This will display to screen:
Sampling these parameters: name start [ min, max] N( mu, sigma^2) m : 1.00 [-10.00, 10.00] N(0.00, inf) b : -5.00 [-10.00, 100.00] N(0.00, inf)
- Attributes:
-
add_model_parameter
(name=None, theta0=None, minimum=- inf, maximum=inf, prior_mu=array([0.0]), prior_sigma=inf, sample=True, local=0, adapt=True)[source]¶ Add model parameter to MCMC simulation.
- Args:
name (
str
): Parameter nametheta0 (
float
): Initial valueminimum (
float
): Lower parameter boundmaximum (
float
): Upper parameter boundprior_mu (
float
): Mean value of prior distributionprior_sigma (
float
): Standard deviation of prior distributionsample (
bool
): Flag to turn sampling on (True) or off (False)local (
int
): Local flag - still testing.
The default prior is a uniform distribution from minimum to maximum parameter value.
-
classmethod
setup_adaptation_indices
(parind, adapt)[source]¶ Setup adaptation parameter indices.
- Args:
- Returns:
..note:
The size of the returned arrays will equal the number of parameters being sampled.
-
classmethod
setup_adapting
(adapt, sample)[source]¶ Setup parameters being adapted.
All parameters that are not being sampled will automatically be thrown out of adaptation. This method checks that the default adaptation status is consistent.
-
pymcmcstat.settings.ModelParameters.
check_noadaptind
(no_adapt, npar)[source]¶ Check if noadaptind is None -> Empty List
-
pymcmcstat.settings.ModelParameters.
check_verbosity
(verbosity)[source]¶ Check if verbosity is None -> 0
-
pymcmcstat.settings.ModelParameters.
generate_default_name
(nparam)[source]¶ Generate generic parameter name. For example, if
nparam = 4
, then the generated name is:names = 'p_{3}'
-
pymcmcstat.settings.ModelParameters.
less_than_or_equal_to_zero
(x)[source]¶ Return result of test on number based on less than or equal to
-
pymcmcstat.settings.ModelParameters.
noadapt_display_setting
(no_adapt)[source]¶ Define display settins if index not being adapted.
-
pymcmcstat.settings.ModelParameters.
prior_display_setting
(x)[source]¶ Define display string for prior.
pymcmcstat.settings.ModelSettings module¶
Created on Wed Jan 17 09:06:51 2018
@author: prmiles
-
class
pymcmcstat.settings.ModelSettings.
ModelSettings
[source]¶ Bases:
object
MCMC Model Settings
- Attributes:
-
define_model_settings
(sos_function=None, prior_function=None, prior_type=1, prior_update_function=None, prior_pars=None, model_function=None, sigma2=None, N=None, S20=nan, N0=None, nbatch=None)[source]¶ Define model settings.
- Args:
sos_function: Handle for sum-of-squares function
prior_function: Handle for prior function
prior_type: Pending…
prior_update_function: Pending…
prior_pars: Pending…
model_function: Handle for model function (needed if
sos_function
not specified)sigma2 (
float
): List of initial error observations.N (
int
): Number of observations - seeDataStructure
.S20 (
float
): List of scaling parameter in observation error estimate.N0 (
float
): List of scaling parameter in observation error estimate.nbatch (
int
): Number of batch data sets - seeget_number_of_batches()
.
Note
Variables
sigma2, N, S20, N0
, andnbatch
converted tondarray
for subsequent processing.
pymcmcstat.settings.SimulationOptions module¶
Created on Wed Jan 17 09:08:13 2018
@author: prmiles
-
class
pymcmcstat.settings.SimulationOptions.
SimulationOptions
[source]¶ Bases:
object
MCMC simulation options.
-
define_simulation_options
(nsimu=10000, adaptint=None, ntry=None, method='dram', printint=None, adaptend=0, lastadapt=0, burnintime=0, waitbar=True, debug=0, qcov=None, updatesigma=False, stats=0, drscale=array([5.0, 4.0, 3.0]), adascale=None, savesize=0, maxmem=0, chainfile='chainfile', s2chainfile='s2chainfile', sschainfile='sschainfile', covchainfile='covchainfile', savedir=None, save_to_bin=False, skip=1, label=None, RDR=None, verbosity=1, maxiter=None, priorupdatestart=0, qcov_adjust=1e-08, burnin_scale=10, alphatarget=0.234, etaparam=0.7, initqcovn=None, doram=None, rndseq=None, results_filename=None, save_to_json=False, save_to_txt=False, json_restart_file=None, save_lightly=False)[source]¶ Define simulation options.
- Args:
nsimu (
int
): Number of parameter samples to simulate. Default is 1e4.adaptint (
int
): Number of interates between adaptation. Default is method dependent.ntry (
int
): Number of tries to take before rejection. Default is method dependent.method (
str
): Sampling method ('mh', 'am', 'dr', 'dram'
). Default is'dram'
.printint (
int
): Printing interval.adaptend (
int
): Obsolete.lastadapt (
int
): Last adaptation iteration (i.e., no more adaptation beyond this point).burnintime (
int
):waitbar (
int
): Flag to use progress bar. Default is 1 -> on (otherwise -> off).debug (
int
): Flag to perform debug. Default is 0 -> off.qcov (
ndarray
): Proposal parameter covariance matrix.updatesigma (
bool
): Flag for updating measurement error variance. Default is 0 -> off (1 -> on).stats (
int
): Calculate convergence statistics. Default is 0 -> off (1 -> on).drscale (
ndarray
): Reduced scale for sampling in DR algorithm. Default is [5,4,3].adascale (
float
): User defined covariance scale. Default is method dependent (untested).savesize (
int
): Size of chain segments when saving to log files. Default is 0.maxmem (
int
): Maximum memory available in mega bytes (Obsolete).chainfile (
str
): File name forchain
log file.sschainfile (
str
): File name forsschain
log file.s2chainfile (
str
): File name fors2chain
log file.covchainfile (
str
): File name forqcov
log file.savedir (
str
): Output directory of log files. Default is current directory.save_to_bin (
bool
): Save log files to binary. Default is False.save_to_txt (
bool
): Save log files to text. Default is False.skip (
int
):label (
str
):RDR (
ndarray
): R matrix for each stage of DR.verbosity (
int
): Verbosity of display output.maxiter (
int
): Obsolete.priorupdatestart
qcov_adjust (
float
): Adjustment scale for covariance matrix.burnin_scale (
float
): Scale for burnin.alphatarget (
float
): Acceptance ratio target.etaparam (
float
):initqcovn (
float
): Proposal covariance weight in update.doram (
int
): Flag to perform'ram'
algorithm (Obsolete).rndseq (
ndarray
): Random number sequence (Obsolete).results_filename (
str
): Output file name when saving results structure with json.save_to_json (
bool
): Save results structure to json file. Default is False.json_restart_file (
str
): Extract parameter covariance and last sample value from saved json file.save_lightly (
bool
): Flag to indicate whether results json file should include chains.
Note
For the log file names
chainfile, sschainfile, s2chainfile
andcovchainfile
do not include the extension. By specifying whether to save to text or to binary, the appropriate extension will be added.
-
-
pymcmcstat.settings.SimulationOptions.
check_lightly_save
(save_lightly, save_to_json, save_to_bin, save_to_txt)[source]¶ Check settings for lightly save
If saving to json, chains will be removed if already being stored via one of the logging methods, binary or text. If logging methods not being used, then chains will be included in json file.
pymcmcstat.structures package¶
pymcmcstat.structures.ParameterSet module¶
Created on Thu Jan 18 10:15:37 2018
@author: prmiles
pymcmcstat.structures.ResultsStructure module¶
Created on Wed Jan 17 09:18:19 2018
@author: prmiles
-
class
pymcmcstat.structures.ResultsStructure.
ResultsStructure
[source]¶ Bases:
object
Results from MCMC simulation.
Description: Class used to organize results of MCMC simulation.
- Attributes:
-
add_basic
(nsimu, covariance, parameters, rejected, simutime, theta)[source]¶ Add basic results from MCMC simulation to structure.
- Args:
nsimu (
int
): Number of MCMC simulations.model (
ModelSettings
): MCMC model settings.covariance (
CovarianceProcedures
): Covariance variables.parameters (
ModelParameters
): Model parameters.rejected (
dict
): Dictionary of rejection stats.simutime (
float
): Simulation run time in seconds.theta (
ndarray
): Last sampled values.
-
add_chain
(chain=None)[source]¶ Add chain to results structure.
- Args:
chain (
ndarray
): Model parameter sampling chain.
-
add_dram
(drscale, RDR, total_rejected, drsettings)[source]¶ Add results specific to performing DR algorithm.
- Args:
drscale (
ndarray
): Reduced scale for sampling in DR algorithm. Default is [5,4,3].RDR (
ndarray
): Cholesky decomposition of covariance matrix based on DR.total_rejected (
int
): Number of rejected samples.drsettings (
DelayedRejection
): Need access to counters within DR class.
-
add_model
(model=None)[source]¶ Saves subset of features of the model settings in a nested dictionary.
- Args:
model (
ModelSettings
): MCMC model settings.
-
add_options
(options=None)[source]¶ Saves subset of features of the simulation options in a nested dictionary.
- Args:
options (
SimulationOptions
): MCMC simulation options.
-
add_prior
(mu, sigma, priortype)[source]¶ Add results specific to prior function.
- Args:
Note
This feature is not currently implemented.
-
add_random_number_sequence
(rndseq)[source]¶ Add random number sequence to results structure.
- Args:
rndseq (
ndarray
): Sequence of sampled random numbers.
Note
This feature is not currently implemented.
-
add_s2chain
(s2chain=None)[source]¶ Add observiation error chain to results structure.
- Args:
s2chain (
ndarray
): Sampling chain of observation errors.
-
add_sschain
(sschain=None)[source]¶ Add sum-of-squares chain to results structure.
- Args:
sschain (
ndarray
): Calculated sum-of-squares error for each parameter chains set.
-
add_time_stats
(mtime, drtime, adtime)[source]¶ Add time spend using each sampling algorithm.
- Args:
Note
This feature is not currently implemented.
-
add_updatesigma
(updatesigma, sigma2, S20, N0)[source]¶ Add information to results structure related to observation error.
- Args:
If
updatesigma is True
, thenresults['sigma2'] = np.nan results['S20'] = S20 results['N0'] = N0
Otherwise
results['sigma2'] = sigma2 results['S20'] = np.nan results['N0'] = np.nan
-
classmethod
determine_filename
(options)[source]¶ Determine results filename.
If not specified by results_filename in the simulation options, then a default naming format is generated using the date string associated with the initialization of the simulation.
- Args:
options (
SimulationOptions
): MCMC simulation options.
- Returns:
filename (
str
): Filename string.
-
export_lightly
(results)[source]¶ Export minimal simulation results to a json file.
This will save the key terms in the results dict, excluding arrays. Ideally, this is used in conjunction with one of the chain saving methods. The goal is to provide a results dict to simplify post- processing and reduces storage overhead.
- Args:
results (
ResultsStructure
): Dictionary of MCMC simulation results/settings.
-
export_simulation_results_to_json_file
(results)[source]¶ Export simulation results to a json file.
- Args:
results (
ResultsStructure
): Dictionary of MCMC simulation results/settings.
-
pymcmcstat.structures.ResultsStructure.
lighten_results
(results)[source]¶ Saves subset of features of the simulation options in a nested dictionary.
- Args:
options (
SimulationOptions
): MCMC simulation options.
pymcmcstat.utilities package¶
pymcmcstat.utilities.NumpyEncoder module¶
Created on Mon Apr 2 08:42:56 2018
@author: prmiles
-
class
pymcmcstat.utilities.NumpyEncoder.
NumpyEncoder
(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]¶ Bases:
json.encoder.JSONEncoder
Encoder used for storing numpy arrays in json files.
-
default
(obj)[source]¶ Implement this method in a subclass such that it returns a serializable object for
o
, or calls the base implementation (to raise aTypeError
).For example, to support arbitrary iterators, you could implement default like this:
def default(self, o): try: iterable = iter(o) except TypeError: pass else: return list(iterable) # Let the base class default method raise the TypeError return JSONEncoder.default(self, o)
-
pymcmcstat.utilities.general module¶
Created on Tue Jun 26 06:16:02 2018
General functions used throughout package
@author: prmiles
-
pymcmcstat.utilities.general.
check_settings
(default_settings, user_settings=None)[source]¶ Check user settings with default.
Recursively checks elements of user settings against the defaults and updates settings as it goes. If a user setting does not exist in the default, then the user setting is added to the settings. If the setting is defined in both the user and default settings, then the user setting overrides the default. Otherwise, the default settings persist.
pymcmcstat.utilities.progressbar module¶
-
pymcmcstat.utilities.progressbar.
progress_bar
(iters)[source]¶ Simulation progress bar.
A simple progress bar to monitor MCMC sampling progress. Modified from original code by Corey Goldberg (2010).
- Args:
iters (
int
): Number of iterations in simulation.
Example display:
[-------- 21% ] 2109 of 10000 complete in 0.5 sec
Note
Will display a progress bar as simulation runs, providing feedback as to the status of the simulation. Depending on the available resources, the appearance of the progress bar may differ.
References¶
- BG98
Stephen P Brooks and Andrew Gelman. General methods for monitoring convergence of iterative simulations. Journal of computational and graphical statistics, 7(4):434–455, 1998.
- BR98
Stephen P Brooks and Gareth O Roberts. Assessing convergence of markov chain monte carlo algorithms. Statistics and Computing, 8(4):319–335, 1998. URL: http://www.math.pitt.edu/~swigon/Homework/brooks97assessing.pdf.
- GR+92
Andrew Gelman, Donald B Rubin, and others. Inference from iterative simulation using multiple sequences. Statistical science, 7(4):457–472, 1992.
- HLMS06
Heikki Haario, Marko Laine, Antonietta Mira, and Eero Saksman. Dram: efficient adaptive mcmc. Statistics and Computing, 16(4):339–354, 2006. URL: https://link.springer.com/article/10.1007/s11222-006-9438-0.
- HST+01
Heikki Haario, Eero Saksman, Johanna Tamminen, and others. An adaptive metropolis algorithm. Bernoulli, 7(2):223–242, 2001. URL: https://projecteuclid.org/euclid.bj/1080222083.
- MT00
George Marsaglia and Wai Wan Tsang. A simple method for generating gamma variables. ACM Transactions on Mathematical Software (TOMS), 26(3):363–372, 2000. URL: https://dl.acm.org/citation.cfm?id=358414.
- Smi14
Ralph C. Smith. Uncertainty quantification: theory, implementation, and applications. Volume 12. SIAM, 2014.