The sherpa.sim module¶
MonteCarlo Markov Chain support for lowcount data (Poisson statistics).
The sherpa.sim
module provides a Bayesian algorithm with
MonteCarlo Markov Chain (MCMC) for Poisson data.
The Sherpa UI modules  e.g. sherpa.ui
and sherpa.astro.ui
 provide
many of the routines described below (e.g. list_samplers
).
Acknowledgements¶
The original version of the code was developed by the CHASC AstroStatistics collaboration http://heawww.harvard.edu/AstroStat/, and was called pyBLoCXS. It has since been developed by the Chandra Xray Center and weas added to Sherpa in version 4.5.1.
Overview¶
The algorithm explores parameter space at a suspected minimum  i.e. after a standard Sherpa fit. It supports a flexible definition of priors and allows for variations in the calibration information. It can be used to compute posterior predictive pvalues for the likelihood ratio test 2. Future versions will allow for the incorporation of calibration uncertainty 3.
MCMC is a complex computational technique that requires some sophistication on the part of its users to ensure that it both converges and explores the posterior distribution properly. The pyBLoCXS code has been tested with a number of simple singlecomponent spectral models. It should be used with great care in more complex settings. The code is based on the methods in 4 but employs a different MCMC sampler than is described in that article. A general description of the techniques employed along with their convergence diagnostics can be found in the Appendices of 4 and in 5.
Jumping Rules¶
The jumping rule determines how each step in the MonteCarlo Markov
Chain is calculated. The setting can be changed using set_sampler
.
The sherpa.sim
module provides the following rules, which may
be augmented by other modules:
MH
uses a MetropolisHastings jumping rule that is a multivariate tdistribution with userspecified degrees of freedom centered on the bestfit parameters, and with multivariate scale determined by thecovar
function applied at the bestfit location.MetropolisMH
mixes this Metropolis Hastings jumping rule with a Metropolis jumping rule centered at the current draw, in both cases drawing from the same tdistribution as used withMH
. The probability of using the bestfit location as the start of the jump is given by thep_M
parameter of the rule (useget_sampler
orget_sampler_opt
to view andset_sampler_opt
to set this value), otherwise the jump is from the previous location in the chain.
Options for the sampler are retrieved and set by get_sampler
or
get_sampler_opt
, and set_sampler_opt
respectively. The list of
available samplers is given by list_samplers
.
Choosing the parameter values¶
By default, the prior on each parameter is taken to be flat, varying
from the parameter minima to maxima values. This prior can be changed
using the set_prior
function, which can set the prior for a
parameter to a function or Sherpa model. The list of currently set
priorparameter pairs is returned by the list_priors
function, and the
prior function associated with a particular Sherpa model parameter may be
accessed with get_prior
.
Running the chain¶
The get_draws
function runs a pyBLoCXS chain using fit information
associated with the specified data set(s), and the currently set sampler and
parameter priors, for a specified number of iterations. It returns an array of
statistic values, an array of acceptance Booleans, and a 2D array of
associated parameter values.
Analyzing the results¶
The module contains several routines to visualize the results of the chain,
including plot_trace
, plot_cdf
, and plot_pdf
, along with
sherpa.utils.get_error_estimates
for calculating the limits from a
parameter chain.
References
 1
 2
“Statistics, Handle with Care: Detecting Multiple Model Components with the Likelihood Ratio Test”, Protassov et al., 2002, ApJ, 571, 545 http://adsabs.harvard.edu/abs/2002ApJ…571..545P
 3
“Accounting for Calibration Uncertainties in Xray Analysis: Effective Areas in Spectral Fitting”, Lee et al., 2011, ApJ, 731, 126 http://adsabs.harvard.edu/abs/2011ApJ…731..126L
 4(1,2)
“Analysis of Energy Spectra with Low Photon Counts via Bayesian Posterior Simulation”, van Dyk et al. 2001, ApJ, 548, 224 http://adsabs.harvard.edu/abs/2001ApJ…548..224V
 5
Chapter 11 of Gelman, Carlin, Stern, and Rubin (Bayesian Data Analysis, 2nd Edition, 2004, Chapman & Hall/CRC).
Examples
Analysis proceeds as normal, up to the point that a good fit has
been determined, as shown below (note that a Poisson likelihood,
such as the cash
statistic, must be used):
>>> from sherpa.astro import ui
>>> ui.load_pha('src.pi')
>>> ui.notice(0.5, 7)
>>> ui.set_source(ui.xsphabs.gal * ui.xspowerlaw.pl)
>>> ui.set_stat('cash')
>>> ui.set_method('simplex')
>>> ui.fit()
>>> ui.covar()
Once the bestfit location has been determined (which may require
multiple calls to fit
), the chain can be run. In this example
the default sampler (MetropolisMH
) and default parameter priors
(flat, varying between the minimum and maximum values) are used,
as well as the default number of iterations (1000):
>>> stats, accept, params = ui.get_draws()
The stats
array contains the fit statistic for each iteration
(the first element of these arrays is the starting point of the chain,
so there will be 1001 elements in this case). The “trace”  i.e.
statistic versus iteration  can be plotted using:
>>> ui.plot_trace(stats, name='stat')
The accept
array indicates whether, at each iteration, the proposed
jump was accepted, (True
) or if the previous iterations parameter
values are used. This can be used to look at the acceptance rate for
the chain (dropping the last element and a burnin period, which
here is arbitrarily taken to be 100):
>>> nburn = 100
>>> arate = accept[nburn:1].sum() * 1.0 / (len(accept)  nburn  1)
>>> print("acceptance rate = {}".format(arate))
The trace of the parameter values can also be displayed; in this example a burnin period has not been removed):
>>> par1 = params[:, 0]
>>> par2 = params[:, 1]
>>> ui.plot_trace(par1, name='par1')
>>> ui.plot_trace(par2, name='par2')
The cumulative distribution can also be viewed:
>>> ui.plot_cdf(par1[nburn:], name='par1')
as well as the probability density:
>>> ui.plot_[pdf(par2[nburn:], name='par2')
The traces can be used to estimate the credible interval for a parameter:
>>> from sherpa.utils import get_error_estimates
>>> pval, plo, phi = get_error_estimates(par1[nburn:])
Classes

Highlevel UI to pyBLoCXS that joins the loop in 'Walk' with the jumping rule in 'Sampler'. 

Resample a 1D dataset using asymmetric errors. 
Functions

The flat prior (returns 1 everywhere). 

Returns the inverse of x. 

Returns the invers of x^2. 