API

This page details the methods and classes provided by the emcee module. The main entry point is through the EnsembleSampler object.

The Affine-Invariant Ensemble Sampler

Standard usage of emcee involves instantiating an EnsembleSampler.

class emcee.EnsembleSampler(nwalkers, dim, lnpostfn, a=2.0, args=[], postargs=None, threads=1, pool=None, live_dangerously=False)

A generalized Ensemble sampler that uses 2 ensembles for parallelization. The __init__ function will raise an AssertionError if k < 2 * dim (and you haven’t set the live_dangerously parameter) or if k is odd.

Warning: The chain member of this object has the shape: (nwalkers, nlinks, dim) where nlinks is the number of steps taken by the chain and k is the number of walkers. Use the flatchain property to get the chain flattened to (nlinks, dim). For users of pre-1.0 versions, this shape is different so be careful!

Parameters:
  • nwalkers – The number of Goodman & Weare “walkers”.
  • dim – Number of dimensions in the parameter space.
  • lnpostfn – A function that takes a vector in the parameter space as input and returns the natural logarithm of the posterior probability for that position.
  • a – (optional) The proposal scale parameter. (default: 2.0)
  • args – (optional) A list of extra arguments for lnpostfn. lnpostfn will be called with the sequence lnpostfn(p, *args).
  • postargs – (optional) Alias of args for backwards compatibility.
  • threads – (optional) The number of threads to use for parallelization. If threads == 1, then the multiprocessing module is not used but if threads > 1, then a Pool object is created and calls to lnpostfn are run in parallel.
  • pool – (optional) An alternative method of using the parallelized algorithm. If provided, the value of threads is ignored and the object provided by pool is used for all parallelization. It can be any object with a map method that follows the same calling sequence as the built-in map function.
acceptance_fraction

An array (length: k) of the fraction of steps accepted for each walker.

acor

The autocorrelation time of each parameter in the chain (length: dim) as estimated by the acor module.

blobs

Get the list of “blobs” produced by sampling. The result is a list (of length iterations) of list s (of length nwalkers) of arbitrary objects. Note: this will actually be an empty list if your lnpostfn doesn’t return any metadata.

chain

A pointer to the Markov chain itself. The shape of this array is (k, iterations, dim).

clear_chain()

An alias for reset() kept for backwards compatibility.

flatchain

A shortcut for accessing chain flattened along the zeroth (walker) axis.

flatlnprobability

A shortcut to return the equivalent of lnprobability but aligned to flatchain rather than chain.

get_lnprob(p)

Return the log-probability at the given position.

lnprobability

A pointer to the matrix of the value of lnprobfn produced at each step for each walker. The shape is (k, iterations).

random_state

The state of the internal random number generator. In practice, it’s the result of calling get_state() on a numpy.random.mtrand.RandomState object. You can try to set this property but be warned that if you do this and it fails, it will do so silently.

reset()

Clear the chain and lnprobability array. Also reset the bookkeeping parameters.

run_mcmc(pos0, N, rstate0=None, lnprob0=None, **kwargs)

Iterate sample() for N iterations and return the result.

Parameters:
  • p0 – The initial position vector.
  • N – The number of steps to run.
  • lnprob0 – (optional) The log posterior probability at position p0. If lnprob is not provided, the initial value is calculated.
  • rstate0 – (optional) The state of the random number generator. See the random_state() property for details.
  • kwargs – (optional) Other parameters that are directly passed to sample().
sample(p0, lnprob0=None, rstate0=None, blobs0=None, iterations=1, thin=1, storechain=True, mh_proposal=None)

Advance the chain iterations steps as a generator.

Parameters:
  • p0 – A list of the initial positions of the walkers in the parameter space. It should have the shape (nwalkers, dim).
  • lnprob0 – (optional) The list of log posterior probabilities for the walkers at positions given by p0. If lnprob is None, the initial values are calculated. It should have the shape (k, dim).
  • rstate0 – (optional) The state of the random number generator. See the Sampler.random_state property for details.
  • iterations – (optional) The number of steps to run.
  • thin – (optional) If you only want to store and yield every thin samples in the chain, set thin to an integer greater than 1.
  • storechain – (optional) By default, the sampler stores (in memory) the positions and log-probabilities of the samples in the chain. If you are using another method to store the samples to a file or if you don’t need to analyse the samples after the fact (for burn-in for example) set storechain to False.
  • mh_proposal – (optional) A function that returns a list of positions for nwalkers walkers given a current list of positions of the same size. See utils.MH_proposal_axisaligned for an example.

At each iteration, this generator yields:

  • pos — A list of the current positions of the walkers in the parameter space. The shape of this object will be (nwalkers, dim).
  • lnprob — The list of log posterior probabilities for the walkers at positions given by pos . The shape of this object is (nwalkers, dim).
  • rstate — The current state of the random number generator.
  • blobs — (optional) The metadata “blobs” associated with the current position. The value is only returned if lnpostfn returns blobs too.

The Parallel-Tempered Ensemble Sampler

The PTSampler class performs a parallel-tempered ensemble sampling using EnsembleSampler to sample within each temperature. This sort of sampling is useful if you expect your distribution to be multi-modal. Take a look at the documentation to see how you might use this class.

class emcee.PTSampler(ntemps, nwalkers, dim, logl, logp, threads=1, pool=None, betas=None)

A parallel-tempered ensemble sampler, using EnsembleSampler for sampling within each parallel chain.

Parameters:
  • ntemps – The number of temperatures.
  • nwalkers – The number of ensemble walkers at each temperature.
  • dim – The dimension of parameter space.
  • logl – The log-likelihood function.
  • logp – The log-prior function.
  • threads – (optional) The number of parallel threads to use in sampling.
  • pool – (optional) Alternative to threads. Any object that implements a map method compatible with the built-in map will do here. For example, multi.Pool will do.
  • betas – (optional) Array giving the inverse temperatures, \(\beta=1/T\), used in the ladder. The default is for an exponential ladder, with beta decreasing by a factor of \(1/\sqrt{2}\) each rung.
acceptance_fraction

Matrix of shape (Ntemps, Nwalkers) detailing the acceptance fraction for each walker.

acor

Returns a matrix of autocorrelation lengths for each parameter in each temperature of shape (Ntemps, Ndim).

betas

Returns the sequence of inverse temperatures in the ladder.

chain

Returns the stored chain of samples; shape (Ntemps, Nwalkers, Nsteps, Ndim).

clear_chain()

An alias for reset() kept for backwards compatibility.

exponential_beta_ladder(ntemps)

Exponential ladder in \(1/T\), with \(T\) increasing by \(\sqrt{2}\) each step, with ntemps in total.

flatchain

Alias of chain provided for compatibility.

get_lnprob(p)

Return the log-probability at the given position.

lnlikelihood

Matrix of ln-likelihood values; shape (Ntemps, Nwalkers, Nsteps).

lnprobability

Matrix of lnprobability values; shape (Ntemps, Nwalkers, Nsteps).

random_state

The state of the internal random number generator. In practice, it’s the result of calling get_state() on a numpy.random.mtrand.RandomState object. You can try to set this property but be warned that if you do this and it fails, it will do so silently.

reset()

Clear the chain, lnprobability, lnlikelihood, acceptance_fraction, tswap_acceptance_fraction stored properties.

run_mcmc(pos0, N, rstate0=None, lnprob0=None, **kwargs)

Iterate sample() for N iterations and return the result.

Parameters:
  • p0 – The initial position vector.
  • N – The number of steps to run.
  • lnprob0 – (optional) The log posterior probability at position p0. If lnprob is not provided, the initial value is calculated.
  • rstate0 – (optional) The state of the random number generator. See the random_state() property for details.
  • kwargs – (optional) Other parameters that are directly passed to sample().
sample(p0, lnprob0=None, lnlike0=None, iterations=1, thin=1, storechain=True)

Advance the chains iterations steps as a generator.

Parameters:
  • p0 – The initial positions of the walkers. Shape should be (ntemps, nwalkers, dim).
  • lnprob0 – (optional) The initial posterior values for the ensembles. Shape (ntemps, nwalkers).
  • lnlike0 – (optional) The initial likelihood values for the ensembles. Shape (ntemps, nwalkers).
  • iterations – (optional) The number of iterations to preform.
  • thin – (optional) The number of iterations to perform between saving the state to the internal chain.
  • storechain – (optional) If True store the iterations in the chain property.

At each iteration, this generator yields

  • p, the current position of the walkers.
  • lnprob the current posterior values for the walkers.
  • lnlike the current likelihood values for the walkers.
thermodynamic_integration_log_evidence(logls=None, fburnin=0.1)

Thermodynamic integration estimate of the evidence.

Parameters:
  • logls – (optional) The log-likelihoods to use for computing the thermodynamic evidence. If None (the default), use the stored log-likelihoods in the sampler. Should be of shape (Ntemps, Nwalkers, Nsamples).
  • fburnin – (optional) The fraction of the chain to discard as burnin samples; only the final 1-fburnin fraction of the samples will be used to compute the evidence; the default is fburnin = 0.1.
Return (lnZ, dlnZ)(lnZ, dlnZ):
 

Returns an estimate of the log-evidence and the error associated with the finite number of temperatures at which the posterior has been sampled.

The evidence is the integral of the un-normalized posterior over all of parameter space:

\[Z \equiv \int d\theta \, l(\theta) p(\theta)\]

Thermodymanic integration is a technique for estimating the evidence integral using information from the chains at various temperatures. Let

\[Z(\beta) = \int d\theta \, l^\beta(\theta) p(\theta)\]

Then

\[\frac{d \ln Z}{d \beta} = \frac{1}{Z(\beta)} \int d\theta l^\beta p \ln l = \left \langle \ln l \right \rangle_\beta\]

so

\[\ln Z(\beta = 1) = \int_0^1 d\beta \left \langle \ln l \right\rangle_\beta\]

By computing the average of the log-likelihood at the difference temperatures, the sampler can approximate the above integral.

tswap_acceptance_fraction

Returns an array of accepted temperature swap fractions for each temperature; shape (ntemps, ).

Standard Metropolis-Hastings Sampler

The Metropolis-Hastings sampler included in this module is far from fine-tuned and optimized. It is, however, stable and it has a consistent API so it can be useful for testing and comparison.

class emcee.MHSampler(cov, *args, **kwargs)

The most basic possible Metropolis-Hastings style MCMC sampler

Parameters:
  • cov – The covariance matrix to use for the proposal distribution.
  • dim – Number of dimensions in the parameter space.
  • lnpostfn – A function that takes a vector in the parameter space as input and returns the natural logarithm of the posterior probability for that position.
  • args – (optional) A list of extra arguments for lnpostfn. lnpostfn will be called with the sequence lnpostfn(p, *args).
acceptance_fraction

The fraction of proposed steps that were accepted.

acor

The autocorrelation time of each parameter in the chain (length: dim) as estimated by the acor module.

chain

A pointer to the Markov chain.

clear_chain()

An alias for reset() kept for backwards compatibility.

flatchain

Alias of chain provided for compatibility.

get_lnprob(p)

Return the log-probability at the given position.

lnprobability

A list of the log-probability values associated with each step in the chain.

random_state

The state of the internal random number generator. In practice, it’s the result of calling get_state() on a numpy.random.mtrand.RandomState object. You can try to set this property but be warned that if you do this and it fails, it will do so silently.

run_mcmc(pos0, N, rstate0=None, lnprob0=None, **kwargs)

Iterate sample() for N iterations and return the result.

Parameters:
  • p0 – The initial position vector.
  • N – The number of steps to run.
  • lnprob0 – (optional) The log posterior probability at position p0. If lnprob is not provided, the initial value is calculated.
  • rstate0 – (optional) The state of the random number generator. See the random_state() property for details.
  • kwargs – (optional) Other parameters that are directly passed to sample().
sample(p0, lnprob=None, randomstate=None, thin=1, storechain=True, iterations=1)

Advances the chain iterations steps as an iterator

Parameters:
  • p0 – The initial position vector.
  • lnprob0 – (optional) The log posterior probability at position p0. If lnprob is not provided, the initial value is calculated.
  • rstate0 – (optional) The state of the random number generator. See the random_state() property for details.
  • iterations – (optional) The number of steps to run.
  • thin – (optional) If you only want to store and yield every thin samples in the chain, set thin to an integer greater than 1.
  • storechain – (optional) By default, the sampler stores (in memory) the positions and log-probabilities of the samples in the chain. If you are using another method to store the samples to a file or if you don’t need to analyse the samples after the fact (for burn-in for example) set storechain to False.

At each iteration, this generator yields:

  • pos — The current positions of the chain in the parameter space.
  • lnprob — The value of the log posterior at pos .
  • rstate — The current state of the random number generator.

Abstract Sampler Object

This section is mostly for developers who would be interested in implementing a new sampler for inclusion in emcee. A good starting point would be to subclass the sampler object and override the Sampler.sample() method.

class emcee.Sampler(dim, lnprobfn, args=[])

An abstract sampler object that implements various helper functions

Parameters:
  • dim – The number of dimensions in the parameter space.
  • lnpostfn – A function that takes a vector in the parameter space as input and returns the natural logarithm of the posterior probability for that position.
  • args – (optional) A list of extra arguments for lnpostfn. lnpostfn will be called with the sequence lnpostfn(p, *args).
acceptance_fraction

The fraction of proposed steps that were accepted.

acor

The autocorrelation time of each parameter in the chain (length: dim) as estimated by the acor module.

chain

A pointer to the Markov chain.

clear_chain()

An alias for reset() kept for backwards compatibility.

flatchain

Alias of chain provided for compatibility.

get_lnprob(p)

Return the log-probability at the given position.

lnprobability

A list of the log-probability values associated with each step in the chain.

random_state

The state of the internal random number generator. In practice, it’s the result of calling get_state() on a numpy.random.mtrand.RandomState object. You can try to set this property but be warned that if you do this and it fails, it will do so silently.

reset()

Clear chain, lnprobability and the bookkeeping parameters.

run_mcmc(pos0, N, rstate0=None, lnprob0=None, **kwargs)

Iterate sample() for N iterations and return the result.

Parameters:
  • p0 – The initial position vector.
  • N – The number of steps to run.
  • lnprob0 – (optional) The log posterior probability at position p0. If lnprob is not provided, the initial value is calculated.
  • rstate0 – (optional) The state of the random number generator. See the random_state() property for details.
  • kwargs – (optional) Other parameters that are directly passed to sample().

Utilities

emcee.utils.sample_ball(p0, std, size=1)[source]

Produce a ball of walkers around an initial parameter value.

Parameters:
  • p0 – The initial parameter value.
  • std – The axis-aligned standard deviation.
  • size – The number of samples to produce.
class emcee.utils.MH_proposal_axisaligned(stdev)[source]

A Metropolis-Hastings proposal, with axis-aligned Gaussian steps, for convenient use as the mh_proposal option to EnsembleSampler.sample() .

class emcee.utils.MPIPool(comm=<mpi4py.MPI.Intracomm object at 0x10f3c6b88>, debug=False)[source]

A pool that distributes tasks over a set of MPI processes. MPI is an API for distributed memory parallelism. This pool will let you run emcee without shared memory, letting you use much larger machines with emcee.

The pool only support the map() method at the moment because this is the only functionality that emcee needs. That being said, this pool is fairly general and it could be used for other purposes.

Contributed by Joe Zuntz.

Parameters:
  • comm – (optional) The mpi4py communicator.
  • debug – (optional) If True, print out a lot of status updates at each step.
close()[source]

Just send a message off to all the pool members which contains the special _close_pool_message sentinel.

is_master()[source]

Is the current process the master?

map(function, tasks)[source]

Like the built-in map() function, apply a function to all of the values in a list and return the list of results.

Parameters:
  • function – The function to apply to the list.
  • tasks – The list of elements.
wait()[source]

If this isn’t the master process, wait for instructions.

emcee is an extensible, pure-Python implementation of Goodman & Weare's Affine Invariant Markov chain Monte Carlo (MCMC) Ensemble sampler. It's designed for Bayesian parameter estimation and it's really sweet!

Table Of Contents

Related Topics

Fork me on GitHub