polychord sampler

PolyChord is an advanced nested sampler, that uses slice sampling to sample within the nested isolikelihoods contours. The use of slice sampling instead of rejection sampling makes PolyChord specially suited for high-dimensional parameter spaces, and allows for exploiting speed hierarchies in the parameter space. Also, PolyChord can explore multi-modal distributions efficiently.

PolyChord is an external sampler, not installed by default (just a wrapper for it). You need to install it yourself following the general instructions for installing external modules, or the manual installation instructions below.

Usage

To use PolyChord, you just need to mention it in the sampler block:

sampler:
  polychord:
    # polychord options ...

Just copy the options that you wish to modify from the defaults file below:

# Default arguments for the PolyChord sampler

sampler:
  polychord:
    # Path to the PolyChord installation folder
    path:  # use only if installed manually!
    # Optional parameters for the sampler:
    # ('Xd' means 'X steps per dimension')
    # -------------------------------------------------
    # Number of live points
    nlive: 25d
    # Number of slice-sampling steps each new sample.
    # Helps with curved degeneracies and thin "corners".
    # Recommended: '5d' if accurate evidences needed or weird posterior shapes.
    # You may want to use `d` for exploratory runs.
    num_repeats: 5d
    # Number of prior samples drawn before starting compression
    nprior:  # default: nlive
    # Whether to check for and explore multi-modality on the posterior
    do_clustering: True
    # Stopping criterion: fraction of the total evidence contained in the live points
    precision_criterion: 0.001
    # Stopping criterion (alt): maximum number of nested sampling iterations
    max_ndead: .inf
    # How often to print progress, update output files and do clustering
    # -- increase for more frequency (1 for update and callback per dead point)
    compression_factor: 0.36787944117144233  # = exp(-1)
    # Callback function -- see documentation
    callback_function:
    # Numerical value of log(0) sent to PolyChord's Fortran code
    # If null: `numpy.nan_to_num(-numpy.inf)` (produces an error in `ifort`!)
    logzero: -1e300  # safer
    # Increase number of posterior samples
    boost_posterior: 0  # increase up to `num_repeats`
    # Verbosity during the sampling process. Set to one of [0,1,2,3]
    feedback:  # default: Same as global `verbosity`
    # Manual speed blocking
    # ---------------------
    # To take full advantage of speed-blocking, sort the blocks by ascending speeds
    # (unless not possible due to the architecture of your likelihood)
    blocking:
    #  - [speed_1, [params_1]]
    #  - etc.
    # Treatment of unbounded parameters: confidence level to use
    # ----------------------------------------------------------
    # (Use with care if there are likelihood modes close to the edge of the prior)
    confidence_for_unbounded: 0.9999995  # 5 sigmas of the prior
    # Seeding runs
    # ------------
    seed:  # postitive integer
    # Raw output of PolyChord (no need to change them, normally)
    # ----------------------------------------------------------
    base_dir: raw_polychord_output
    file_root:
    posteriors: True
    equals: True
    cluster_posteriors: True
    write_resume: True
    read_resume: True
    write_stats: True
    write_live: True
    write_dead: True

Warning

If you want to sample with PolyChord, your priors need to be bounded. This is because PolyChord samples uniformly from a bounded hypercube, defined by a non-trivial transformation for general unbounded priors.

The option confidence_for_unbounded will automatically bind the priors at 5-sigma c.l., but this may cause problems with likelihood modes at the edge of the prior. In those cases, check for stability with respect to increasing that parameter. Of course, if confidence_for_unbounded is much smaller than unity, the resulting evidence may be biased towards a lower value.

The main output is the Monte Carlo sample of sequentially discarded live points, saved in the standard sample format together with the input.yaml and full.yaml files (see Output). The raw PolyChord products are saved in a subfolder of the output folder (determined by the option base_dir – default: raw_polychord_output). Since PolyChord is a nester sampler integrator, the log-evidence and its standard deviation are also returned.

If the posterior was found to be multi-modal, the output will include separate samples and evidences for each of the modes.

Computing Bayes ratios

If you are using only internal priors, your full prior is correctly normalized, so you can directly use the output evidence of PolyChord, e.g. in a Bayes ratio.

If you are using external priors (as described here), the full prior is not normalized by default, so the resulting evidence, \(\mathcal{Z}_\mathrm{u}\), needs to be divided by the prior volume. To compute the prior volume \(\mathcal{Z}_\pi\), substitute your likelihoods for the unit likelihood one. The normalized likelihood and its propagated error are in this case:

\[\begin{split}\log\mathcal{Z} &= \log\mathcal{Z}_\mathrm{u} - \log\mathcal{Z}_\pi\\ \sigma(\log\mathcal{Z}) &= \sigma(\log\mathcal{Z}_\mathrm{u}) + \sigma(\log\mathcal{Z}_\pi)\end{split}\]

Note

Only for Cobaya versions <1.X (using PolyChord 1.15)

If your prior is constant in the region of interest, in order for PolyChord not to get stuck, you have to add a small noise term to the likelihood one (see its documentation). The noise amplitude must be smaller by a couple of orders of magnitude than the inverse of a rough estimation of the expected prior volume, e.g. if your prior volume is expected to be \(\mathcal{O}(10^3)\), make the noise \(\mathcal{O}(10^{-5})\). PolyChord will take a while to converge (even if the prior evaluation is fast), and you may get some Non deterministic loglikelihood warnings coming from PolyChord, but don’t worry about them.

As an example, if we want to compute the area of the constant triangle \(y > x\) in a square of side 10 (expected area \(\sim 100\)), we would use the following input file:

params:
  x:
    prior:
      min:  0
      max: 10
  y:
    prior:
      min:  0
      max: 10
prior:
  triangle: "lambda x,y: np.log(y>x)"
likelihood:
  one:
    noise: 1e-4
sampler:
  polychord:

Taking advantage of a speed hierarchy – new in 1.2

PolyChord automatically sorts parameters optimally and chooses the number of repeats per likelihood (or parameter block). The user only needs to specify the speed of each likelihood.

Automatic speed-blocking takes advantage of differences in speed per likelihood (or theory). If the parameters of your likelihood or theory have some internal speed hierarchy that you would like to exploit (e.g. if your likelihood internally caches the result of a computation depending only on a subset of the likelihood parameters), you can specify a fine-grained list of parameter blocks and their speeds under the blocking option. The same syntax and caveats as in the MCMC sampler apply (excluding the mentions to oversampling and dragging).

Callback functions

A callback function can be specified through the callback_function option. It must be a function of a single argument, which at runtime is the current instance of the polychord sampler. You can access its attributes and methods inside your function, including the collection of live and dead points, the current calculation of the evidence, and the model (of which prior and likelihood are attributes). For example, the following callback function would print the number of points added to the chain since the last callback, the current evidence estimate and the maximum likelihood point:

def callback(sampler):
    print("There are %d dead points. The last %d were added since the last callback."%(
        sampler.dead.n(), sampler.dead.n() - sampler.last_point_callback))
    print("Current logZ = %g +/- %g"%(sampler.logZ, sampler.logZstd))
    # Maximum likelihood: since we sample over the posterior, it may be "dead"!
    min_chi2_dead = sampler.dead[sampler.dead["chi2"].values.argmin()]
    # At the end of the run, the list of live points is empty
    try:
        min_chi2_live = sampler.live[sampler.live["chi2"].values.argmin()]
        min_chi2_point = (min_chi2_live if min_chi2_live["chi2"] < min_chi2_dead["chi2"]
                          else min_chi2_dead)
    except:
        min_chi2_point = min_chi2_dead
    print("The maximum likelihood (min chi^2) point reached is\n%r"%min_chi2_point)

The frequency of calls of the callback function is given by the compression_factor (see contents of polychord.yaml above).

Troubleshooting

If you are getting an error whose cause is not immediately obvious, try substituting polychord by the dummy sampler evaluate.

If still in doubt, run with debug output and check what the prior and likelihood are getting and producing: either set debug: True in the input file and set debug_file to some file name, or add the --debug flag to cobaya-run and pipe the output to a file with cobaya-run [input.yaml] --debug > file.

If PolyChord gets stuck in started sampling, it probably means that your posterior is flat; if that was intentional, check the Computing Bayes ratios section, where it is discussed how to deal with those cases.

If everything seems to be working fine, but PolyChord is taking too long to converge, reduce the number of live points nlive to e.g. 10 per dimension, and the precision_criterion to 0.1 or so, and check that you get reasonable (but low-precision) sample and evidences.

Installation

Simply run cobaya-install polychord --modules [/path/to/modules] (or, instead of polychord after cobaya-install, mention an input file that uses polychord).

If it has been installed this way, it is not necessary to specify a path for it, as long as the modules folder has been indicated.

Note

To run PolyChord with MPI (highly recommended!) you need to make sure that MPI+Python is working in your system, see MPI parallelization (optional but encouraged!).

In addition, you need a MPI-wrapped Fortran compiler. You should have an MPI implementation installed if you followed the instructions to install mpi4py. On top of that, you need the Fortran compiler (we recommend the GNU one) and the development package of MPI. Use your system’s package manager to install them (sudo apt install gfortran libopenmpi-dev in Ubuntu/Debian systems), or contact your local IT service. If everything is correctly installed, you should be able to type mpif90 in the shell and not get a Command not found error.

Note

Polychord for Mac users:

To have PolyChord work, install gcc (maybe using Homebrew), and check that gcc-[X] works in the terminal, where [X] is the version that you have just installed.

Now install PolyChord, either by hand or using cobaya’s automatic installer, go to the newly created PolyChord folder, and compile it with

$ make veryclean
$ make PyPolyChord MPI= CC=gcc-[X] CXX=g++-[X]
$ python[Y] setup.py install

where [X] is your gcc version and [Y] is your python version: 2 or 3. Add a --user flag to the Python command if you get an error message showing [Errno 13] Permission denied anywhere.

If you want to use PolyChord with MPI on a Mac, you need to have compiled OpenMPI with the same gcc version with which you are compiling PolyChord. To do that, prepend OpenMPI’s make command with CC=gcc-[X], where [X] is your gcc version. Then follow the instructions above to compile PolyChord, but with MPI=1 instead when you do make PyPolyChord.

Thanks to Guadalupe Cañas Herrera for these instructions!

Manual installation of PolyChord

If you prefer to install PolyChord manually, assuming you want to install it at /path/to/polychord, simply do

$ cd /path/to/polychord
$ git clone https://github.com/PolyChord/PolyChordLite.git
$ cd PolyChordLite
$ make PyPolyChord MPI=1

After this, mention the path in your input file as

sampler:
  polychord:
    path: /path/to/polychord/PolyChordLite

PolyChord class

Synopsis:Interface for the PolyChord nested sampler
Author:Will Handley, Mike Hobson and Anthony Lasenby (for PolyChord), Jesus Torrado (for the cobaya wrapper only)
class samplers.polychord.polychord(info_sampler, model, output, resume=False, modules=None)
initialize()

Imports the PolyChord sampler and prepares its arguments.

run()

Prepares the posterior function and calls PolyChord’s run function.

close(exception_type=None, exception_value=None, traceback=None)

Loads the sample of live points from PolyChord’s raw output and writes it (if txt output requested).

products()

Auxiliary function to define what should be returned in a scripted call.

Returns:The sample Collection containing the sequentially discarded live points.