Quickstart example

Let us present here a simple example, without explaining much about it — each of its aspects will be broken down in the following sections.

We will run it first from the shell, and later from a Python interpreter (or a Jupyter notebook).

From the shell

The input of cobaya consists of a text file that usually looks like this:

likelihood:
  gaussian_mixture:
    means: [0.2, 0]
    covs: [[0.1, 0.05], [0.05,0.2]]
    derived: True

params:
  a:
    prior:
      min: -0.5
      max: 3
    latex: \alpha
  b:
    prior:
      dist: norm
      loc: 0
      scale: 1
    ref: 0
    proposal: 0.5
    latex: \beta
  derived_a:
    latex: \alpha^\prime
  derived_b:
    latex: \beta^\prime

sampler:
  mcmc:

output: chains/gaussian

You can see the following blocks up there:

  • A likelihood block, listing the likelihood pdf’s to be explored, here a gaussian with the mean and covariance stated.
  • A params block, stating the parameters that are going to be explored (or derived), their prior, the the Latex label that will be used in the plots, the reference (ref) starting point for the chains (optional), and the initial spread of the MCMC covariance matrix proposal.
  • A sampler block stating that we will use the mcmc sampler to explore the prior+likelihood described above, stating the maximum number of samples used, how many initial samples to ignore, and that we will sequentially refine our initial guess for a covariance matrix.
  • An output prefix, indicating where the products will be written and a prefix for their name.

To run this example, save the text above in a file called gaussian.yaml in a folder of your choice, and do

$ cobaya-run gaussian.yaml

Note

Cobaya is MPI-aware. If you have installed mpi4py (see this section), you can run

$ mpirun -n [n_processes] cobaya-run gaussian.yaml

which will converge faster!

After a few seconts, a folder named chains will be created, and inside it you will find three files:

chains
├── gaussian.input.yaml
├── gaussian.updated.yaml
└── gaussian.1.txt

The first file reproduces the same information as the input file given, here gaussian.yaml. The second contains the updated information needed to reproduce the sample, similar to the input one, but populated with the default options for the sampler, likelihood, etc. that you have used.

The third file, ending in .txt, contains the MCMC sample, and its first lines should look like

# weight  minuslogpost         a         b  derived_a  derived_b  minuslogprior  minuslogprior__0      chi2  chi2__gaussian
    10.0      4.232834  0.705346 -0.314669   1.598046  -1.356208       2.221210          2.221210  4.023248        4.023248
     2.0      4.829217 -0.121871  0.693151  -1.017847   2.041657       2.411930          2.411930  4.834574        4.834574

You can use GetDist to analyse the results of this sample: get marginalized statistics, convergence diagnostics and some plots. We recommend using the graphical user interface. Simply run GetDistGUI.py from anywhere, press the green + button, navigate in the pop-up window into the folder containing the chains (here chains) and click choose. Now you can get some result statistics from the Data menu, or generate some plots like this one (just mark the the options in the red boxes and hit Make plot):

_images/example_quickstart_getdistgui.png

Note

You can add an option label: non-latex $latex$ to your info, and it will be used as legend label when plotting multiple samples.

Note

For a detailed user manual and many more examples, check out the GetDist documentation!

From a Python interpreter

You use cobaya interactively within a Python interpreter or a Jupyter notebook. This will allow you to create input and process products programatically, making it easier to streamline a complicated analyses.

The actual input information of cobaya are Python dictionaries (a yaml file is just a representation of it). We can easily define the same information above as a dictionary:

from collections import OrderedDict as odict

info = {
    "likelihood": {
        "gaussian_mixture": {
            "means": [0.2, 0],
            "covs": [[0.1, 0.05],
                     [0.05, 0.2]],
            "derived": True}},
    "params": odict([
        ("a", {
            "prior": {"min": -0.5, "max": 3},
            "latex": r"\alpha"}),
        ("b", {
            "prior": {"dist": "norm", "loc": 0, "scale": 1},
            "ref": 0,
            "proposal": 0.5,
            "latex": r"\beta"}),
        ("derived_a", {
            "latex": r"\alpha^\prime"}),
        ("derived_b", {
            "latex": r"\beta^\prime"})]),
    "sampler": {
        "mcmc": None}}

The code above may look more complicated than the corresponding yaml one, but in exchange it is much more flexible, allowing you to quick modify and swap different parts of it.

Note

Most often the params block is defined using an OrderedDict, instead of a normal Python dictionary. Using an OrderedDict instead of a normal dictionary is optional but highly recommended: it keeps the order of the parameters consistent between input and output.

Same goes for the likelihoods, when one can use an OrderedDict to preserve the order of the likelihoods.

Notice that here we suppress the creation of the chain files by not including the field output, since this is a very basic example. The chains will thus only be loaded in memory.

Alternatively, we can load the input from a yaml file like the one above:

from cobaya.yaml import yaml_load_file

info_from_yaml = yaml_load_file("gaussian.yaml")

And info, info_from_yaml and the file gaussian.yaml should contain the same information (except that we have chosen not to add an output prefix to info).

Now, let’s run the example.

from cobaya.run import run

updated_info, products = run(info)

The run function returns two variables:

  • An information dictionary updated with the defaults, equivalent to the updated yaml file produced by the shell invocation.
  • A dictionary of products, which for the mcmc sampler contains only one chain under the key sample.

Let’s now analyse the chain and get some plots, using the interactive interface to GetDist instead of the GUI used above:

# Export the results to GetDist
from getdist.mcsamples import MCSamplesFromCobaya

gd_sample = MCSamplesFromCobaya(updated_info, products["sample"])
# Analyze and plot
mean = gd_sample.getMeans()[:2]
covmat = gd_sample.getCovMat().matrix[:2, :2]
print("Mean:")
print(mean)
print("Covariance matrix:")
print(covmat)
# %matplotlib inline  # uncomment if running from the Jupyter notebook
import getdist.plots as gdplt

gdplot = gdplt.getSubplotPlotter()
gdplot.triangle_plot(gd_sample, ["a", "b"], filled=True)

Output:

Mean:
[ 0.19495375 -0.02323249]
Covariance matrix:
[[ 0.0907666   0.0332858 ]
 [ 0.0332858   0.16461655]]
_images/example_quickstart_plot.png

Alternatively, if we had chosen to write the output as in the shell case by adding an output prefix, we could have loaded the chain in GetDist from the hard drive:

# Export the results to GetDist
from getdist.mcsamples import loadMCSamples
# Notice loadMCSamples requires a *full path*
import os

gd_sample = loadMCSamples(os.path.abspath(info["output"]))
# Analyze and plot
# [Exactly the same here...]

If we are only interested in plotting, we do not even need to generate a GetDist MCSamples object: we can ask the plotter to load all chains in the given folder, and then just name the corresponding one when plotting:

# Just plotting (loading on-the-fly)
import getdist.plots as gdplt
import os

folder, name = os.path.split(os.path.abspath(info["output"]))
gdplot = gdplt.getSubplotPlotter(chain_dir=folder)
gdplot.triangle_plot(name, ["a", "b"], filled=True)