Output

A cobaya run always produces an updated information dictionary (interactive call) or file (shell call), plus the products generated by the sampler used.

Interactive call

The updated information and products mentioned above are returned by the run function of the cobaya.run module, which performs the sampling process.

from cobaya.run import run
updated_info, sampler = run(your_input)

sampler here is the sampler instance that just ran, e.g. the mcmc sampler. The results of the sampler can be obtained as sampler.products(), which returns a dictionary whose contents depend on the sampler used, e.g. one chain for the mcmc sampler.

If the input information contains a non-null output, products are written to the hard drive too, as described below.

Shell call

When called from the shell, cobaya most commonly generates the following output files:

  • [prefix].input.yaml: a file with the same content as the input file.
  • [prefix].updated.yaml: a file containing the input information plus the default values used by each component.
  • [prefix].[number].txt: one or more sample files, containing one sample per line, with values separated by spaces. The first line specifies the columns.

Note

Some samplers produce additional output, e.g.

  • MCMC produces an additional [prefix].progress file monitoring the convergence of the chain, which can be inspected or plotted.
  • PolyChord produces native output, which is translated into cobaya’s output format with the usual file names, but also kept under a sub-folder within the output folder.

To specify the folder where the output files will be written and their name, use the option output at the top-level of the input file (i.e. not inside any block, see the example input in the Quickstart example):

  • output: something: the output will be written into the current folder, and all output file names will start with something.
  • output: somefolder/something: similar to the last case, but writes into the folder somefolder, which is created at that point if necessary.
  • output: somefolder/: writes into the folder somefolder, which is created at that point if necessary, with no prefix for the file names.
  • output: null: will produce no output files whatsoever – the products will be just loaded in memory. Use only when invoking from the Python interpreter.
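The first three cases above amount to splitting the output value into a folder and a file name prefix. A minimal sketch of that split follows; split_output_prefix is a hypothetical helper written for illustration, not cobaya's own implementation, and it assumes that a missing folder component means the current folder.

```python
import posixpath


def split_output_prefix(output):
    """Hypothetical helper: split an ``output`` value into (folder, file prefix)."""
    # Accept Windows-style separators too, normalising them to "/".
    normalised = output.replace("\\", "/")
    folder, prefix = posixpath.split(normalised)
    # Assumption: no folder component means "write into the current folder".
    return folder or ".", prefix


for value in ("something", "somefolder/something", "somefolder/"):
    print(value, "->", split_output_prefix(value))
```

A trailing slash yields an empty file name prefix, which corresponds to the bare-folder case.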

If calling cobaya-run from the command line, you can also specify the output prefix with an --output [something] flag (it takes precedence over the output defined inside the yaml file, if it exists).

Note

When calling from the command line, if output has not been specified, it defaults to the first case, using as a prefix the name of the input file without the yaml extension.

By contrast, when calling from a Python interpreter, an unspecified output is understood as output: null.

In all cases, the output folder is based on the invocation folder if cobaya is called from the command line, or the current working directory (i.e. the output of import os; os.getcwd()) if invoked within a Python script or a Jupyter notebook.

Warning

If cobaya output files already exist with the given prefix, cobaya will raise an error, unless you explicitly request to resume or overwrite the existing sample (see Resuming or overwriting an existing run).

Note

When the output is written into a folder different from the invocation one, the value of output in the output .yaml file(s) is updated so that it drops the mention of that folder.

Sample files or SampleCollection instances

Samples are stored in files (if text output requested) or SampleCollection instances (in interactive mode). A typical sample file will look like the one presented in the quickstart example:

# weight  minuslogpost         a         b  derived_a  derived_b  minuslogprior  minuslogprior__0      chi2  chi2__gaussian
    10.0      4.232834  0.705346 -0.314669   1.598046  -1.356208       2.221210          2.221210  4.023248        4.023248
     2.0      4.829217 -0.121871  0.693151  -1.017847   2.041657       2.411930          2.411930  4.834574        4.834574

Both sample files and collections contain the following columns, in this order:

  • weight: the relative weight of the sample.
  • minuslogpost: minus the log-posterior, unnormalized.
  • a, b...: sampled parameter values for each sample.
  • derived_a, derived_b: derived parameter values for each sample. They appear after the sampled ones, but cannot be distinguished from them by name (they just happen to start with derived_ in this particular example, but can have any name).
  • minuslogprior: minus the log-prior (unnormalized if external priors have been defined), sum of the individual log-priors.
  • minuslogprior__[...]: individual priors; the first of which, named 0, corresponds to the separable product of 1-dimensional priors defined in the params block, and the rest to external priors, if they exist.
  • chi2: total effective \(\chi^2\), equals twice minus the total log-likelihood.
  • chi2__[...]: individual effective \(\chi^2\)’s, adding up to the total one.
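The relations between these columns can be checked directly on the quickstart sample shown earlier. The sketch below parses that text with the standard library only and verifies that minuslogprior and chi2 are the sums of their individual components, and that minuslogpost equals minuslogprior plus half of chi2 (which follows from chi2 being twice minus the log-likelihood). The variable names are illustrative, and the tolerance reflects the six decimals printed in the file.

```python
sample_text = """\
# weight  minuslogpost         a         b  derived_a  derived_b  minuslogprior  minuslogprior__0      chi2  chi2__gaussian
    10.0      4.232834  0.705346 -0.314669   1.598046  -1.356208       2.221210          2.221210  4.023248        4.023248
     2.0      4.829217 -0.121871  0.693151  -1.017847   2.041657       2.411930          2.411930  4.834574        4.834574
"""

lines = sample_text.strip().splitlines()
# The first line names the columns; drop the leading "#".
columns = lines[0].lstrip("#").split()
rows = [dict(zip(columns, map(float, line.split()))) for line in lines[1:]]

tolerance = 1e-5
for row in rows:
    prior_parts = sum(v for k, v in row.items() if k.startswith("minuslogprior__"))
    chi2_parts = sum(v for k, v in row.items() if k.startswith("chi2__"))
    # Totals are the sums of the individual contributions.
    assert abs(prior_parts - row["minuslogprior"]) < tolerance
    assert abs(chi2_parts - row["chi2"]) < tolerance
    # minus log-posterior = minus log-prior + chi2 / 2
    assert abs(row["minuslogprior"] + 0.5 * row["chi2"] - row["minuslogpost"]) < tolerance

print(columns)
```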

output module documentation

Synopsis: Generic output class and output drivers
Author: Jesus Torrado
output.split_prefix(prefix)

Splits an output prefix into folder and file name prefix.

If on Windows, allows for unix-like input.

output.get_info_path(folder, prefix, infix=None, kind='updated', ext='.yaml')

Gets path to info files saved by Output.

output.get_output(*args, **kwargs) → output.Output

Auxiliary function to retrieve the output driver (e.g. whether to get the MPI-wrapped one, or a dummy output driver).

class output.Output(prefix, resume=False, force=False, infix=None, output_prefix=None)

Basic output driver. It takes care of creating the output files, checking compatibility with old runs when resuming, cleaning up when forcing, preparing SampleCollection files, etc.

is_prefix_folder()

Returns True if the output prefix is a bare folder, e.g. chains/.

separator_if_needed(separator)

Returns the given separator if there is an actual file name prefix (i.e. the output prefix is not a bare folder), or an empty string otherwise.

Useful to add custom suffixes to output prefixes (may want to use Output.add_suffix for that).

sanitize_collection_extension(extension)

Returns the extension without the leading dot, if given, or the default one Output.ext otherwise.

add_suffix(suffix, separator='_')

Returns the full output prefix (folder and file name prefix) combined with a given suffix, inserting a given separator in between (default: _) if needed.
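The "if needed" rule can be illustrated with a short sketch. It assumes, for illustration only, that a bare-folder prefix is one ending in a slash; add_suffix_sketch is a hypothetical stand-in and not the method's actual code.

```python
def add_suffix_sketch(output_prefix, suffix, separator="_"):
    """Hypothetical stand-in: append ``suffix`` to an output prefix."""
    # Assumption: a prefix ending in "/" is a bare folder with no file name prefix,
    # so no separator is needed before the suffix.
    is_bare_folder = output_prefix.endswith("/")
    return output_prefix + ("" if is_bare_folder else separator) + suffix


print(add_suffix_sketch("chains/run", "post"))
print(add_suffix_sketch("chains/", "post"))
```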

create_folder(folder)

Creates the given folder (MPI-aware).

updated_prefix()

Returns the updated output prefix with the folder dropped, so that it is relative to the chain’s location.

check_and_dump_info(input_info, updated_info, check_compatible=True, cache_old=False, use_cache_old=False, ignore_blocks=())

Saves the info in the chain folder twice:
  • the input info.
  • idem, populated with the components’ defaults.

If resuming a sample, checks first that old and new infos and versions are consistent.

delete_with_regexp(regexp, root=None)

Deletes all files compatible with the given regexp.

If regexp is None and root is defined, deletes the root folder.

delete_file_or_folder(filename)

Deletes a file or a folder. Fails silently.

prepare_collection(name=None, extension=None)

Generates a file name for the collection, as [folder]/[prefix].[name].[extension].

Notice that name=None generates a date, but name="" removes the name field, making it simply [folder]/[prefix].[extension].

collection_regexp(name=None, extension=None)

Returns a regexp for collections compatible with these output settings.

Use name for particular types of collections (default: any number). Pass False to mean there is nothing between the output prefix and the extension.
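As a rough illustration of the three cases for name, the sketch below builds a pattern for file names of the form [prefix].[number].txt. The function collection_pattern and its defaults are hypothetical, it handles only the file name (no folder), and it assumes a non-empty prefix; the regexp actually produced by the library may differ.

```python
import re


def collection_pattern(prefix, name=None, extension="txt"):
    """Hypothetical sketch of a regexp matching collection file names."""
    if name is None:
        # Default: any number between the prefix and the extension.
        middle = r"\.\d+"
    elif name is False:
        # Nothing between the prefix and the extension.
        middle = ""
    else:
        middle = r"\." + re.escape(str(name))
    return re.compile(
        "^" + re.escape(prefix) + middle + r"\." + re.escape(extension) + "$"
    )


pattern = collection_pattern("run")
print(bool(pattern.match("run.1.txt")), bool(pattern.match("run.updated.yaml")))
```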

is_collection_file_name(file_name, name=None, extension=None)

Checks whether file_name is a collection compatible with this Output instance.

Use name for particular types of collections (default: any number). Pass False to mean there is nothing between the output prefix and the extension.

find_collections(name=None, extension=None)

Returns all collection files found which are compatible with this Output instance, including their path in their name.

Use name for particular types of collections (default: matches any number). Pass False to mean there is nothing between the output prefix and the extension.

load_collections(model, skip=0, thin=1, concatenate=False, name=None, extension=None)

Loads all collection files found which are compatible with this Output instance, including their path in their name.

Use name for particular types of collections (default: any number). Pass False to mean there is nothing between the output prefix and the extension.

class output.OutputDummy(*args, **kwargs)

Dummy output class. Does nothing. Evaluates to False in a boolean context.

collection module documentation

Synopsis: Classes to store the Monte Carlo samples and single points.
Author: Jesus Torrado and Antony Lewis
class collection.SampleCollection(model, output=None, cache_size=200, name=None, extension=None, file_name=None, resuming=False, load=False, onload_skip=0, onload_thin=1)

Holds a collection of samples, stored internally into a pandas.DataFrame.

The DataFrame itself is accessible as the SampleCollection.data property, but slicing can be done on the SampleCollection itself (returns a copy, not a view).

Note for developers: when expanding this class or inheriting from it, always access the underlying DataFrame as self.data and not self._data, to ensure the cache has been dumped. If you really need to access the actual attribute self._data in a method, make sure to decorate it with @ensure_cache_dumped.

reset()

Create/reset the DataFrame.

add(values, derived=None, weight=1, logpost=None, logpriors=None, loglikes=None)

Adds a point to the collection. If logpost not given, it is obtained as the sum of logpriors and loglikes (both optional otherwise).
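The fallback for logpost can be sketched as follows. resolve_logpost is a hypothetical function that only mirrors the rule stated above; raising an error when neither form is available is an assumption of this sketch.

```python
def resolve_logpost(logpost=None, logpriors=None, loglikes=None):
    """Hypothetical sketch: return logpost, or the sum of log-priors and log-likelihoods."""
    if logpost is not None:
        return logpost
    if logpriors is None or loglikes is None:
        # Assumption: without logpost, both lists are required.
        raise ValueError("Give either logpost, or both logpriors and loglikes.")
    return sum(logpriors) + sum(loglikes)


# Values taken from the first row of the quickstart sample (signs flipped,
# since the file stores minus log-prior and chi2 = -2 * log-likelihood).
value = resolve_logpost(logpriors=[-2.221210], loglikes=[-0.5 * 4.023248])
print(value)
```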

append(collection)

Append another collection. Internal method: does not check for consistency!

copy() → collection.SampleCollection

Returns a copy of the collection.

mean(first=None, last=None, derived=False, pweight=False)

Returns the (weighted) mean of the parameters in the chain, between first (default 0) and last (default last obtained), optionally including derived parameters if derived=True (default False).

If pweight=True (default False) weights every point with its probability. The estimate of the mean in this case is unstable; use carefully.
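For reference, the weighted mean in the default case (pweight=False) is the sum of weight times value divided by the total weight. The sketch below computes it in plain Python for the two quickstart samples; weighted_mean is an illustrative function, and treating first and last as ordinary slice bounds is an assumption of the sketch.

```python
samples = [
    {"weight": 10.0, "a": 0.705346, "b": -0.314669},
    {"weight": 2.0, "a": -0.121871, "b": 0.693151},
]


def weighted_mean(rows, names, first=None, last=None):
    """Illustrative weighted mean of the given parameters over rows[first:last]."""
    selected = rows[first:last]
    total_weight = sum(row["weight"] for row in selected)
    return {
        name: sum(row["weight"] * row[name] for row in selected) / total_weight
        for name in names
    }


means = weighted_mean(samples, ["a", "b"])
print(means)
```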

cov(first=None, last=None, derived=False, pweight=False)

Returns the (weighted) covariance matrix of the parameters in the chain, between first (default 0) and last (default last obtained), optionally including derived parameters if derived=True (default False).

If pweight=True (default False) weights every point with its probability. The estimate of the covariance matrix in this case is unstable; use carefully.

bestfit()

Best fit (maximum likelihood) sample. Returns a copy.

MAP()

Maximum-a-posteriori (MAP) sample. Returns a copy.
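The difference between the two can be seen on a toy table. In terms of the columns described earlier, the best fit is the row with the lowest chi2 and the MAP is the row with the lowest minuslogpost. The numbers below are made up for illustration, and the helper names are hypothetical.

```python
# Made-up rows in which the best fit and the MAP are different samples.
toy_rows = [
    {"a": 0.1, "minuslogpost": 3.0, "chi2": 5.0},
    {"a": 0.4, "minuslogpost": 3.5, "chi2": 4.0},
]


def bestfit_row(rows):
    """Row with maximum likelihood, i.e. minimum chi2 (returned as a copy)."""
    return dict(min(rows, key=lambda row: row["chi2"]))


def map_row(rows):
    """Row with maximum posterior, i.e. minimum minuslogpost (returned as a copy)."""
    return dict(min(rows, key=lambda row: row["minuslogpost"]))


print(bestfit_row(toy_rows)["a"], map_row(toy_rows)["a"])
```

The two rows differ here because the prior penalises the second sample more than its better likelihood compensates.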

sampled_to_getdist_mcsamples(first=None, last=None)

Basic interface with getdist – internal use only! (For analysis and plotting use getdist.mcsamples.MCSamplesFromCobaya.)

class collection.OnePoint(model, output=None, cache_size=200, name=None, extension=None, file_name=None, resuming=False, load=False, onload_skip=0, onload_thin=1)

Wrapper of SampleCollection to hold a single point, e.g. the best-fit point of a minimization run (not used by default MCMC).

class collection.OneSamplePoint(model, output_thin=1)

Wrapper to hold a single point, e.g. the current point of an MCMC. Alternative to OnePoint, faster but with less functionality.