Output
The results of a cobaya run are in all cases an updated information dictionary (interactive call) or file (shell call), plus the products generated by the sampler used.
Interactive call
The updated information and products mentioned above are returned by the run function of the cobaya.run module, which performs the sampling process.
from cobaya import run
updated_info, sampler = run(your_input)
sampler here is the sampler instance that just ran, e.g. the mcmc sampler. The results of the sampler can be obtained as sampler.products(), which returns a dictionary whose contents depend on the sampler used, e.g. one chain for the mcmc sampler.
If the input information contains a non-null output, products are written to the hard drive too, as described below.
Shell call
When called from the shell, cobaya most commonly generates the following output files:
- [prefix].input.yaml: a file with the same content as the input file.
- [prefix].updated.yaml: a file containing the input information plus the default values used by each component.
- [prefix].[number].txt: one or more sample files, containing one sample per line, with values separated by spaces. The first line specifies the columns.
Note
Some samplers produce additional output, e.g.
To specify the folder where the output files will be written and their name, use the option output at the top-level of the input file (i.e. not inside any block, see the example input in the Quickstart example):
- output: something: the output will be written into the current folder, and all output file names will start with something.
- output: somefolder/something: similar to the last case, but writes into the folder somefolder, which is created at that point if necessary.
- output: somefolder/: writes into the folder somefolder, which is created at that point if necessary, with no prefix for the file names.
- output: null: will produce no output files whatsoever – the products will be just loaded in memory. Use only when invoking from the Python interpreter.
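As a rough illustration of the cases above, the following sketch splits an output value into a folder and a file-name prefix using only the standard library. The helper name split_output_prefix is hypothetical and this is not cobaya's own code (the library provides output.split_prefix for this purpose).

```python
import os


def split_output_prefix(prefix):
    """Split an output value into (folder, file-name prefix).

    Hypothetical helper, for illustration only.
    """
    # A bare name such as "something" has no folder part: use the current one.
    folder = os.path.dirname(prefix) or "."
    # A trailing slash ("somefolder/") leaves an empty file-name prefix.
    file_prefix = os.path.basename(prefix)
    return folder, file_prefix


for value in ("something", "somefolder/something", "somefolder/"):
    print(value, "->", split_output_prefix(value))
```

The output: null case is not covered here, since it means that no files are written at all.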
If calling cobaya-run from the command line, you can also specify the output prefix with an --output [something] flag (it takes precedence over the output defined inside the yaml file, if it exists).
Note
When calling from the command line, if output has not been specified, it defaults to the first case, using as a prefix the name of the input file without the yaml extension.
Instead, when calling from a Python interpreter, if output has not been specified, it is understood as output: null.
In all cases, the output folder is based on the invocation folder if cobaya is called from the command line, or the current working directory (i.e. the output of import os; os.getcwd()) if invoked within a Python script or a Jupyter notebook.
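The precedence rules described above can be summarised in a small sketch. The function resolve_output and its argument names are hypothetical, chosen for illustration only; they are not part of cobaya. Note that this simplified version cannot tell an explicit output: null apart from a missing output option.

```python
import os


def resolve_output(cli_output=None, yaml_output=None, input_file=None,
                   from_shell=True):
    """Return the effective output prefix, or None for no file output.

    Hypothetical helper; all argument names are illustrative.
    """
    # The --output flag takes precedence over the value in the yaml file.
    if cli_output is not None:
        return cli_output
    if yaml_output is not None:
        return yaml_output
    # Nothing specified: a shell call falls back to the input file name
    # without its extension; a Python call behaves like output: null.
    if from_shell and input_file is not None:
        return os.path.splitext(os.path.basename(input_file))[0]
    return None


print(resolve_output(cli_output="chains/test", yaml_output="run"))
print(resolve_output(input_file="inputs/run.yaml"))
print(resolve_output(from_shell=False))
```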
Warning
If cobaya output files already exist with the given prefix, it will raise an error, unless you explicitly request to resume or overwrite the existing sample (see Resuming or overwriting an existing run).
Note
When the output is written into a folder different from the invocation one, the value of output in the output .yaml file(s) is updated so that it drops the mention of that folder.
Sample files or SampleCollection instances
Samples are stored in files (if text output requested) or SampleCollection instances (in interactive mode). A typical sample file will look like the one presented in the quickstart example:
# weight minuslogpost a b derived_a derived_b minuslogprior minuslogprior__0 chi2 chi2__gaussian
10.0 4.232834 0.705346 -0.314669 1.598046 -1.356208 2.221210 2.221210 4.023248 4.023248
2.0 4.829217 -0.121871 0.693151 -1.017847 2.041657 2.411930 2.411930 4.834574 4.834574
Both sample files and collections contain the following columns, in this order:
- weight: the relative weight of the sample.
- minuslogpost: minus the log-posterior, unnormalized.
- a, b...: sampled parameter values for each sample.
- derived_a, derived_b: derived parameter values for each sample. They appear after the sampled ones, but cannot be distinguished from them by name (they just happen to start with derived_ in this particular example, but can have any name).
- minuslogprior: minus the log-prior (unnormalized if external priors have been defined), sum of the individual log-priors.
- minuslogprior__[...]: individual priors; the first of which, named 0, corresponds to the separable product of 1-dimensional priors defined in the params block, and the rest to external priors, if they exist.
- chi2: total effective \(\chi^2\), equals twice minus the total log-likelihood.
- chi2__[...]: individual effective \(\chi^2\)’s, adding up to the total one.
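To make the column layout concrete, here is a minimal sketch that parses the example file shown above with the standard library only and checks the relations between columns: individual priors and \(\chi^2\)'s add up to their totals and, for an untempered sample, minuslogpost equals minuslogprior plus half of chi2. The function names parse_samples and is_consistent are hypothetical; cobaya's own loader is cobaya.output.load_samples().

```python
SAMPLE_TEXT = """\
# weight minuslogpost a b derived_a derived_b minuslogprior minuslogprior__0 chi2 chi2__gaussian
10.0 4.232834 0.705346 -0.314669 1.598046 -1.356208 2.221210 2.221210 4.023248 4.023248
2.0 4.829217 -0.121871 0.693151 -1.017847 2.041657 2.411930 2.411930 4.834574 4.834574
"""


def parse_samples(text):
    """Parse sample-file text into a list of {column: value} dicts."""
    lines = text.strip().splitlines()
    # The first line names the columns, after a leading "#".
    columns = lines[0].lstrip("#").split()
    return [dict(zip(columns, map(float, line.split()))) for line in lines[1:]]


def is_consistent(row, tol=1e-5):
    """Check that the individual terms add up to the totals in one row."""
    priors = sum(v for k, v in row.items() if k.startswith("minuslogprior__"))
    chi2s = sum(v for k, v in row.items() if k.startswith("chi2__"))
    return (
        abs(priors - row["minuslogprior"]) < tol
        and abs(chi2s - row["chi2"]) < tol
        # -log(posterior) = -log(prior) + chi2 / 2, up to rounding.
        and abs(row["minuslogprior"] + row["chi2"] / 2 - row["minuslogpost"]) < tol
    )


rows = parse_samples(SAMPLE_TEXT)
print([is_consistent(row) for row in rows])
```

The tolerance accounts for the six decimal places printed in the example file.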
output module documentation
- Synopsis:
Generic output class and output drivers
- Author:
Jesus Torrado
- output.split_prefix(prefix)
Splits an output prefix into folder and file name prefix.
If on Windows, allows for unix-like input.
- output.get_info_path(folder, prefix, infix=None, kind='updated', ext='.yaml')
Gets path to info files saved by Output.
- output.get_output(*args, **kwargs)
Auxiliary function to retrieve the output driver (e.g. whether to get the MPI-wrapped one, or a dummy output driver).
- Return type:
- class output.OutputReadOnly(prefix, infix=None)
A read-only output driver: it tracks naming of, and can load input and collection files. Contrary to output.Output, this class is not MPI-aware, which makes it useful to be able to do these operations within isolated MPI processes.
- is_prefix_folder()
Returns True if the output prefix is a bare folder, e.g. chains/.
- updated_prefix()
Updated path: drops folder: now it’s relative to the chain’s location.
- separator_if_needed(separator)
Returns the given separator if there is an actual file name prefix (i.e. the output prefix is not a bare folder), or an empty string otherwise.
Useful to add custom suffixes to output prefixes (may want to use Output.add_suffix for that).
- sanitize_collection_extension(extension)
Returns the extension without the leading dot, if given, or the default one Output.ext otherwise.
- add_suffix(suffix, separator='_')
Returns the full output prefix (folder and file name prefix) combined with a given suffix, inserting a given separator in between (default: _) if needed.
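The behaviour described for separator_if_needed and add_suffix can be sketched as a stand-alone function. This is a hypothetical re-implementation written from the description above, not the method of OutputReadOnly itself, and it takes the output prefix as an explicit argument.

```python
import os


def add_suffix(output_prefix, suffix, separator="_"):
    """Append a suffix to an output prefix (hypothetical stand-alone version)."""
    # A bare folder such as "chains/" has no file-name prefix, so no
    # separator is inserted before the suffix.
    has_file_prefix = bool(os.path.basename(output_prefix))
    return output_prefix + (separator if has_file_prefix else "") + suffix


print(add_suffix("chains/run", "post"))
print(add_suffix("chains/", "post"))
```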
- get_updated_info(use_cache=False, cache=False)
Returns the version of the input file updated with defaults, loading it if not previously cached, or if forced by use_cache=False.
If loading is forced and cache=True, the loaded input will be cached for future calls.
- Return type:
InputDict|None
- reload_updated_info(cache=False)
Reloads and returns the version of the input file updated with defaults.
If none is found, returns None without raising an error.
If cache=True, the loaded input will be cached for future calls.
- Return type:
InputDict|None
- prepare_collection(name=None, extension=None)
Generates a file name for the collection, as [folder]/[prefix].[name].[extension].
Notice that name=None generates a date, but name="" removes the name field, making it simply [folder]/[prefix].[extension].
- collection_regexp(name=None, extension=None)
Returns a regexp for collections compatible with these output settings.
Use name for particular types of collections (default: any number). Pass False to mean there is nothing between the output prefix and the extension.
- is_collection_file_name(file_name, name=None, extension=None)
Check if a file_name is a collection compatible with this Output instance.
Use name for particular types of collections (default: any number). Pass False to mean there is nothing between the output prefix and the extension.
- find_collections(name=None, extension=None)
Returns all collection files found which are compatible with this Output instance, including their path in their name.
Use name for particular types of collections (default: matches any number). Pass False to mean there is nothing between the output prefix and the extension.
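The meaning of the name argument in the three methods above can be illustrated with a simplified pattern builder. The function collection_pattern is hypothetical, the txt default extension is taken from the [prefix].[number].txt naming described earlier, and the regexp that cobaya actually builds may differ.

```python
import re


def collection_pattern(prefix, name=None, extension="txt"):
    """Build a regexp for sample file names (hypothetical, for illustration)."""
    if name is None:
        middle = r"\.\d+"  # default: any number, e.g. "run.1.txt"
    elif name is False:
        middle = ""  # nothing between prefix and extension: "run.txt"
    else:
        middle = r"\." + re.escape(str(name))
    return re.compile(
        re.escape(prefix) + middle + r"\." + re.escape(extension) + "$"
    )


files = ["run.1.txt", "run.2.txt", "run.updated.yaml", "run.txt", "other.1.txt"]
numbered = [f for f in files if collection_pattern("run").match(f)]
bare = [f for f in files if collection_pattern("run", name=False).match(f)]
print(numbered, bare)
```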
- load_collections(model, skip=0, thin=1, combined=False, name=None, extension=None, check_logp_sums=True)
Loads all collection files found which are compatible with this Output instance, including their path in their name.
Use name for particular types of collections (default: any number). Pass False to mean there is nothing between the output prefix and the extension.
If check_logp_sums=False, allows for samples to have individual chi2’s and logpriors that do not add up to the total ones, or are undefined.
Notes
Unless you know what you are doing, use the cobaya.output.load_samples() function instead to load samples.
- class output.Output(prefix, resume=False, force=False, infix=None)
Basic output driver. It takes care of creating the output files, checking compatibility with old runs when resuming, cleaning up when forcing, preparing SampleCollection files, etc.
- create_folder(folder)
Creates the given folder (MPI-aware).
- reload_updated_info(cache=False)
Reloads and returns the version of the input file updated with defaults.
If none is found, returns None without raising an error.
If cache=True, the loaded input will be cached for future calls.
- Return type:
InputDict|None
- check_and_dump_info(input_info, updated_info, check_compatible=True, cache_old=False, use_cache_old=False, ignore_blocks=())
Saves the info in the chain folder twice: first the input info, and then the same info populated with the components’ defaults.
If resuming a sample, checks first that old and new infos and versions are consistent unless allow_changes is True.
- load_collections(model, skip=0, thin=1, combined=False, name=None, extension=None, concatenate=None)
Loads all collection files found which are compatible with this Output instance, including their path in their name.
Use name for particular types of collections (default: any number). Pass False to mean there is nothing between the output prefix and the extension.
Notes
Unless you know what you are doing, use the cobaya.output.load_samples() function instead to load samples.
- delete_with_regexp(regexp, root=None)
Deletes all files compatible with the given regexp.
If regexp is None and root is defined, deletes the root folder.
- delete_file_or_folder(filename)
Deletes a file or a folder. Fails silently.
- class output.OutputDummy(*args, **kwargs)
Dummy output class. Does nothing. Evaluates to ‘False’ as a class.
collection module documentation
- Synopsis:
Classes to store the Monte Carlo samples and single points.
- Author:
Jesus Torrado and Antony Lewis
- class collection.SampleCollection(model, output=None, cache_size=200, name=None, extension=None, file_name=None, resuming=False, load=False, temperature=None, onload_skip=0, onload_thin=1, sample_type=None, is_batch=False, check_logp_sums=True)
Holds a collection of samples, stored internally into a pandas.DataFrame.
The DataFrame itself is accessible as the SampleCollection.data property, but slicing can be done on the SampleCollection itself (returns a copy, not a view).
When temperature is different from 1, weights and log-posterior (but not priors or likelihoods’ chi squared’s) are those of the tempered sample, obtained assuming a posterior raised to the power of 1/temperature. Functions returning statistics, e.g. cov(), will return the statistics of the original (untempered) posterior, unless indicated otherwise with a keyword argument.
If check_logp_sums=False, allows for samples to have individual chi2’s and logpriors that do not add up to the total ones, or are undefined.
Note for developers: when expanding this class or inheriting from it, always access the underlying DataFrame as self.data and not self._data, to ensure the cache has been dumped. If you really need to access the actual attribute self._data in a method, make sure to decorate it with @ensure_cache_dumped.
- reset()
Create/reset the DataFrame.
- add(values, logpost=None, logpriors=None, loglikes=None, derived=None, weight=1)
Adds a point to the collection.
logpost can be a LogPosterior, a float or None (in which case both logpriors and loglikes are required).
If the weight is not specified, it is assumed to be 1.
- property is_tempered: bool
Whether the sample was obtained by drawing from a different-temperature distribution.
- property has_int_weights: bool
Whether weights are integer.
- reset_temperature(with_batch=None)
Drops the information about sampling temperature: weight and minuslogpost columns will now correspond to those of a unit-temperature posterior sample.
If this sample is part of a batch, call this method passing the rest of the batch as a list using the argument with_batch (otherwise inconsistent weights between samples will be introduced). If additional chains are passed with with_batch, their temperature will be reset in-place.
This cannot be undone (e.g. the original integer tempered weights cannot be recovered). You may want to call this method on a copy (see SampleCollection.copy()).
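The idea behind removing the temperature can be sketched with standard importance reweighting. This is an assumption about the underlying arithmetic, not cobaya's implementation: if a sample was drawn from p**(1/T) and stores minus the log of that tempered posterior, each weight is multiplied by the ratio p / p**(1/T), whose logarithm is -(T - 1) times the stored minuslogpost. The function name detempered_weights is hypothetical.

```python
import math


def detempered_weights(weights, minuslogposts_tempered, temperature):
    """Importance-reweight a tempered sample to unit temperature.

    Hypothetical sketch; weights are returned up to a common factor.
    """
    # log(p / p**(1/T)) = (T - 1) * log(p**(1/T)) = -(T - 1) * minuslogpost_T
    log_ratios = [-(temperature - 1.0) * mlp for mlp in minuslogposts_tempered]
    # Subtract the largest log-ratio before exponentiating, for stability.
    shift = max(log_ratios)
    return [w * math.exp(lr - shift) for w, lr in zip(weights, log_ratios)]


print(detempered_weights([1.0, 4.0], [0.0, math.log(2.0)], temperature=2.0))
```

At temperature 1 the weights are returned unchanged, and the resulting weights are in general no longer integers, in line with the warning above.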
- property n_last_out
Index of the last point saved to the output.
- property data
Pandas’ DataFrame containing the sample collection.
- to_numpy(dtype=None, copy=False)
Returns the sample collection as a numpy array.
- Return type:
ndarray
- copy(empty=False)
Returns a copy of the collection.
If empty=True (default False), returns an empty copy.
- Return type:
- mean(first=None, last=None, weights=None, derived=False, tempered=False)
Returns the (weighted) mean of the parameters in the chain, between first (default 0) and last (default last obtained), optionally including derived parameters if derived=True (default False).
Custom weights can be passed with the argument weights.
If derived is True (default False), the means of the derived parameters are included in the returned vector.
If tempered=True (default False) returns the mean of the tempered posterior p**(1/temperature).
NB: For tempered samples, if passed tempered=False (default), detempered weights are computed on-the-fly. If this or any other function returning untempered statistical quantities of a tempered sample is expected to be called repeatedly, it would be more efficient to detemper the collection first with SampleCollection.reset_temperature(), and call these methods on the returned Collection.
- Return type:
ndarray
- cov(first=None, last=None, weights=None, derived=False, tempered=False)
Returns the (weighted) covariance matrix of the parameters in the chain, between first (default 0) and last (default last obtained), optionally including derived parameters if derived=True (default False).
Custom weights can be passed with the argument weights.
If derived is True (default False), the covariances of/with the derived parameters are included in the returned matrix.
If tempered=True (default False) returns the covariances of the tempered posterior p**(1/temperature).
NB: For tempered samples, if passed tempered=False (default), detempered weights are computed on-the-fly. If this or any other function returning untempered statistical quantities of a tempered sample is expected to be called repeatedly, it would be more efficient to detemper the collection first with SampleCollection.reset_temperature(), and call these methods on the returned Collection.
- Return type:
ndarray
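For reference, the weighted statistics returned by mean() and cov() can be sketched in plain Python. The helpers weighted_mean and weighted_cov are hypothetical, the input numbers are made up, and the covariance normalisation used here (dividing by the total weight) is an assumption that may differ from the one cobaya applies.

```python
def weighted_mean(samples, weights):
    """Weighted mean of each parameter (one row of `samples` per point)."""
    total = sum(weights)
    n_params = len(samples[0])
    return [
        sum(w * s[i] for s, w in zip(samples, weights)) / total
        for i in range(n_params)
    ]


def weighted_cov(samples, weights):
    """Weighted covariance matrix, normalised by the total weight (assumption)."""
    total = sum(weights)
    mean = weighted_mean(samples, weights)
    n_params = len(mean)
    return [
        [
            sum(w * (s[i] - mean[i]) * (s[j] - mean[j])
                for s, w in zip(samples, weights)) / total
            for j in range(n_params)
        ]
        for i in range(n_params)
    ]


# Made-up values: two points with two parameters each.
samples = [[0.0, 2.0], [4.0, 6.0]]
weights = [1, 3]
print(weighted_mean(samples, weights))
print(weighted_cov(samples, weights))
```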
- reweight(importance_weights, with_batch=None, check=True)
Reweights the sample in-place with the given importance_weights.
Temperature information is dropped.
If this sample is part of a batch, call this method passing the rest of the batch as a list using the argument with_batch (otherwise inconsistent weights between samples will be introduced). If additional chains are passed with with_batch, they will also be reweighted in-place. In that case, importance_weights needs to be a list of weight vectors, the first of which to be applied to the current instance, and the rest to the collections passed with with_batch.
This cannot be fully undone (e.g. recovering original integer weights). You may want to call this method on a copy (see SampleCollection.copy()).
For the sake of speed, length and positivity checks on the importance weights can be skipped with check=False (default True).
- filtered_copy(where)
Returns a copy of the collection with some condition where imposed.
- Return type:
- skip_samples(skip, inplace=False)
Skips some initial samples, or an initial fraction of them.
For collections coming from a Nested Sampler, prints a warning and does nothing.
- Parameters:
skip (float) – Specifies the amount of initial samples to be skipped, either directly if skip>1 (rounded up to next integer), or as a fraction if 0<skip<1.
inplace (bool, default: False) – If True, modifies the original collection in place; otherwise, returns a modified copy.
- Returns:
The original collection with skipped initial samples (inplace=True) or a copy of it (inplace=False).
- Return type:
- Raises:
LoggedError – If badly defined skip value.
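The two ways of interpreting skip can be sketched as follows. The helper n_initial_to_skip is hypothetical; treating skip=1 as a count and rounding the fractional case up are assumptions made here, since the description above does not specify them, and ValueError stands in for cobaya's LoggedError.

```python
import math


def n_initial_to_skip(skip, n_samples):
    """Number of initial samples to drop for a given skip value (sketch)."""
    if skip < 0:
        raise ValueError("skip must be non-negative")
    if skip < 1:
        # Fraction of the sample; rounding up is an assumption.
        return math.ceil(skip * n_samples)
    # Direct count, rounded up to the next integer, capped at the sample size.
    return min(math.ceil(skip), n_samples)


print(n_initial_to_skip(0.25, 8))
print(n_initial_to_skip(2.5, 10))
```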
- thin_samples(thin, inplace=False)
Thins the sample collection by some factor
thin>1.- Parameters:
thin (int) – Thin factor, must be >1.
inplace (bool, default: False) – If True, modifies the original collection in place; otherwise, returns a thinned copy.
- Returns:
Thinned version of the original collection (inplace=True) or a copy of it (inplace=False).
- Return type:
- Raises:
LoggedError – If badly defined thin value.
- bestfit()
Best fit (maximum likelihood) sample. Returns a copy.
- MAP()
Maximum-a-posteriori (MAP) sample. Returns a copy.
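In terms of the columns described earlier, the MAP sample is the one with the smallest minuslogpost, and the best-fit sample is the one with the smallest chi2. A minimal sketch on plain dictionaries follows; the rows are made-up values chosen so that the two points differ, and the function names are hypothetical, not the methods of SampleCollection.

```python
def map_row(rows):
    """Copy of the row with the highest posterior (lowest minuslogpost)."""
    return dict(min(rows, key=lambda row: row["minuslogpost"]))


def bestfit_row(rows):
    """Copy of the row with the highest likelihood (lowest chi2)."""
    return dict(min(rows, key=lambda row: row["chi2"]))


# Made-up rows, for illustration only.
rows = [
    {"a": 0.7, "minuslogpost": 3.0, "chi2": 5.0},
    {"a": -0.1, "minuslogpost": 3.5, "chi2": 4.0},
]
print(map_row(rows)["a"], bestfit_row(rows)["a"])
```

Both helpers return copies, so modifying the result leaves the original rows untouched.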
- to_getdist(label=None, model=None, combine_with=None)
- Parameters:
label (str, optional) – Legend label in GetDist plots (name_tag in GetDist parlance).
model (cobaya.model.Model, optional) – Model with which the sample was created. Needed only if parameter labels or aliases have changed since the collection was generated.
combine_with (list of cobaya.collection.SampleCollection, optional) – Additional collections to be added when creating a getdist object. Compatibility between the collections is assumed and not checked.
- Returns:
This collection’s equivalent getdist.MCSamples object.
- Return type:
getdist.MCSamples
- Raises:
LoggedError – Errors when processing the arguments.
- out_update()
Update the output file to the current state of the Collection.
- class collection.OnePoint(model, output=None, cache_size=200, name=None, extension=None, file_name=None, resuming=False, load=False, temperature=None, onload_skip=0, onload_thin=1, sample_type=None, is_batch=False, check_logp_sums=True)
Wrapper of SampleCollection to hold a single point, e.g. the best-fit point of a minimization run (not used by default MCMC).
- class collection.OneSamplePoint(model, temperature=1, output_thin=1)
Wrapper to hold a single point, e.g. the current point of an MCMC. Alternative to OnePoint, faster but with less functionality.
For tempered samples, stores the weight and -logp of the tempered posterior (but untempered priors and likelihoods).