Output
The results of a cobaya run are in all cases an updated information dictionary (interactive call) or file (shell call), plus the products generated by the sampler used.
Interactive call
The updated information and products mentioned above are returned by the run function of the cobaya.run module, which performs the sampling process.
from cobaya.run import run
updated_info, sampler = run(your_input)
Here, sampler is the sampler instance that just ran, e.g. the mcmc sampler. The results of the sampler can be obtained as sampler.products(), which returns a dictionary whose contents depend on the sampler used, e.g. one chain for the mcmc sampler.
If the input information contains a non-null output, products are also written to disk, as described below.
Shell call
When called from the shell, cobaya most commonly generates the following output files:
[prefix].input.yaml
: a file with the same content as the input file.
[prefix].updated.yaml
: a file containing the input information plus the default values used by each component.
[prefix].[number].txt
: one or more sample files, containing one sample per line, with values separated by spaces. The first line specifies the columns.
Note
Some samplers produce additional output, e.g.
To specify the folder where the output files will be written and their name, use the option output at the top level of the input file (i.e. not inside any block, see the example input in the Quickstart example):
output: something
: the output will be written into the current folder, and all output file names will start with something.
output: somefolder/something
: similar to the previous case, but writes into the folder somefolder, which is created at that point if necessary.
output: somefolder/
: writes into the folder somefolder, which is created at that point if necessary, with no prefix for the file names.
output: null
: produces no output files whatsoever; the products are only loaded in memory. Use only when invoking from the Python interpreter.
If calling cobaya-run from the command line, you can also specify the output prefix with an --output [something] flag (it takes precedence over the output defined inside the yaml file, if it exists).
Note
When calling from the command line, if output has not been specified, it defaults to the first case, using as a prefix the name of the input file without the yaml extension.
Instead, when calling from a Python interpreter, if output has not been specified, it is understood as output: null.
In all cases, the output folder is resolved relative to the invocation folder if cobaya is called from the command line, or to the current working directory (i.e. the output of import os; os.getcwd()) if invoked within a Python script or a Jupyter notebook.
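The precedence rules above can be summarised in a short sketch. The function resolve_output_prefix and its arguments are hypothetical names introduced for illustration; they are not part of cobaya and only encode the rules stated in this section.

```python
import os


def resolve_output_prefix(yaml_output=None, cli_output=None,
                          input_file=None, from_shell=True):
    """Hypothetical helper: return the effective output prefix, or None."""
    # The --output flag takes precedence over the value in the yaml file.
    if cli_output is not None:
        return cli_output
    if yaml_output is not None:
        return yaml_output
    # Nothing specified: a shell call falls back to the input file name
    # without its yaml extension; an interactive call means `output: null`.
    if from_shell and input_file is not None:
        return os.path.splitext(os.path.basename(input_file))[0]
    return None


print(resolve_output_prefix(input_file="runs/gauss.yaml"))
print(resolve_output_prefix(from_shell=False))
```

A returned None stands for output: null, i.e. products kept in memory only.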
Warning
If cobaya output files already exist with the given prefix, cobaya will raise an error, unless you explicitly request to resume or overwrite the existing sample (see Resuming or overwriting an existing run).
Note
When the output is written into a folder other than the invocation one, the value of output in the output .yaml file(s) is updated so that it no longer mentions that folder.
Sample files or SampleCollection instances
Samples are stored in files (if text output is requested) or SampleCollection instances (in interactive mode). A typical sample file will look like the one presented in the quickstart example:
# weight minuslogpost a b derived_a derived_b minuslogprior minuslogprior__0 chi2 chi2__gaussian
10.0 4.232834 0.705346 -0.314669 1.598046 -1.356208 2.221210 2.221210 4.023248 4.023248
2.0 4.829217 -0.121871 0.693151 -1.017847 2.041657 2.411930 2.411930 4.834574 4.834574
Both sample files and collections contain the following columns, in this order:
weight
: the relative weight of the sample.
minuslogpost
: minus the log-posterior, unnormalized.
a, b...
: sampled parameter values for each sample.
derived_a, derived_b
: derived parameter values for each sample. They appear after the sampled ones, but cannot be distinguished from them by name (they just happen to start with derived_ in this particular example, but can have any name).
minuslogprior
: minus the log-prior (unnormalized if external priors have been defined), sum of the individual log-priors.
minuslogprior__[...]
: minus the log of each individual prior; the first of which, named 0, corresponds to the separable product of 1-dimensional priors defined in the params block, and the rest to external priors, if they exist.
chi2
: total effective \(\chi^2\), equal to minus twice the total log-likelihood.
chi2__[...]
: individual effective \(\chi^2\)’s, adding up to the total one.
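As a minimal sketch of how these columns relate to each other, the snippet below parses the example sample file shown above using only the standard library. The helper read_samples is a hypothetical name, not a cobaya function. The last check assumes that the unnormalized log-posterior is the sum of the log-prior and the log-likelihood, which follows from the column definitions.

```python
import io

SAMPLE = """\
# weight minuslogpost a b derived_a derived_b minuslogprior minuslogprior__0 chi2 chi2__gaussian
10.0 4.232834 0.705346 -0.314669 1.598046 -1.356208 2.221210 2.221210 4.023248 4.023248
2.0 4.829217 -0.121871 0.693151 -1.017847 2.041657 2.411930 2.411930 4.834574 4.834574
"""


def read_samples(stream):
    """Parse a space-separated sample file into a list of dicts."""
    # The first line names the columns; drop its leading '#'.
    columns = stream.readline().lstrip("#").split()
    return [dict(zip(columns, map(float, line.split())))
            for line in stream if line.strip()]


rows = read_samples(io.StringIO(SAMPLE))
for row in rows:
    prior_sum = sum(v for k, v in row.items() if k.startswith("minuslogprior__"))
    chi2_sum = sum(v for k, v in row.items() if k.startswith("chi2__"))
    # Individual terms add up to the totals.
    assert abs(prior_sum - row["minuslogprior"]) < 1e-6
    assert abs(chi2_sum - row["chi2"]) < 1e-6
    # -log(posterior) = -log(prior) + chi2 / 2, since chi2 = -2 log(likelihood).
    assert abs(row["minuslogprior"] + row["chi2"] / 2 - row["minuslogpost"]) < 1e-5
```

For real analyses, loading the samples through cobaya or GetDist is preferable, since they also handle parameter metadata.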
output module documentation
- Synopsis:
Generic output class and output drivers
- Author:
Jesus Torrado
- output.split_prefix(prefix)
Splits an output prefix into folder and file name prefix.
If on Windows, allows for unix-like input.
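A rough sketch of what such a split does, applied to the three forms of the output option described earlier. The function split_prefix_sketch is an illustrative stand-in written for this page, not cobaya's implementation; in particular, returning "." for a missing folder is an assumption.

```python
import os


def split_prefix_sketch(prefix):
    """Illustrative stand-in for output.split_prefix (not cobaya's code)."""
    # Accept unix-like separators on any platform.
    prefix = prefix.replace("/", os.sep)
    folder, file_prefix = os.path.split(prefix)
    # Assumption: a prefix without a folder refers to the current one.
    return folder or ".", file_prefix


for example in ("something", "somefolder/something", "somefolder/"):
    print(example, "->", split_prefix_sketch(example))
```

An empty file name prefix corresponds to the bare-folder case, in which output files get no prefix.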
- output.get_info_path(folder, prefix, infix=None, kind='updated', ext='.yaml')
Gets path to info files saved by Output.
- output.get_output(*args, **kwargs)
Auxiliary function to retrieve the output driver (e.g. whether to get the MPI-wrapped one, or a dummy output driver).
- Return type:
- class output.Output(prefix, resume=False, force=False, infix=None)
Basic output driver. It takes care of creating the output files, checking compatibility with old runs when resuming, cleaning up when forcing, preparing SampleCollection files, etc.
- is_prefix_folder()
Returns True if the output prefix is a bare folder, e.g. chains/.
- separator_if_needed(separator)
Returns the given separator if there is an actual file name prefix (i.e. the output prefix is not a bare folder), or an empty string otherwise.
Useful to add custom suffixes to output prefixes (may want to use Output.add_suffix for that).
- sanitize_collection_extension(extension)
Returns the extension without the leading dot, if given, or the default one Output.ext otherwise.
- add_suffix(suffix, separator='_')
Returns the full output prefix (folder and file name prefix) combined with a given suffix, inserting a given separator in between (default: _) if needed.
- create_folder(folder)
Creates the given folder (MPI-aware).
- updated_prefix()
Updated path: drops folder: now it’s relative to the chain’s location.
- check_and_dump_info(input_info, updated_info, check_compatible=True, cache_old=False, use_cache_old=False, ignore_blocks=())
Saves the info in the chain folder twice:
  - the input info.
  - idem, populated with the components’ defaults.
If resuming a sample, checks first that old and new infos and versions are consistent.
- delete_with_regexp(regexp, root=None)
Deletes all files compatible with the given regexp.
If regexp is None and root is defined, deletes the root folder.
- delete_file_or_folder(filename)
Deletes a file or a folder. Fails silently.
- prepare_collection(name=None, extension=None)
Generates a file name for the collection, as [folder]/[prefix].[name].[extension].
Notice that name=None generates a date, but name="" removes the name field, making it simply [folder]/[prefix].[extension].
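The naming rule can be sketched as follows. The function collection_file_name, the date format and the default extension are assumptions made for illustration; the source only states that a date is generated when name=None.

```python
import datetime
import os


def collection_file_name(folder, prefix, name=None, extension="txt"):
    """Hypothetical sketch of the naming rule; not cobaya's implementation."""
    if name is None:
        # Assumed date format; the source only says that a date is generated.
        name = datetime.datetime.now().strftime("%Y%m%d%H%M%S")
    # Assumption: empty fields (name="" or an empty file name prefix) are skipped.
    fields = [str(f) for f in (prefix, name, extension) if f not in ("", None)]
    return os.path.join(folder, ".".join(fields))


print(collection_file_name("chains", "run", name="1"))
print(collection_file_name("chains", "run", name=""))
```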
- collection_regexp(name=None, extension=None)
Returns a regexp for collections compatible with these output settings.
Use name for particular types of collections (default: any number). Pass False to mean there is nothing between the output prefix and the extension.
- is_collection_file_name(file_name, name=None, extension=None)
Checks whether file_name is a collection compatible with this Output instance.
Use name for particular types of collections (default: any number). Pass False to mean there is nothing between the output prefix and the extension.
- find_collections(name=None, extension=None)
Returns all collection files found which are compatible with this Output instance, including their path in their name.
Use name for particular types of collections (default: matches any number). Pass False to mean there is nothing between the output prefix and the extension.
- load_collections(model, skip=0, thin=1, concatenate=False, name=None, extension=None)
Loads all collection files found which are compatible with this Output instance, including their path in their name.
Use name for particular types of collections (default: any number). Pass False to mean there is nothing between the output prefix and the extension.
- class output.OutputDummy(*args, **kwargs)
Dummy output class. Does nothing. Evaluates to ‘False’ as a class.
collection module documentation
- Synopsis:
Classes to store the Monte Carlo samples and single points.
- Author:
Jesus Torrado and Antony Lewis
- class collection.SampleCollection(model, output=None, cache_size=200, name=None, extension=None, file_name=None, resuming=False, load=False, temperature=None, onload_skip=0, onload_thin=1, sample_type=None)
Holds a collection of samples, stored internally in a pandas.DataFrame.
The DataFrame itself is accessible as the SampleCollection.data property, but slicing can be done on the SampleCollection itself (returns a copy, not a view).
When temperature is different from 1, weights and log-posterior (but not priors or likelihoods’ chi-squared values) are those of the tempered sample, obtained assuming a posterior raised to the power of 1/temperature. Functions returning statistics, e.g. cov(), will return the statistics of the original (untempered) posterior, unless indicated otherwise with a keyword argument.
Note for developers: when expanding this class or inheriting from it, always access the underlying DataFrame as self.data and not self._data, to ensure the cache has been dumped. If you really need to access the actual attribute self._data in a method, make sure to decorate it with @ensure_cache_dumped.
- reset()
Create/reset the DataFrame.
- add(values, logpost=None, logpriors=None, loglikes=None, derived=None, weight=1)
Adds a point to the collection.
logpost can be LogPosterior, float or None (in which case, logpriors and loglikes are both required).
If the weight is not specified, it is assumed to be 1.
- property is_tempered: bool
Whether the sample was obtained by drawing from a different-temperature distribution.
- property has_int_weights: bool
Whether weights are integer.
- reset_temperature()
Drops the information about sampling temperature: weight and minuslogpost columns will now correspond to those of a unit-temperature posterior sample.
This cannot be undone (e.g. recovering original integer tempered weights). You may want to call this method on a copy (see SampleCollection.copy()).
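To illustrate what detempering involves, here is a minimal sketch based on the standard importance-reweighting argument. It is not cobaya's code: the function detempered_weights is a hypothetical name, and it assumes the stored minuslogpost values are those of the tempered posterior p**(1/temperature), as described above.

```python
import math


def detempered_weights(weights, minuslogposts_tempered, temperature):
    """Sketch: importance weights that undo tempering (not cobaya's code)."""
    # Tempered density: p_T = p**(1/T), so p / p_T = p_T**(T - 1)
    # and log(p / p_T) = -(T - 1) * minuslogpost_T.
    log_ratios = [-(temperature - 1) * mlp for mlp in minuslogposts_tempered]
    # Subtract the maximum for numerical stability; only relative weights matter.
    shift = max(log_ratios)
    return [w * math.exp(lr - shift) for w, lr in zip(weights, log_ratios)]


# With temperature 1 the weights are unchanged.
print(detempered_weights([3, 5], [1.0, 2.0], temperature=1))
# With temperature 2, points with lower posterior are down-weighted.
print(detempered_weights([1, 1], [1.0, 2.0], temperature=2))
```

The resulting weights are in general no longer integers, which is consistent with the remark above that the original integer tempered weights cannot be recovered.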
- property n_last_out
Index of the last point saved to the output.
- property data
Pandas’ DataFrame containing the sample collection.
- property values: ndarray
Returns the sample collection as a numpy array; to be deprecated in favour of Collection.to_numpy, following Pandas.
- to_numpy(dtype=None, copy=False)
Returns the sample collection as a numpy array.
- Return type:
ndarray
- copy(empty=False)
Returns a copy of the collection.
If empty=True (default False), returns an empty copy.
- Return type:
- mean(first=None, last=None, weights=None, derived=False, tempered=False)
Returns the (weighted) mean of the parameters in the chain, between first (default 0) and last (default last obtained), optionally including derived parameters if derived=True (default False).
Custom weights can be passed with the argument weights.
If derived is True (default False), the means of the derived parameters are included in the returned vector.
If tempered=True (default False), returns the mean of the tempered posterior p**(1/temperature).
NB: For tempered samples, if passed tempered=False (default), detempered weights are computed on-the-fly. If this or any other function returning untempered statistical quantities of a tempered sample is expected to be called repeatedly, it would be more efficient to detemper the collection first with SampleCollection.reset_temperature(), and call these methods on the returned Collection.
- Return type:
ndarray
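As a rough illustration of the quantity being computed, the following standard-library sketch evaluates the weighted mean of the sampled parameters a and b for the two rows of the example sample file. The function weighted_mean is a hypothetical helper, and treating first and last as Python slice bounds is an assumption, not a statement about cobaya's implementation.

```python
def weighted_mean(values, weights, first=None, last=None):
    """Weighted mean of each column over rows first..last (slice semantics assumed)."""
    rows = values[first:last]
    w = weights[first:last]
    total = sum(w)
    n_cols = len(rows[0])
    return [sum(wi * row[j] for wi, row in zip(w, rows)) / total
            for j in range(n_cols)]


# Sampled parameters (a, b) and weights from the example sample file.
values = [[0.705346, -0.314669], [-0.121871, 0.693151]]
weights = [10.0, 2.0]
mean_a, mean_b = weighted_mean(values, weights)
print(mean_a, mean_b)
```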
- cov(first=None, last=None, weights=None, derived=False, tempered=False)
Returns the (weighted) covariance matrix of the parameters in the chain, between first (default 0) and last (default last obtained), optionally including derived parameters if derived=True (default False).
Custom weights can be passed with the argument weights.
If derived is True (default False), the covariances of/with the derived parameters are included in the returned matrix.
If tempered=True (default False), returns the covariances of the tempered posterior p**(1/temperature).
NB: For tempered samples, if passed tempered=False (default), detempered weights are computed on-the-fly. If this or any other function returning untempered statistical quantities of a tempered sample is expected to be called repeatedly, it would be more efficient to detemper the collection first with SampleCollection.reset_temperature(), and call these methods on the returned Collection.
- Return type:
ndarray
- reweight(importance_weights, check=True)
Reweights the sample with the given importance_weights.
Temperature information is dropped.
This cannot be fully undone (e.g. recovering original integer weights). You may want to call this method on a copy (see SampleCollection.copy()).
For the sake of speed, length and positivity checks on the importance weights can be skipped with check=False (default True).
- filtered_copy(where)
Returns a copy of the collection with some condition where imposed.
- Return type:
- skip_samples(skip, inplace=False)
Skips some initial samples, or an initial fraction of them.
For collections coming from a Nested Sampler, prints a warning and does nothing.
- Parameters:
skip (float) – Specifies the amount of initial samples to be skipped, either directly if skip>1 (rounded up to next integer), or as a fraction if 0<skip<1.
inplace (bool, default: False) – If True, the original collection is modified and returned; if False, a copy is returned.
- Returns:
The original collection with skipped initial samples (inplace=True) or a copy of it (inplace=False).
- Return type:
- Raises:
LoggedError – If badly defined skip value.
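The two interpretations of skip can be sketched with a plain list standing in for a chain. The helper n_to_skip is hypothetical, and both the rounding used for fractions and the treatment of skip equal to exactly 1 as a count are assumptions; the description above only specifies rounding up for direct counts.

```python
import math


def n_to_skip(skip, n_samples):
    """Hypothetical helper: number of initial samples dropped for a given skip."""
    if skip < 0:
        raise ValueError("`skip` must be non-negative.")
    if 0 < skip < 1:
        # Fraction of the sample; rounding up here is an assumption.
        return math.ceil(skip * n_samples)
    # Direct count, rounded up to the next integer
    # (assumption: 0 and exactly 1 are also treated as counts).
    return math.ceil(skip)


chain = list(range(8))
# Drop the first quarter of the chain.
kept = chain[n_to_skip(0.25, len(chain)):]
print(kept)
```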
- thin_samples(thin, inplace=False)
Thins the sample collection by some factor thin>1.
- Parameters:
thin (int) – Thin factor, must be >1.
inplace (bool, default: False) – If True, the original collection is modified and returned; if False, a copy is returned.
- Returns:
Thinned version of the original collection (inplace=True) or a copy of it (inplace=False).
- Return type:
- Raises:
LoggedError – If badly defined thin value.
- bestfit()
Best fit (maximum likelihood) sample. Returns a copy.
- MAP()
Maximum-a-posteriori (MAP) sample. Returns a copy.
- to_getdist(label=None, model=None)
- Parameters:
label (str, optional) – Legend label in GetDist plots (name_tag in GetDist parlance).
model (cobaya.model.Model, optional) – Model with which the sample was created. Needed only if parameter labels or aliases have changed since the collection was generated.
- Returns:
This collection’s equivalent getdist.MCSamples object.
- Return type:
getdist.MCSamples
- Raises:
LoggedError – Errors when processing the arguments.
- out_update()
Update the output file to the current state of the Collection.
- class collection.OnePoint(model, output=None, cache_size=200, name=None, extension=None, file_name=None, resuming=False, load=False, temperature=None, onload_skip=0, onload_thin=1, sample_type=None)
Wrapper of SampleCollection to hold a single point, e.g. the best-fit point of a minimization run (not used by default MCMC).
- class collection.OneSamplePoint(model, temperature=1, output_thin=1)
Wrapper to hold a single point, e.g. the current point of an MCMC. Alternative to OnePoint, faster but with less functionality.
For tempered samples, stores the weight and -logp of the tempered posterior (but untempered priors and likelihoods).