Wiki

Clone wiki

cosmosis / postprocessing

Postprocessing

After running a most types of sampler you will want to generate some summary plots and statistics to represent the output.

For most kinds of chains these take the form of constraints - credible intervals on the values of the parameters. For a few samplers, and in particular the test sampler, we want different kinds of output.

The cosmosis postprocess program generates a collection of such plots and statistics automatically for all the samplers that cosmosis supports. But you can also make your own plots, or use the cosmosis postprocess command to generate plots from other codes.

The postprocess command

The command postprocess can be run on one or more ini files used to generate chains, or on the chain output files themselves, for example:

#!bash

    postprocess my_ini_file.ini my_chain.txt

If more than one file is included then both chains will be plotted (this doesn't currently work with the test sampler plots; sorry).

The postprocess command has a number of options, which can be explored by running:

#!bash

    postprocess --help

which will show all the options you can add:

#!bash

usage: postprocess-exe.py [-h] [--burn BURN] [--thin THIN] [-o OUTDIR]
                          [-p PREFIX] [--more-latex MORE_LATEX] [--no-latex]
                          [--blind-add] [--blind-mul] [--text] [--swap]
                          [--no-plots] [-f FILE_TYPE] [--no-smooth]
                          [--n-kde N_KDE] [--factor-kde FACTOR_KDE]
                          [--no-fill] [--extra EXTRA] [--tweaks TWEAKS]
                          [--no-image]
                          inifile [inifile ...]

Post-process cosmosis output

positional arguments:
  inifile

optional arguments:
  -h, --help            show this help message and exit

MCMC:
  Options for MCMC-type samplers

  --burn BURN           Fraction or number of samples to burn at the start
  --thin THIN           Keep every n'th sample in MCMC

General:
  General options for controlling postprocessing

  -o OUTDIR, --outdir OUTDIR
                        Output directory for all generated files
  -p PREFIX, --prefix PREFIX
                        Prefix for all generated files
  --more-latex MORE_LATEX
                        Load an additional latex file to the default
  --no-latex            Do not use latex-style labels, just use the text
  --blind-add           Blind results by adding adding a secret value to each
                        parameter
  --blind-mul           Blind results by scaling by a secret value for each
                        parameter

Inputs:
  Options controlling the inputs to this script

  --text                Tell postprocess that its argument is a text file,
                        regardless of its suffix

Plotting:
  Plotting options

  --swap                Swap the ordering of the parameters in (x,y)
  --no-plots            Do not make any default plots
  -f FILE_TYPE, --file-type FILE_TYPE
                        Filename suffix for plots; also chooses the file format.
  --no-smooth           Do not smooth grid plot joint constraints
  --n-kde N_KDE         Number of KDE smoothing points per dimension to use
                        for MCMC 2D curves. Reduce to speed up, but can make
                        plots look worse.
  --factor-kde FACTOR_KDE
                        Smoothing factor for MCMC plots. More makes plots look
                        better but can smooth out too much.
  --no-fill             Do not fill in 2D constraint plots with color
  --extra EXTRA         Load extra post-processing steps from this file.
  --tweaks TWEAKS       Load plot tweaks from this file.
  --no-image            Do not plot the image in 2D grids; just show the
                        contours

The command will generate a collection of text and image files, with filenames matching {outdir}/{prefix}_{plot_name}.

Postprocessing test runs

When run on an ini file that uses the TEST sampler, or when run on a directory, the postprocess command generates a collection of standard cosmology plots, like distances as a function of redshift, the CMB and matter power spectra, and others. The command tries to make all the plots no matter what the pipeline actually generated, but will just skip and plots it can't make (if, for example, your pipeline does not make a matter power spectrum then it will just be skipped).

There is currently a fairly limited selection of plots made automatically, and we would welcome additional ideas.

Custom test plots

If cosmosis doesn't make the kind of plot you want automatically then you have two choices: write your own script to make your plot using the outputs cosmosis saves, or adding a plot to the cosmosis system.

Making your own plotting scripts

The outputs saved but the test sampler are quite simple - they are in the folder given to the "save_dir" option in the ini file, and then within that there is one folder for each data section (category of data). Scalar numbers are then recorded in a file called "values.txt" in that directory, and 1D and 2D arrays each have their own text file.

So, for example, when the CAMB module saves cosmological distances they are written to a collection of files called distances/z.txt, distances/d_l.txt, distances/d_a.txt, etc. Your scripts, in whatever language, can load from those files (they have header lines starting with the hash symbol).

Adding a CosmoSIS Test plot

Each CosmoSIS plot is represented by a python class, a blueprint for data and behavior like a Fortran 90 "type" or C "struct" with functions attahed to it.

You create a new plot by making a new "sub-class" of the CosmoSIS test plot tool, customizing it to do the exact plot you want. This is fairly straightforward and an example is shown below, which makes a logarithmic instead of linear plot of the CMB TT power spectrum. The boilerplate, down to the line starting "super", should be the same same for any plot (though you should change the name, LogTTPlot, in the two places it appears, and the filename to be whatever you want).

You create one sub-class per plot type that you want.

#!python

# This code should be put in a new python file,
# for instance called "extra1.py"

from cosmosis.postprocessing.cosmology_theory_plots import Plot, plot_list
from cosmosis.postprocessing import lazy_pylab as pylab

class LogTTPlot(Plot):
    filename='logtt'
    def plot(self):
        super(LogTTPlot,self).plot()
        ell = self.load_file("cmb_cl", "ell")
        c_ell = self.load_file("cmb_cl", "tt")
        pylab.loglog(ell, c_ell)
        pylab.xlabel("$\ell$")
        pylab.ylabel("$\ell(\ell+1)C_\ell/2\pi$")

This uses the load_file function from its parent class Plot to get the two things (ell and C_ell) that it wants to plot, and then standard pylab (matplotlib) functions to do the plotting. Creation and saving of the figure is done by the parent class, so you don't need to do that part.

The lazy_pylab part is not strictly necessary but avoids an error message.

You can then tell postprocess to include this new plot by adding the flag "--extra extra1.py" when running it.

Postprocessing chains

Postprocessing chains is different from postprocessing single parameter runs. In this case we want to make plots and statistics that summarize the constraints that the chains show.

For grid samplers, the postprocess command makes smoothed 1D and 2D histograms by summing (marginalizing) along the other dimensions of the space.

For MCMC samplers, it makes them using Kernel Density Estimation (KDE), which boils down to drawing a small Gaussian kernel on top of each point in the chain and then summing them all together.

In both cases it also finds 68 and 95% contours numerically by looking at the cumulative density function of the likelihood, and also generates median and other statistics.

The main MCMC samplers generate samples from the posterior automatically. But some other samplers, like multinest and importance sampling, instead generate weighted sample points. CosmoSIS knows how to use the weighted points when making the histograms and constraints.

Custom chain plots

You can make custom plots from chains just like you can with the test sampler - by making a new class for each sample. But the particular way you do it is a little different. This may change in future; for the moment you can see an example in [Demo9] - see the file demos/extra9.ini for more documentation.

Updated