
Preparing

Move to the pipeline directory.

The first step is to create a dataset file, which describes the simulation or observation in which you would like to find voids. Two example datasets are provided under the datasets/ directory to get you started.

Simulations

Here you describe the simulation parameters: where to put outputs, how many redshift slices and subvolumes to use, what kind of subsampling or HOD mocks you want, etc. See the example_simulation.py dataset for a description of all the parameters.
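For orientation, a dataset file is a set of plain Python assignments along these lines. This is only a sketch: the subsampling and peculiar-velocity parameters below are the ones discussed on this page, while the directory setting is an illustrative placeholder; example_simulation.py is the authoritative reference.

# Sketch of a simulation dataset file fragment.
# scriptDir is an illustrative placeholder; see example_simulation.py.
scriptDir = "examples/example_simulation/"

# Request full and 10% subsamples (relative mode), with the
# subsampling deferred to the void-finding stage:
subSamples = [1.0, 0.1]
subSamplingMode = "relative"
doSubSamplingInPrep = False

# Set True to also generate scripts with peculiar velocities applied:
doPecVel = False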

Run ./prepareInputs.py --parm=path/to/dataset.py --scripts to generate pipeline scripts. This script accepts the following additional options:

  • --subsamples perform subsampling on input particle files
  • --hod run the HOD code on input halo catalogs
  • --halos prepare halo catalogs
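For example, to both generate the pipeline scripts and subsample the input particle files in one pass, you would combine the options:

./prepareInputs.py --scripts --subsamples --parm=datasets/example_simulation.py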

Some notes:

  • Currently supported formats: Gadget Type 1, SDF, RAMSES, ASCII

  • If your input particle files are already in the desired format and subsampling level, then you only need to run with the --scripts option (making sure doSubSamplingInPrep=False).

  • Doing subsampling during preparation (doSubSamplingInPrep=True in the dataset file) is only available for the SDF and multidark formats. However, with doSubSamplingInPrep=False, subsampling can be done automatically in the void-finding code (in the generateCatalog stage) for any kind of file. Subsampling during preparation is faster because it only needs to be done once.

  • To analyze a Gadget simulation with no subsampling, set these parameters (these are the defaults):

    • subSamples = [1.0]
    • subSamplingMode = "relative"
    • doSubSamplingInPrep = False
  • While not fully integrated into the pipeline, the fit_hod code in python_tools is able to generate HOD parameters from a given simulation and halo catalog. See the files in that directory for more information.

prepareInputs will produce a pipeline script for each subsampling factor or HOD mock you choose, and will place these scripts in the directory you specified in the dataset file. Dark matter particle scripts will have ss (the subsampling factor) in their filenames.

If you choose doPecVel = True, there will be two sets of script files: one with and one without peculiar velocities.

If you have particle files at multiple redshifts, and choose multiple slices and/or subdivisions, they will all be packaged in the same pipeline script.
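To illustrate with the naming used by the example dataset: requesting subSamples = [1.0, 0.1] would yield two scripts along the lines of

example_simulation/sim_ss1.0.py
example_simulation/sim_ss0.1.py

and with doPecVel = True each would additionally appear in a peculiar-velocity variant (the exact suffix is chosen by the pipeline).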

The example simulation file can be run as follows:

./prepareInputs.py --scripts --parm=datasets/example_simulation.py
./generateCatalog.py example_simulation/sim_ss1.0.py

The outputs will be located in examples/example_simulation/sim_ss1.0/sample_sim_ss1.0_z0.00_d00/ (the directory name encodes the subsampling factor, redshift slice, and subdivision index).

Observations

For observations, you skip the prepareInputs.py stage and go directly to void finding with your dataset file. Here, you define your data samples directly.

Your galaxy catalog needs to be in plain ASCII with the following columns (a sketch for writing this format follows the list):

  • 1: Index
  • 2: Not Used
  • 3: Not Used
  • 4: RA
  • 5: Dec
  • 6: Redshift (cz, in km/s)
  • 7: Magnitude
  • 8: Not Used
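As a sketch, such a catalog could be written with NumPy as follows. The values and the output filename are placeholders, and RA/Dec are assumed to be in degrees; the unused columns can simply be zero-filled.

import numpy as np

# Hypothetical galaxy data: RA/Dec (degrees assumed), redshift as cz in km/s.
ra  = np.array([150.1, 150.4])
dec = np.array([2.2, 2.5])
cz  = np.array([15000.0, 21000.0])
mag = np.array([17.5, 18.1])

n = len(ra)
catalog = np.column_stack([
    np.arange(1, n + 1),  # 1: index
    np.zeros(n),          # 2: not used
    np.zeros(n),          # 3: not used
    ra,                   # 4: RA
    dec,                  # 5: Dec
    cz,                   # 6: redshift (cz, km/s)
    mag,                  # 7: magnitude
    np.zeros(n),          # 8: not used
])
np.savetxt("my_galaxies.dat", catalog)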

You will also need a survey mask file in HEALPix format. The Python script python_tools/misc_tools/figureOutMask.py can construct a rudimentary mask from a list of galaxy positions in the above format.
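figureOutMask.py does this for you; purely as a sketch of the idea (not that script's actual implementation), a rudimentary binary mask can be built with healpy by flagging every pixel that contains a galaxy. The resolution, filenames, and degree convention here are assumptions.

import healpy as hp
import numpy as np

nside = 128  # assumed mask resolution

# Read RA/Dec (columns 4 and 5 of the catalog format above).
ra, dec = np.loadtxt("my_galaxies.dat", usecols=(3, 4), unpack=True)

# Convert RA/Dec in degrees to HEALPix colatitude/longitude in radians.
theta = np.radians(90.0 - dec)
phi = np.radians(ra)

mask = np.zeros(hp.nside2npix(nside))
mask[hp.ang2pix(nside, theta, phi)] = 1.0  # mark occupied pixels

hp.write_map("mask.fits", mask, overwrite=True)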

Optionally, you can provide a radial selection function file with two columns: redshift (in km/s) and completeness. Otherwise, your sample is assumed to be volume limited. If your sample extends outside the redshift range of your selection function, a default weighting of 1.0 will be applied in those regions.
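For illustration, the selection function file is just two numeric columns; the values below are made up:

 5000   1.00
15000   0.85
25000   0.60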

See the example_observation.py dataset file for an explanation of parameters.

The example observation file can be run as follows:

./generateCatalog.py datasets/example_observation.py

The outputs will be located in examples/example_observation/sample_example_observation/.
