BAM-matcher / Arguments

Running BAM-matcher

If the configuration file is set up with paths to the REFERENCE and VCF files, a comparison needs only the two BAM files:

bam-matcher.py -B1 BAM_FILE_1 -B2 BAM_FILE_2 

Because no output options are specified, the results are printed to standard output and also written to a text file in the current working directory.
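If you want to drive this basic invocation from a script, the following is a minimal sketch. It assumes bam-matcher.py is on your PATH and executable. The helper names `build_command` and `run_comparison` are hypothetical and not part of BAM-matcher.

```python
import subprocess


def build_command(bam1, bam2, script="bam-matcher.py"):
    """Return the argv list for a basic two-BAM comparison."""
    # Only the two required options are used; everything else comes
    # from the configuration file.
    return [script, "-B1", bam1, "-B2", bam2]


def run_comparison(bam1, bam2):
    """Hypothetical wrapper: run the tool and return what it printed to stdout."""
    result = subprocess.run(
        build_command(bam1, bam2),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems when BAM paths contain spaces.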

Run bam-matcher.py -h to see the full help message.


Arguments

Most of the configuration parameters can also be overridden at run time.

Section: REQUIRED

Minimum required input, if the configuration file is set up fully and correctly.

REQUIRED:
  --bam1 BAM1, -B1 BAM1
                        First BAM file
  --bam2 BAM2, -B2 BAM2
                        Second BAM file

Section: CONFIGURATION

CONFIGURATION:
  --config CONFIG, -c CONFIG
                        Specify configuration file (default =
                        /dir/where/script/is/located/bam-matcher.conf)
  --generate-config GENERATE_CONFIG, -G GENERATE_CONFIG
                        Specify where to generate configuration file template

By default, BAM-matcher looks for the config file ("bam-matcher.conf") in the same directory as the script itself. The --config option can be used to specify a different config file.

To create a new configuration file, use the --generate-config/-G option to generate a configuration template. If a file path is not specified, the template is written to bam-matcher.conf.template in the current working directory.
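The lookup rule described above (an explicit --config value, otherwise bam-matcher.conf next to the script) can be sketched as follows. This illustrates the documented behaviour and is not BAM-matcher's actual code. The function names are assumptions.

```python
import os


def default_config_path(script_path):
    """Config file expected in the same directory as the script."""
    script_dir = os.path.dirname(os.path.abspath(script_path))
    return os.path.join(script_dir, "bam-matcher.conf")


def resolve_config(script_path, config_arg=None):
    """An explicit --config value wins; otherwise use the default location."""
    return config_arg if config_arg else default_config_path(script_path)
```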

Section: OUTPUT REPORT

OUTPUT REPORT:
  --output OUTPUT, -o OUTPUT
                        Specify output report path (default =
                        /current/dir/bam_matcher.SUBFIX)
  --short-output, -so   Short output mode (tab-separated).
  --html, -H            Enable HTML output. HTML file name = report + '.html'
  --no-report, -n       Don't write output to file. Results output to command
                        line only.
  --scratch-dir SCRATCH_DIR, -s SCRATCH_DIR
                        Scratch directory for temporary files. If not
                        specified, the report output directory will be used
                        (default = /tmp/[random_string])

If no output file is specified, BAM-matcher will print the results to standard output and write results to bam_matcher.SUBFIX in the current working directory, where SUBFIX includes the BAM file names and a random string.

The scratch directory is deleted at the end of a successful run unless the --debug option is set, in which case the temporary files are kept. If you use the --scratch-dir option, the specified path must not already exist, although its parent directory should exist.
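A wrapper script can check the --scratch-dir precondition before launching a run. The sketch below is one way to do that. `check_scratch_dir` is a hypothetical helper, and treating a missing parent directory as an error is an assumption based on the description above.

```python
import os


def check_scratch_dir(path):
    """Return the absolute path if it is usable as a new scratch directory.

    Raises ValueError if the path already exists or its parent is missing.
    """
    path = os.path.abspath(path)
    if os.path.exists(path):
        raise ValueError(f"scratch directory already exists: {path}")
    parent = os.path.dirname(path)
    if not os.path.isdir(parent):
        raise ValueError(f"parent directory does not exist: {parent}")
    return path
```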

Section: VARIANTS

VARIANTS:
  --vcf VCF, -V VCF     VCF file containing SNPs to check (default can be
                        specified in config file instead)

Use --vcf to specify the VCF file containing the genomic loci to compare. This will override the setting in the config file (VCF_file).

Section: CALLERS AND SETTINGS

Most of these settings should be set in the configuration file already. Values specified at run time will override configuration settings.

CALLERS AND SETTINGS (will override config values):
  --caller {gatk,freebayes,varscan}, -CL {gatk,freebayes,varscan}
                        Specify which caller to use (default = 'freebayes')
  --dp-threshold DP_THRESHOLD, -DP DP_THRESHOLD
                        Minimum required depth for comparing variants
  --number_of_snps NUMBER_OF_SNPS, -N NUMBER_OF_SNPS
                        Number of SNPs to compare.
  --fastfreebayes, -FF  Use --targets option for Freebayes
  --gatk-mem-gb GATK_MEM_GB, -GM GATK_MEM_GB
                        Specify Java heap size for GATK (GB, int)
  --gatk-nt GATK_NT, -GT GATK_NT
                        Specify number of threads for GATK UnifiedGenotyper
                        (-nt option)
  --varscan-mem-gb VARSCAN_MEM_GB, -VM VARSCAN_MEM_GB
                        Specify Java heap size for VarScan2 (GB, int)
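The override rule for this section (a value given at run time replaces the configuration value, otherwise the configuration value is used) can be sketched as a simple merge. The dictionary keys below are illustrative only and are not BAM-matcher's internal names.

```python
def effective_settings(config_values, cli_values):
    """Merge settings so that run-time values override config values."""
    merged = dict(config_values)
    for key, value in cli_values.items():
        if value is not None:  # None means the option was not passed
            merged[key] = value
    return merged


# Example: --caller gatk was passed, --dp-threshold was not.
settings = effective_settings(
    {"caller": "freebayes", "dp_threshold": 15},
    {"caller": "gatk", "dp_threshold": None},
)
```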

Section: REFERENCES

Other than --about-alternate-ref, these options correspond to settings in the config file. Values specified here override the config settings.

REFERENCES:
  --reference REFERENCE, -R REFERENCE
                        Default reference fasta file. Needs to be indexed with
                        samtools faidx
  --ref-alternate REF_ALTERNATE, -R2 REF_ALTERNATE
                        Alternate reference fasta file. Needs to be indexed
                        with samtools faidx. Overrides config settings.
  --chromosome-map CHROMOSOME_MAP, -M CHROMOSOME_MAP
                        Required when using alternate reference. Run BAM-
                        matcher with --about-alternate-ref for more details.
  --about-alternate-ref ABOUT_ALTERNATE_REF, -A ABOUT_ALTERNATE_REF
                        Print information about using --alternate-ref and
                        --chromosome-map

Section: BATCH OPERATIONS

These parameters alter caching behaviour.

If --do-not-cache/-NC is enabled, BAM-matcher will also not look for cached data.

Specifying a cache directory here (--cache-dir/-CD) only affects the current run and does not alter the configuration file. If the specified cache directory does not exist, BAM-matcher will attempt to create it.

BATCH OPERATIONS:
  --do-not-cache, -NC   Do not keep variant-calling output for future
                        comparison. By default (False) data is written to
                        /bam/filepath/without/dotbam.GT_compare_data
  --recalculate, -RC    Don't use cached variant calling data, redo variant-
                        calling. Will overwrite cached data unless told not to 
                        (-NC)
  --cache-dir CACHE_DIR, -CD CACHE_DIR
                        Specify directory for cached data. Overrides
                        configuration
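The interaction between -NC and -RC can be summarised as two yes/no decisions: whether cached data is read, and whether new data is written to the cache. The sketch below restates the behaviour described above and is not taken from BAM-matcher's source. `cache_policy` is a hypothetical name.

```python
def cache_policy(do_not_cache=False, recalculate=False):
    """Return (read_cache, write_cache) for the given flags."""
    # -NC disables both reading and writing; -RC disables reading only.
    read_cache = not (do_not_cache or recalculate)
    # Fresh results are cached unless -NC is set.
    write_cache = not do_not_cache
    return read_cache, write_cache
```

With neither flag set, cached data is both used and written. With -RC alone, variant calling is redone and the cache is overwritten. With -NC, the cache is neither read nor written.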

Section: others

optional arguments:
  -h, --help            show this help message and exit
  --debug, -d           Debug mode. Temporary files are not removed
  --verbose, -v         Verbose reporting. Default = False
