Error in computation of SNP statistics

Matteo Sesia created an issue

I tried to run the following command using the latest version of QCTOOL:

qctool -g $BGEN_FILE -s $SAMPLE_FILE -snp-stats -osnp snp-stats.txt

where $BGEN_FILE and $SAMPLE_FILE refer to the phased haplotypes on chromosome 22 from the UK Biobank (bgen v1.2), but I obtained the following error:

Opening genotype files                                      : [******************************] (1/1,1.8s,0.6/s)

Input SAMPLE file(s):           "/scratch/PI/candes/ukbiobank/haplotypes/ukb_hap_chr22.sample"
Output SAMPLE file:             "(n/a)".
Sample exclusion output file:   "(n/a)".

Input GEN file(s):
                                                    (  10911 snps)  "/scratch/PI/candes/ukbiobank/haplotypes/ukb_hap_chr22.bgen (bgen v1.2; 487409 named samples; zlib compression)"
                                         (total 10911 snps in 1 sources).
                      Number of samples: 487409
Output GEN file(s):             (n/a)
Output SNP position file(s):    (n/a)
Sample filter:                  .
# of samples in input files:    487409.
# of samples after filtering:   487409 (0 filtered out).


SNPSummaryComponent: the following components are in place:

Processing SNPs                                             :  (0/?,0.1s,0.0/s)

Number of SNPs:
                     -- in input file(s):                 10911.
                    -- in output file(s):                0

Number of samples in input file(s):   487409.


!! Error (genfile::BadArgumentError): In argument(s) order_type=6, value_type=1 to function genfile::ToGP::set_number_of_entries(): Unsupported order_type/value_type combination..

  1. Gavin Band repo owner

    Thanks for raising this. Computing SNP summary statistics from phased probabilities isn't currently implemented, though it is something I should address. For this data, all the probabilities are actually 0 or 1 because these are haplotype calls, so a workaround for now is to add -threshold 0.9 to the command line. This internally converts the probabilities to hard calls, which can then be processed. (The choice of threshold is arbitrary here, since the probabilities are all 0 or 1.)
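To illustrate why the threshold value does not matter for this data, here is a rough Python sketch of turning per-haplotype allele probabilities into hard calls. It is not QCTOOL's actual implementation: the function name `hard_call`, the `>=` comparison, and the use of `None` for a missing call are all assumptions made for the example.

```python
def hard_call(probs, threshold=0.9):
    """Return the index of the allele whose probability is at or above
    the threshold, or None (missing call) if no allele reaches it.

    Hypothetical helper for illustration only. For thresholds above 0.5,
    at most one allele can qualify.
    """
    for allele, p in enumerate(probs):
        if p >= threshold:
            return allele
    return None


# One phased diploid sample at a biallelic SNP: each haplotype carries
# a pair of probabilities (allele 0, allele 1). Haplotype calls are 0 or 1.
haplotype_probs = [(1.0, 0.0), (0.0, 1.0)]

calls = [hard_call(h) for h in haplotype_probs]

# With 0/1 probabilities, any threshold in (0.5, 1] yields the same calls.
same_calls = [hard_call(h, threshold=0.6) for h in haplotype_probs]

# A genuinely uncertain probability pair would be set to missing instead.
uncertain = hard_call((0.6, 0.4))
```

With the haplotype calls above, every probability pair contains a 1, so each haplotype always gets a call and nothing is set to missing. The threshold would only start to matter for uncertain probabilities such as the last case.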

