Error in computation of SNP statistics

Issue #50 new
Matteo Sesia created an issue

I tried to run the following command using the latest version of QCTOOL:

qctool -g $BGEN_FILE -s $SAMPLE_FILE -snp-stats -osnp snp-stats.txt

Here $BGEN_FILE and $SAMPLE_FILE point to the UK Biobank phased haplotypes for chromosome 22 (bgen v1.2). The command failed with the following error:

Welcome to qctool
(version: 2.0, revision 1fbf746)

(C) 2009-2017 University of Oxford

Opening genotype files                                      : [******************************] (1/1,1.8s,0.6/s)

Input SAMPLE file(s):           "/scratch/PI/candes/ukbiobank/haplotypes/ukb_hap_chr22.sample"
Output SAMPLE file:             "(n/a)".
Sample exclusion output file:   "(n/a)".

Input GEN file(s):
                                                    (  10911 snps)  "/scratch/PI/candes/ukbiobank/haplotypes/ukb_hap_chr22.bgen (bgen v1.2; 487409 named samples; zlib compression)"
                                         (total 10911 snps in 1 sources).
                      Number of samples: 487409
Output GEN file(s):             (n/a)
Output SNP position file(s):    (n/a)
Sample filter:                  .
# of samples in input files:    487409.
# of samples after filtering:   487409 (0 filtered out).


SNPSummaryComponent: the following components are in place:

Processing SNPs                                             :  (0/?,0.1s,0.0/s)

Number of SNPs:
                     -- in input file(s):                 10911.
 -- in output file(s):                0

Number of samples in input file(s):   487409.


!! Error (genfile::BadArgumentError): In argument(s) order_type=6, value_type=1 to function genfile::ToGP::set_number_of_entries(): Unsupported order_type/value_type combination..

Thank you for using qctool.

Comments (2)

  1. Gavin Band repo owner

    Thanks for raising this. Computing SNP summary statistics from phased probabilities isn't currently implemented, though it is something I should address. For this data, all probabilities are actually 0 or 1 because these are haplotype calls, so a workaround for now is to add -threshold 0.9 to the command line. This internally converts the probabilities to hard calls, which can then be processed. (The choice of threshold is arbitrary here, since the probabilities are all 0 or 1.)
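
To illustrate why the threshold value does not matter for this data, here is a minimal Python sketch of what thresholding probabilities into hard calls means conceptually. It is not qctool's implementation: the function name `to_hard_call`, the use of `>=` for the comparison, and the representation of a missing call as `None` are all assumptions made for the example.

```python
from typing import List, Optional


def to_hard_call(probs: List[float], threshold: float = 0.9) -> Optional[int]:
    """Return the index of the first entry whose probability reaches the
    threshold, or None (a missing call) if no entry does.

    Hypothetical helper for illustration only; qctool's internal rule
    (e.g. strict vs. non-strict comparison) may differ.
    """
    for index, p in enumerate(probs):
        if p >= threshold:
            return index
    return None


# Per-haplotype allele probabilities as found in phased haplotype calls:
# every entry is exactly 0 or 1.
haplotypes = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]

# Any threshold in (0, 1] yields the same hard calls for 0/1 data.
calls_at_09 = [to_hard_call(h, 0.9) for h in haplotypes]
calls_at_06 = [to_hard_call(h, 0.6) for h in haplotypes]

# A genuinely uncertain entry would become a missing call instead.
uncertain = to_hard_call([0.5, 0.5], 0.9)
```

With the workaround applied, the original command becomes: qctool -g $BGEN_FILE -s $SAMPLE_FILE -snp-stats -osnp snp-stats.txt -threshold 0.9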
