Cannot subset SNP from large BGEN file (from UK Biobank data)

Issue #28 closed
Ying Liu
created an issue

I want to extract a few SNPs from the BGEN files I received through the UK Biobank project (interim data of 150K samples). But I keep running into errors that look like an out-of-memory problem. Below are the commands I used and the error messages I received. Thanks!

[liuy39@sbcs1 imp]$ qctool -g chr19impv1.bgen -s impv1.sample -og rs6857.bgen -incl-rsids rs6857.txt

Welcome to qctool (revision: abff35aeca09dbb5080e6817b943b1b8faa29c02)

(C) 2009-2011 University of Oxford

Opening genotype files : [******] (1/1,0.3s,2.9/s)

========================================================================

Input SAMPLE file(s): "impv1.sample" Output SAMPLE file: "(n/a)". Sample statistic output file: "(n/a)". Sample exclusion output file: "(n/a)".

Input GEN file(s): (not computed) "snp-id-data-filtered:chr19impv1.bgen" (total 1 sources, number of snps not computed). Number of samples: 152249 Output GEN file(s): "rs6857.bgen" Output SNP position file(s): (n/a) SNP statistic output file(s): Sample filter: (none). SNP filter: (none).

# of samples in input files: 152249.

# of samples after filtering: 152249 (0 filtered out).

========================================================================

*** glibc detected *** qctool: double free or corruption (out): 0x0000000004004330 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3491675f3e]
/lib64/libc.so.6[0x3491678dd0]
qctool[0x4439f6]
qctool[0x433e29]
qctool[0x45c6a7]
qctool[0x45bb6e]
qctool[0x47358a]
qctool[0x44b717]
qctool[0x44c951]
qctool[0x44ca52]
qctool[0x44d203]
qctool[0x512d84]
qctool[0x5131ac]
qctool[0x42e43b]
qctool[0x42e935]
qctool[0x407169]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x349161ed1d]
qctool(_ZNSt15basic_streambufIcSt11char_traitsIcEE6xsputnEPKcl+0x59)[0x406d39]
======= Memory map: ========
00400000-00653000 r-xp 00000000 00:16 1100965001 /dors/sbcs/src/qctool/qctool_v1.3-linux-x86_64/qctool
00852000-00854000 rw-p 00252000 00:16 1100965001 /dors/sbcs/src/qctool/qctool_v1.3-linux-x86_64/qctool
00854000-00856000 rw-p 00000000 00:00 0
0241d000-07ff0000 rw-p 00000000 00:00 0 [heap]
3490e00000-3490e20000 r-xp 00000000 08:03 59619 /lib64/ld-2.12.so
349101f000-3491020000 r--p 0001f000 08:03 59619 /lib64/ld-2.12.so
3491020000-3491021000 rw-p 00020000 08:03 59619 /lib64/ld-2.12.so
3491021000-3491022000 rw-p 00000000 00:00 0
3491200000-3491217000 r-xp 00000000 08:03 59621 /lib64/libpthread-2.12.so
3491217000-3491417000 ---p 00017000 08:03 59621 /lib64/libpthread-2.12.so
3491417000-3491418000 r--p 00017000 08:03 59621 /lib64/libpthread-2.12.so
3491418000-3491419000 rw-p 00018000 08:03 59621 /lib64/libpthread-2.12.so
3491419000-349141d000 rw-p 00000000 00:00 0
3491600000-349178a000 r-xp 00000000 08:03 59620 /lib64/libc-2.12.so
349178a000-349198a000 ---p 0018a000 08:03 59620 /lib64/libc-2.12.so
349198a000-349198e000 r--p 0018a000 08:03 59620 /lib64/libc-2.12.so
349198e000-3491990000 rw-p 0018e000 08:03 59620 /lib64/libc-2.12.so
3491990000-3491994000 rw-p 00000000 00:00 0
3491a00000-3491a83000 r-xp 00000000 08:03 59627 /lib64/libm-2.12.so
3491a83000-3491c82000 ---p 00083000 08:03 59627 /lib64/libm-2.12.so
3491c82000-3491c83000 r--p 00082000 08:03 59627 /lib64/libm-2.12.so
3491c83000-3491c84000 rw-p 00083000 08:03 59627 /lib64/libm-2.12.so
7f4a58000000-7f4a58021000 rw-p 00000000 00:00 0
7f4a58021000-7f4a5c000000 ---p 00000000 00:00 0
7f4a5e08b000-7f4a5fa8c000 rw-p 00000000 00:00 0
7f4a60e0e000-7f4a60e12000 rw-p 00000000 00:00 0
7f4a60e12000-7f4a60e27000 r-xp 00000000 00:18 8587913 /gpfs22/local/centos6/gcc/4.6.1/lib64/libgcc_s.so.1
7f4a60e27000-7f4a61026000 ---p 00015000 00:18 8587913 /gpfs22/local/centos6/gcc/4.6.1/lib64/libgcc_s.so.1
7f4a61026000-7f4a61027000 rw-p 00014000 00:18 8587913 /gpfs22/local/centos6/gcc/4.6.1/lib64/libgcc_s.so.1
7f4a61027000-7f4a61028000 rw-p 00000000 00:00 0
7f4a61028000-7f4a6110e000 r-xp 00000000 00:18 10224141 /gpfs22/local/centos6/gcc/4.6.1/lib64/libstdc++.so.6.0.16
7f4a6110e000-7f4a6130d000 ---p 000e6000 00:18 10224141 /gpfs22/local/centos6/gcc/4.6.1/lib64/libstdc++.so.6.0.16
7f4a6130d000-7f4a61315000 r--p 000e5000 00:18 10224141 /gpfs22/local/centos6/gcc/4.6.1/lib64/libstdc++.so.6.0.16
7f4a61315000-7f4a61317000 rw-p 000ed000 00:18 10224141 /gpfs22/local/centos6/gcc/4.6.1/lib64/libstdc++.so.6.0.16
7f4a61317000-7f4a6132d000 rw-p 00000000 00:00 0
7f4a61344000-7f4a61346000 rw-p 00000000 00:00 0
7ffe8793b000-7ffe87951000 rw-p 00000000 00:00 0 [stack]
7ffe879b0000-7ffe879b1000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Aborted

Then I tried again on a different file, and this time I got only the following:

[liuy39@sbcs1 qctool_try]$ qctool -g chr18impv1.bgen -s impv1.sample -og rs6567160.bgen -incl-rsids rs6567160.txt

Welcome to qctool (revision: abff35aeca09dbb5080e6817b943b1b8faa29c02)

(C) 2009-2011 University of Oxford

Opening genotype files : [******] (1/1,0.6s,1.6/s)

Input SAMPLE file(s): "impv1.sample" Output SAMPLE file: "(n/a)". Sample statistic output file: "(n/a)". Sample exclusion output file: "(n/a)".

Input GEN file(s): (not computed) "snp-id-data-filtered:chr18impv1.bgen" (total 1 sources, number of snps not computed). Number of samples: 152249 Output GEN file(s): "rs6567160.bgen" Output SNP position file(s): (n/a) SNP statistic output file(s): Sample filter: (none). SNP filter: (none).

# of samples in input files: 152249.

# of samples after filtering: 152249 (0 filtered out).

========================================================================

Processing SNPs : Segmentation fault
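
Since the report suspects a lack of memory, one quick check is to look at the limits that apply to processes started from the same shell. The glibc message in the first run reports heap corruption rather than a failed allocation, so a memory shortage may not be the cause, but the limits are cheap to rule out. A minimal sketch, assuming a Linux system; the function name `show_memory_limits` is hypothetical and not part of QCTOOL:

```shell
# Hypothetical helper (not part of QCTOOL): print the memory limits that
# apply to processes started from this shell.
show_memory_limits() {
    # Per-process virtual memory cap in KiB, or "unlimited".
    echo "virtual memory limit: $(ulimit -v)"
    # System-wide totals; /proc/meminfo exists on Linux only.
    if [ -r /proc/meminfo ]; then
        grep -E '^(MemTotal|MemFree):' /proc/meminfo
    fi
}

show_memory_limits
```

If the virtual memory limit is `unlimited` and plenty of memory is free, the crash is less likely to be a resource problem.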

Comments (2)

  1. Gavin Band repo owner

    Hi, at first glance I'd guess this is a build problem (especially if the problem occurs near the start of processing). Did you use one of the prebuilt binaries or compile it yourself? If the former, please try compiling QCTOOL on your platform. Best, g.
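
To answer the question of whether the binary is prebuilt, it can help to inspect the executable that the shell actually runs and the shared libraries it loads. The memory map in the report shows the executable under a directory named `qctool_v1.3-linux-x86_64` and a `libstdc++` loaded from a local GCC 4.6.1 installation, which may point to a prebuilt binary running against locally installed libraries. A minimal sketch, assuming a Linux system; `describe_binary` is a hypothetical helper, not part of QCTOOL:

```shell
# Hypothetical helper (not part of QCTOOL): report which executable a command
# name resolves to and which shared libraries it would load.
describe_binary() {
    bin_path="$(command -v "$1")" || { echo "not found: $1" >&2; return 1; }
    echo "path: $bin_path"
    # `file` reports architecture and static/dynamic linking, if installed.
    if command -v file >/dev/null 2>&1; then
        file -L "$bin_path"
    fi
    # `ldd` lists the shared libraries resolved at run time, if installed.
    if command -v ldd >/dev/null 2>&1; then
        ldd "$bin_path" || true
    fi
}
```

Running `describe_binary qctool` on the affected machine would show the resolved path and the libraries the dynamic loader picks up, which can then be compared with the platform the binary was built for.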
