FAST analysis options and output file format for various methods
Note: FAST appends the chromosome number to the output file name prefix given with the '--out-file' option. For example, --out-file outfile together with --chr 10 results in the output file prefix outfile.chr10, so for the GWiS method with linear regression FAST generates an output file named outfile.chr10.GWiS.Linear.txt. The same applies to all other methods.
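The naming rule in the note can be sketched as a small helper. This is only an illustration of the pattern prefix.chrN.METHOD.MODEL.txt described above; the function name and its parameters are hypothetical and not part of FAST, and some outputs listed further down (for example the SAPPHO result files) do not follow this pattern.

```python
def fast_output_name(prefix: str, chrom: int, method: str, model: str) -> str:
    """Build the expected output file name for a per-chromosome FAST run.

    Hypothetical helper: it only mirrors the naming pattern from the note,
    <prefix>.chr<N>.<method>.<model>.txt.
    """
    return f"{prefix}.chr{chrom}.{method}.{model}.txt"


# Example from the note: --out-file outfile with --chr 10, GWiS, linear regression
name = fast_output_name("outfile", 10, "GWiS", "Linear")
```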
Method : GWiS using Linear Regression
Options : --linear-gwis (no permutations) --linear-gwis-perm (with permutations)
Output Filename : Out.chrXX.GWiS.Linear.txt
Multiple lines are present for each multi-SNP model in a gene:
1. First line of each model has SNP.name=NONE, indicating the null model (intercept only model when no covariates, or covariates only model when covariates are present).
2. Followed by one or more lines for each SNP added to the model.
3. Line with SNP.name=SUMMARY indicates the end of the model for the gene. This line also prints the values of K, SSM, BIC, F.stat, n.stop, n.better and pval for the final K-snp model in the gene.
Format
Chr : Chromosome
GeneID : Unique gene id
Name : Gene name
Start : Gene start in bp
End : Gene end in bp
Length : Gene length in bp
SNPs : No. of snps in the gene
Tests : Effective no. of snps in the gene
SNP.name : SNP entering the model
SNP.pos : SNP position in bp
SNP.MAF : SNP minor allele frequency
SNP.qual : SNP imputation quality
K : Current model size
SSM : Sum of the squares of the model
BIC : BIC increment for the snp
F.stat : Current model F-statistic
R2 : Multiple R2 of the snp with the others in the model
n.stop : No of permutations executed
n.better : No of permutations with better BIC score
pval : Gene pvalue
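The NONE / SNP / SUMMARY line layout described above lends itself to a simple grouping pass. The sketch below assumes the file has already been read into a list of dicts keyed by the column names listed above (for instance with csv.DictReader and a suitable delimiter; the delimiter and the presence of a header row are assumptions, not stated here). The function name is hypothetical.

```python
def split_gwis_models(rows):
    """Group GWiS output rows into one model per gene.

    rows: iterable of dicts keyed by column name (assumed input shape).
    Returns a dict GeneID -> {"null": row, "snps": [rows], "summary": row}.
    """
    models = {}
    current = None
    for row in rows:
        name = row["SNP.name"]
        if name == "NONE":
            # Null model line opens a new gene model
            current = {"null": row, "snps": [], "summary": None}
        elif name == "SUMMARY":
            # Summary line closes the model for this gene
            if current is not None:
                current["summary"] = row
                models[row["GeneID"]] = current
                current = None
        elif current is not None:
            # Any other line is a SNP added to the current model
            current["snps"].append(row)
    return models
```

The SUMMARY row of each returned model carries the gene-level values (K, BIC, pval and so on), so a per-gene results table can be built from `models[gene]["summary"]` alone.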
Method : GWiS using Logistic Regression
Options : --logistic-gwis (no permutations) --logistic-gwis-perm (with permutations)
Output Filename : Out.chrXX.GWiS.Logistic.txt
Multiple lines are present for each multi-SNP model in a gene:
1. First line of each model has SNP.name=NONE, indicating the null model (intercept only model when no covariates, or covariates only model when covariates are present).
2. Followed by one or more lines for each SNP added to the model.
3. Line with SNP.name=SUMMARY indicates the end of the model for the gene. This line also prints the values of K, SSM, BIC, chi2, n.stop, n.better and pval for the final K-snp model in the gene.
Format
Chr : Chromosome
GeneID : Unique gene id
Name : Gene name
Start : Gene start in bp
End : Gene end in bp
Length : Gene length in bp
SNPs : No. of snps in the gene
Tests : Effective no. of snps in the gene
SNP.name : SNP entering the model
SNP.pos : SNP position in bp
SNP.MAF : SNP minor allele frequency
SNP.qual : SNP imputation quality
K : Current model size
SSM : Sum of the squares of the model
BIC : BIC increment for the snp
chi2 : Current model chi squared
R2 : Multiple R2 of the snp with the others in the model
n.stop : No of permutations executed
n.better : No of permutations with better BIC score
pval : Gene pvalue
Method : minSNP using Linear Regression
Options : --linear-minsnp (no permutations) --linear-minsnp-perm (with permutations)
Output Filename : Out.chrXX.minSNP.Linear.txt (for minSNP)
Description of methods: One line for each snp mapped to a gene. Both methods assign to a gene the pvalue of its most significant SNP, which in the output file is indicated by the row whose isBest column has value 1. Minsnp calculates this pvalue with empirical methods (F-statistic/Chi2), while minsnp-perm gets this pvalue with permutations (this is because for variants with low MAF, the empirical method loses power). For minsnp-perm, permutations are performed only for the most significant SNP in the original test. n.tot and n.better are output only for minsnp-perm.
Format
Chr : Chromosome
GeneID : Unique gene id
Name : Gene name
Start : Gene start in bp (includes flank)
End : Gene end in bp (includes flank)
Length : Gene length in bp (includes flank)
SNPs : No. of snps in the gene
Tests : Effective no. of snps in the gene
SNP.name : SNP entering the model
SNP.pos : SNP position in bp
SNP.MAF : SNP minor allele frequency
SNP.qual : SNP imputation quality
chi2 : SNP chi squared statistic
n.tot : No of permutations executed
n.better : No of permutations with better chi2
pval : p-value
isBest : 0/1 indicating if this SNP has best chi2 in the gene
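Since the gene pvalue is carried by the row with isBest equal to 1, extracting gene-level results is a matter of filtering on that column. A minimal sketch, assuming the rows are dicts keyed by the column names above and that values arrive as strings (as they would from a text file); the function name is hypothetical.

```python
def best_snp_per_gene(rows):
    """Return GeneID -> (SNP.name, pval) for the rows flagged isBest=1."""
    best = {}
    for row in rows:
        # isBest is 0/1; compare as text so both "1" and 1 are accepted
        if str(row["isBest"]) == "1":
            best[row["GeneID"]] = (row["SNP.name"], float(row["pval"]))
    return best
```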
Method : minSNP using Logistic Regression
Options : --logistic-minsnp (no permutations) --logistic-minsnp-perm (with permutations)
Output Filename : Out.chrXX.minSNP.Logistic.txt (for minSNP)
Description of methods: One line for each snp mapped to a gene. Both methods assign to a gene the pvalue of its most significant SNP, which in the output file is indicated by the row whose isBest column has value 1. Minsnp calculates this pvalue with empirical methods (F-statistic/Chi2), while minsnp-perm gets this pvalue with permutations (this is because for variants with low MAF, the empirical method loses power). For minsnp-perm, permutations are performed only for the most significant SNP in the original test. n.tot and n.better are output only for minsnp-perm.
Format
Chr : Chromosome
GeneID : Unique gene id
Name : Gene name
Start : Gene start in bp (includes flank)
End : Gene end in bp (includes flank)
Length : Gene length in bp (includes flank)
SNPs : No. of snps in the gene
Tests : Effective no. of snps in the gene
SNP.name : SNP entering the model
SNP.pos : SNP position in bp
SNP.MAF : SNP minor allele frequency
SNP.qual : SNP imputation quality
chi2 : SNP chi squared statistic
n.tot : No of permutations executed
n.better : No of permutations with better chi2
pval : p-value
isBest : 0/1 indicating if this SNP has best chi2 in the gene
Method : minSNP Gene using Linear Regression
Options : --linear-minsnp-gene-perm (with permutations)
Output Filename : Out.chrXX.minSNP_Gene.Linear.txt
Description of method: Unlike minSNP, which does permutations on the single best SNP in a gene, minsnp-gene-perm does permutations for each of the SNPs in a gene. The test statistic (or pvalue) of the most significant SNP in each of the permutations is compared with that of the most significant SNP from the original data, to get a pvalue on the gene level. The meaning of the columns is the same as with the minsnp and minsnp-perm methods.
Format
Chr : Chromosome
GeneID : Unique gene id
Name : Gene name
Start : Gene start in bp (includes flank)
End : Gene end in bp (includes flank)
Length : Gene length in bp (includes flank)
SNPs : No. of snps in the gene
Tests : Effective no. of snps in the gene
SNP.name : SNP entering the model
SNP.pos : SNP position in bp
SNP.MAF : SNP minor allele frequency
SNP.qual : SNP imputation quality
chi2 : SNP chi squared statistic
n.tot : No of permutations executed
n.better : No of permutations with better chi2
pval : p-value
isBest : 0/1 indicating if this SNP has best chi2 in the gene
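The comparison described for minsnp-gene-perm (best SNP of each permutation against the best SNP of the original data) can be sketched as follows. This is not FAST's implementation: the function name is hypothetical, the per-SNP chi2 values are taken as already computed, and the add-one correction in the pvalue is a common convention assumed here, which FAST may or may not use.

```python
def gene_level_pvalue(observed_chi2, permuted_chi2):
    """Max-statistic permutation pvalue for one gene (illustrative sketch).

    observed_chi2: chi2 of every SNP in the gene on the original data.
    permuted_chi2: one list of per-SNP chi2 values per permutation.
    Returns (n_better, n_tot, pval).
    """
    best_observed = max(observed_chi2)
    n_tot = len(permuted_chi2)
    # A permutation counts as "better" when its best SNP matches or beats
    # the best SNP of the original data
    n_better = sum(1 for perm in permuted_chi2 if max(perm) >= best_observed)
    # Assumed convention: add-one correction so the pvalue is never zero
    pval = (n_better + 1) / (n_tot + 1)
    return n_better, n_tot, pval
```

Taking the maximum over all SNPs within every permutation is what makes the resulting pvalue a gene-level quantity, in contrast to permuting only the single best SNP.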
Method : minSNP Gene using Logistic Regression
Options : --logistic-minsnp-gene-perm (with permutations)
Output Filename : Out.chrXX.minSNP_Gene.Logistic.txt
Description of method: Unlike minSNP, which does permutations on the single best SNP in a gene, minsnp-gene-perm does permutations for each of the SNPs in a gene. The test statistic (or pvalue) of the most significant SNP in each of the permutations is compared with that of the most significant SNP from the original data, to get a pvalue on the gene level. The meaning of the columns is the same as with the minsnp and minsnp-perm methods.
Format
Chr : Chromosome
GeneID : Unique gene id
Name : Gene name
Start : Gene start in bp (includes flank)
End : Gene end in bp (includes flank)
Length : Gene length in bp (includes flank)
SNPs : No. of snps in the gene
Tests : Effective no. of snps in the gene
SNP.name : SNP entering the model
SNP.pos : SNP position in bp
SNP.MAF : SNP minor allele frequency
SNP.qual : SNP imputation quality
chi2 : SNP chi squared statistic
n.tot : No of permutations executed
n.better : No of permutations with better chi2
pval : p-value
isBest : 0/1 indicating if this SNP has best chi2 in the gene
Method : Bimbam using Linear Regression
Options : --linear-bf (no permutations) --linear-bf-perm (with permutations)
Filename : Out.chrXX.BF.Linear.txt
One line for each gene in the chromosome.
Format
Chr : Chromosome
GeneID : Unique gene id
Name : Gene name
Start : Gene start in bp
End : Gene end in bp
Length : Gene length in bp
SNPs : No. of snps in the gene
Tests : Effective no. of snps in the gene
BF_sum : Linear regression based Bayes Factor sum for the gene
n.tot : No of permutations executed
n.better : No of permutations with better BF_sum
pval : pvalue
Method : Bimbam using Logistic Regression
Options : --logistic-bf (no permutations) --logistic-bf-perm (with permutations)
Filename : Out.chrXX.BF.Logistic.txt
One line for each gene in the chromosome.
Format
Chr : Chromosome
GeneID : Unique gene id
Name : Gene name
Start : Gene start in bp
End : Gene end in bp
Length : Gene length in bp
SNPs : No. of snps in the gene
Tests : Effective no. of snps in the gene
BF_sum : Logistic regression based Bayes Factor sum for the gene
n.tot : No of permutations executed
n.better : No of permutations with better BF_sum
pval : pvalue
Method : Vegas using Linear Regression
Options : --linear-bf (no permutations) --linear-bf-perm (with permutations)
Filename : Out.chrXX.Vegas.Linear.txt
One line for each gene in the chromosome.
Format
Chr : Chromosome
GeneID : Unique gene id
Name : Gene name
Start : Gene start in bp
End : Gene end in bp
Length : Gene length in bp
SNPs : No. of snps in the gene
Tests : Effective no. of snps in the gene
Vegas_sum : Linear regression based Vegas score for the gene
n.tot : No of permutations executed
n.better : No of permutations with better Vegas_sum
pval : pvalue
Method : Vegas using Logistic Regression
Options : --logistic-bf (no permutations) --logistic-bf-perm (with permutations)
Filename : Out.chrXX.Vegas.Logistic.txt
One line for each gene in the chromosome.
Format
Chr : Chromosome
GeneID : Unique gene id
Name : Gene name
Start : Gene start in bp
End : Gene end in bp
Length : Gene length in bp
SNPs : No. of snps in the gene
Tests : Effective no. of snps in the gene
Vegas_sum : Logistic regression based Vegas score for the gene
n.tot : No of permutations executed
n.better : No of permutations with better Vegas_sum
pval : pvalue
Method : Gates using Linear Regression
Options : --linear-bf (no permutations)
Filename : Out.chrXX.Gates.Linear.txt
One line for each gene in the chromosome.
Format
Chr : Chromosome
GeneID : Unique gene id
Name : Gene name
Start : Gene start in bp
End : Gene end in bp
Length : Gene length in bp
SNPs : No. of snps in the gene
Tests : Effective no. of snps in the gene
Gates : Linear regression based Gates score for the gene
pval : pvalue
Method : Gates using Logistic Regression
Options : --logistic-bf (no permutations)
Filename : Out.chrXX.Gates.Logistic.txt
One line for each gene in the chromosome.
Format
Chr : Chromosome
GeneID : Unique gene id
Name : Gene name
Start : Gene start in bp
End : Gene end in bp
Length : Gene length in bp
SNPs : No. of snps in the gene
Tests : Effective no. of snps in the gene
Gates : Logistic regression based Gates score for the gene
pval : pvalue
Tips
1. When a method is specified with the -perm suffix, permutations are performed when mode=genotype and simulations are performed when mode=summary.
2. If you only have a few genes on a chromosome, use the option --linear-snp-gene or --logistic-snp-gene. This will limit the single SNP computations to only these genes.
Method : Single SNP Cox PH Regression
Options : --cox-snp
Filename : Out.chrXX.allSNP.COX.txt
One line for each SNP.
Format
SNP.id : SNP name
chr : Chromosome
pos : SNP position in base pairs
NonCoded.Allele : Allele coded as 0
Coded.Allele : Allele coded as 1
Beta : Regression coefficient
Se : Regression standard error
Z : Z-test score
log10BF : Log Bayes Factor
Coded.Af : Allele frequency of coded allele
Qual : SNP imputation quality
eSampleSize : Effective sample size for the SNP computed as (#samples) x Qual x 2 x MAF x (1-MAF)
nGenes : No of genes to which this SNP belongs (set to 0)
Nmiss : Number of samples with missing values
pvalue : SNP parametric pvalue
loglik : Log-likelihood ratio of current model over null model
Method : Gene-based Cox PH Regression
Options : --cox-gene
Filename : Out.chrXX.allSNP.COX.GENE.txt
1. First line of each model has SNP.name=NONE, indicating the null model (intercept only model when no covariates, or covariates only model when covariates are present).
2. Followed by one or more lines for each SNP added to the model.
3. Line with SNP.name=SUMMARY indicates the end of the model for the gene. This line also prints the values of K, loglik and BIC for the final K-snp model in the gene.
Format
Chr : Chromosome
GeneID : Unique gene id
Name : Gene name
Start : Gene start in bp
End : Gene end in bp
Length : Gene length in bp
SNPs : No. of snps in the gene
Tests : Effective no. of snps in the gene
SNP.name : SNP entering the model
SNP.pos : SNP position in bp
SNP.MAF : SNP minor allele frequency
SNP.qual : SNP imputation quality
K : Current model size
loglik : Log-likelihood ratio of current model over null model
BIC : GWiS model score based on Cox PH model
Method : SAPPHO
Options : --sapphoI/--sapphoC
Filename : Out.sapphoI.result.txt / Out.sapphoC.result.txt
One line for each SNP in the model.
Format
SNP.id : SNP name
Chr : Chromosome number
Pos : SNP position in base pairs
NonCoded.Allele : Allele coded as 0
Coded.Allele : Allele coded as 1
SNP.MAF : SNP minor allele frequency
SNP.qual : SNP imputation quality
K : Current number of associations in the model
log|Det(SIGMA)| : Log of the determinant of the var-cov matrix of the residuals
SapphoScore : SAPPHO model score
Pheno : The phenotype that the current SNP is associated with
SapphoScoreDiff : Score for that SNP in the model, a measure of the importance of that SNP
Additional Output files
Out.chrXX.allSNP.Linear.txt : This file lists the single SNP linear regression results for each SNP.
Format
SNP.id : SNP name
pos : SNP position in base pairs
NonCoded.Allele : Allele coded as 0
Coded.Allele : Allele coded as 1
Beta : Regression coefficient
Se : Regression standard error
Chi2 : Chi Square
logBF : Log Bayes Factor
Coded.Af : Allele frequency of coded allele
Qual : SNP imputation quality
eSampleSize : Effective sample size for the SNP computed as (#samples) x Qual x 2 x MAF x (1-MAF)
nGenes : No of genes to which this SNP belongs
Fmiss : Fraction of samples with missing values
pvalue : SNP parametric pvalue
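The eSampleSize formula above, (#samples) x Qual x 2 x MAF x (1-MAF), can be written out directly. The sketch below is only a worked form of that formula; the function name is hypothetical, and deriving MAF from Coded.Af as the smaller of the two allele frequencies is an assumption about how the columns relate.

```python
def effective_sample_size(n_samples: int, qual: float, coded_af: float) -> float:
    """eSampleSize = (#samples) x Qual x 2 x MAF x (1 - MAF)."""
    # Assumption: MAF is the smaller of the coded and non-coded allele frequencies
    maf = min(coded_af, 1.0 - coded_af)
    return n_samples * qual * 2.0 * maf * (1.0 - maf)


# Worked example with made-up inputs: 1000 samples, Qual 0.9, Coded.Af 0.25
example = effective_sample_size(1000, 0.9, 0.25)
```

Because MAF x (1-MAF) is symmetric in the two alleles, the result is the same whichever allele is the coded one.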
Out.chrXX.allSNP.Logistic.txt : This file lists the single SNP logistic regression results for each SNP.
Format
SNP.id : SNP name
Chr : Chromosome
pos : SNP position in base pairs
NonCoded.Allele : Allele coded as 0
Coded.Allele : Allele coded as 1
Beta : Regression coefficient
Se : Regression standard error
Wald : Wald-statistic
logBF : Log Bayes Factor
Coded.Af : Allele frequency of coded allele
Qual : SNP imputation quality
eSampleSize : Effective sample size for the SNP computed as (#samples) x Qual x 2 x MAF x (1-MAF)
nGenes : No of genes to which this SNP belongs
Fmiss : Fraction of samples with missing values
pvalue : SNP parametric pvalue
Out.chrXX.geneSNP.txt : This file lists the mapping between SNPs and genes. A SNP can appear multiple times in this file if it belongs to multiple overlapping genes.
Format
SNP.name : SNP name
SNP.chr : Chromosome
SNP.bp : SNP position in base pairs
GeneID : Unique gene id
Gene.name : Gene name
Gene.start : Gene start in bp
Gene.end : Gene end in bp
SNP.maf : Minor allele frequency
Qual : SNP quality
eSampleSize : SNP effective sample size
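Because a SNP can be listed once per overlapping gene, the mapping file is many-to-many and is best collected into sets. A minimal sketch, assuming the rows are dicts keyed by the column names above; the function name is hypothetical.

```python
from collections import defaultdict


def genes_per_snp(rows):
    """Return SNP.name -> set of GeneID values the SNP is mapped to."""
    mapping = defaultdict(set)
    for row in rows:
        # The same SNP may appear on several rows, one per overlapping gene
        mapping[row["SNP.name"]].add(row["GeneID"])
    return dict(mapping)
```

The size of each set should correspond to the nGenes column of the single SNP result files, under the assumption that both are derived from the same SNP to gene mapping.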