Anonymous committed 5565255

Files changed (1)

 INSTALLATION
 
-<<TODO>>
+
 
 USAGE
 
  
 DESCRIPTION
 
-opticall reads in file of intensity data (currently Illumina normalized intensities) and clusters them considering both per-SNP and per-sample information, and provides genotype calls as output. The intensity input file is space separated, with SNPs are rows, and samples as columns. 
+opticall reads in a file of intensity data (currently Illumina normalized intensities), clusters them considering both per-SNP and per-sample information, and provides genotype calls as output. The intensity input file is tab separated, with SNPs as rows and samples as columns. So a line would be: 
 
-So a line would be: rsid rscoord allelesAB id_1a id_1b id_2a id_2b etc.
+rsid<tab>rscoord<tab>allelesAB<tab>id_1A<tab>id_1B<tab>id_2A<tab>id_2B etc.
 
-where id_1a is the allele A intensity value for sample 1, and id_1b is the allele B intensity value for sample 1. The algorithm is known to perform well with Illumina normalized intensities.
+where id_1A is the allele A intensity value for sample 1, and id_1B is the allele B intensity value for sample 1. The algorithm is known to perform well with Illumina normalized intensities. Any missing intensities should be input as NaN for both the A and B alleles.
 
+The first line of the file should also be a header line of the form:
+
+SNP<tab>Coor<tab>Alleles<tab>sample1_idA<tab>sample1_idB<tab>sample2_idA<tab>sample2_idB etc.
+
+where sample1_id is your identifier for the first sample, and the A and B suffixes correspond to the two allele intensities. 
+
+An example input intensity file is provided with the program for your information.
  
 OPTIONS
 
 
 -out FILE
  
-The output file name. Two files will be created by the algorithm with the filepath specified. One will have the suffix '.calls' appended to it for the genotype calls, and the other '.probs' for the posterior probabilities. The output format is space-delimited with columns: coordinate, rs, perturbation score, allelesAB, call_1, call_2, call_3,.... The order of the calls is the same as the header from the input file. The calls are encoded as 1 = AA, 2 = AB (heterozygote), 3 = BB, 4 = NN (no call).
+The output file name. Two files will be created by the algorithm with the filepath specified. One will have the suffix '.calls' appended to it for the genotype calls, and the other '.probs' for the posterior probabilities. The output format is space-delimited with columns: coordinate, rs, perturbation value, allelesAB, call_1, call_2, call_3,.... The order of the calls is the same as the header from the input file. The calls are encoded as 1 = AA, 2 = AB (heterozygote), 3 = BB, 4 = NN (no call). The perturbation value is output only to make call files compatible with those of the Illuminus caller, and does not reflect any perturbation analysis. 
 
--inblock FILE
 
-Another input intensity file name, in the format specified above, used to define regions for genotypes, in case you need regions defined from a different dataset. 
-
- 
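
The input and '.calls' formats described in the diff above can be illustrated with a short Python sketch. This is not part of opticall; the function names read_intensities and decode_calls, the sample identifiers, and the example values are hypothetical, and the parsing simply follows the column layout stated in the README text (tab-separated input with a header line and NaN for missing intensities, space-delimited calls with codes 1 to 4).

```python
import csv
import io
import math

# Call encoding as stated in the README: 1 = AA, 2 = AB, 3 = BB, 4 = NN (no call).
CALL_CODES = {"1": "AA", "2": "AB", "3": "BB", "4": "NN"}


def read_intensities(handle):
    """Yield (rsid, coord, alleles, {sample_id: (a, b)}) for each SNP row.

    Hypothetical helper: assumes a tab-separated file whose header is
    SNP, Coor, Alleles, then one A column and one B column per sample.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    # Sample columns come in A/B pairs; drop the trailing "A" to get the id.
    sample_ids = [name[:-1] for name in header[3::2]]
    for row in reader:
        if not row:
            continue
        rsid, coord, alleles = row[:3]
        values = [float(v) for v in row[3:]]  # "NaN" parses to float nan
        pairs = zip(values[0::2], values[1::2])
        yield rsid, coord, alleles, dict(zip(sample_ids, pairs))


def decode_calls(line):
    """Decode one line of a '.calls' file into (rsid, [genotype, ...]).

    Assumes the documented column order: coordinate, rs,
    perturbation value, allelesAB, call_1, call_2, ...
    """
    fields = line.split()
    coord, rsid, perturbation, alleles = fields[:4]
    return rsid, [CALL_CODES[code] for code in fields[4:]]


# Made-up two-sample example; sample s2 has missing intensities.
example = (
    "SNP\tCoor\tAlleles\ts1A\ts1B\ts2A\ts2B\n"
    "rs1\t1000\tAG\t0.91\t0.05\tNaN\tNaN\n"
)
rows = list(read_intensities(io.StringIO(example)))
decoded = decode_calls("1000 rs1 0.0 AG 1 4")
```

Checking math.isnan on both values of a sample's pair is one way to confirm that missing data was entered as NaN for both alleles, as the README requires.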