Files changed (2)
-mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -e "select chrom, size from hg19.chromInfo" > hg19.genome
+mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -e "select chrom, size from hg19.chromInfo" > hg19.genome
you will get out_dir/f_1.bed, out_dir/f_2.bed, out_dir/f_3.bed. This is much faster than running the bnMapper three times as the alignment file is loaded just once.
+The files for the following examples are in the //bx-python/test_data/epo_tests// directory. For example, on my system I have
+We will start with the conversion of the EPO alignments (epo_547_hs_mm_12way_mammals_65.out) to the chain format.
+out_to_chain.py --species homo_sapiens mus_musculus --chrsizes test_data/epo_tests/hg19.chrom.sizes test_data/epo_tests/mm9.chrom.sizes --output epo.HM.chain test_data/epo_tests/epo_547_hs_mm_12way_mammals_65.out
+This command extracts the alignments of //homo_sapiens// and //mus_musculus//, whose chromosome sizes are given in //test_data/epo_tests/hg19.chrom.sizes// and //test_data/epo_tests/mm9.chrom.sizes// respectively. It builds the chain file with //homo_sapiens// as the target species and writes the output to //epo.HM.chain//. To test the output, check that //epo.HM.chain// is identical to //test_data/epo_tests/epo_547_hs_mm_12way_mammals_65.chain//.
+Now we can use //epo.HM.chain// to map features from //homo_sapiens// to //mus_musculus//. To map in the other direction we would have to produce another file, say //epo.MH.chain//, like so
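The identity check described above can be done with any file comparison tool. As a minimal sketch, assuming Python is at hand, the helper below (the name `chains_identical` is made up for this illustration and is not part of bx-python) compares the two files byte by byte:

```python
import filecmp


def chains_identical(produced: str, expected: str) -> bool:
    """Return True when the two files have byte-identical contents."""
    # shallow=False forces a comparison of the file contents rather than
    # relying on os.stat() metadata (size and modification time).
    return filecmp.cmp(produced, expected, shallow=False)
```

Calling `chains_identical("epo.HM.chain", "test_data/epo_tests/epo_547_hs_mm_12way_mammals_65.chain")` from the bx-python directory should return `True` if the conversion reproduced the reference file.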
+out_to_chain.py --species mus_musculus homo_sapiens --chrsizes test_data/epo_tests/mm9.chrom.sizes test_data/epo_tests/hg19.chrom.sizes --output epo.MH.chain test_data/epo_tests/epo_547_hs_mm_12way_mammals_65.out
+To map the features in //test_data/epo_tests/hpeaks.bed// we simply type (in BED4 and BED12 respectively)
+or you can increase it with the -vdebug option. In the picture below you can see the original peaks on the human genome and the mapped ones on the mouse genome. The alignment coverage track indicates matching blocks and insertions in that species. Since peak3 falls in a human insertion, it is not mapped on mouse. On the other hand, peak2 is mapped in three blocks on mouse. The two gaps correspond to two insertions in mouse, whereas the middle block contains two joined parts that spanned an insertion in human.
+To limit the mapping to peaks that have incurred only a limited number of indels, one can use the //--gap// and //--threshold// options like so
+This filters out all peaks that incur a gap (insertion in mouse) of more than 9 bases, as well as peaks for which less than 70% of bases are mapped.
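To make the two criteria concrete, here is a rough sketch of the filtering rule as described in the sentence above. It is not bnMapper's actual code: the function `keep_peak`, its parameters, and the representation of mapped blocks as half-open `(start, end)` intervals on the mouse genome are all assumptions made for illustration.

```python
from typing import List, Tuple

Block = Tuple[int, int]  # hypothetical half-open [start, end) interval on the target genome


def keep_peak(peak_len: int, blocks: List[Block],
              max_gap: int = 9, min_fraction: float = 0.7) -> bool:
    """Decide whether a mapped peak passes the gap and coverage criteria.

    peak_len     -- length of the original peak in bases
    blocks       -- the blocks the peak was mapped to on the target genome
    max_gap      -- largest tolerated gap between consecutive blocks (assumed meaning)
    min_fraction -- smallest tolerated fraction of mapped bases (assumed meaning)
    """
    if peak_len <= 0 or not blocks:
        return False  # nothing was mapped
    ordered = sorted(blocks)
    # Reject the peak if any two consecutive blocks are separated by more
    # than max_gap bases (an insertion in the target species).
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start - prev_end > max_gap:
            return False
    # Blocks are ungapped, so their lengths count the mapped bases of the peak.
    mapped = sum(end - start for start, end in ordered)
    return mapped / peak_len >= min_fraction
```

For instance, a 100-base peak mapped to blocks `(0, 40)` and `(45, 80)` has a 5-base gap and 75 mapped bases, so it is kept. The same peak mapped to `(0, 40)` and `(60, 100)` is dropped because of the 20-base gap.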