Access to TNF matrix
Issue #99
resolved
Hi,
I'd like to use your TNF distance, so I ran metabat1 with --saveTNF saved.tnf. However, the output is a binary file. Is there any way to make it readable?
Thanks a lot for your help!
David Dinh, PhD in Bioinformatics, USC
Comments (2)
- changed status to resolved
Hello Dr Dinh,
We did not expect that the TNF calculations would be used outside of the MetaBAT program, so we only provided that option in metabat1 as a way to speed up rerunning the binning with different parameters. Since metabat2 made that option mostly obsolete, we did not carry it over into the later, supported versions.
That being said, the binary format that metabat1 outputs is very simple; we used the boost::archive::binary_archive interface to save and load it. You can look at the loadTNFFromFile function in metabat1.h for the actual implementation. The intended format is below, but the Boost layer adds headers and may also include a compression layer:
1 x 8-byte unsigned integer for the minimum contig size
followed by 136 x 4-byte floats for each contig in your dataset above that minimum contig size, in the order of your input FASTA. Each group of 136 floats is the normalized TNF vector for one contig.
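As a rough illustration of that intended layout only, here is a minimal Python sketch that parses one 8-byte unsigned integer followed by rows of 136 4-byte floats. It assumes little-endian data and no Boost header or compression, so it will most likely not work unchanged on a real metabat1 file. The function name read_tnf_payload is hypothetical and is not part of MetaBAT.

```python
import struct

TNF_DIM = 136  # floats per contig, per the layout described above


def read_tnf_payload(raw: bytes):
    """Parse the *intended* layout: one uint64, then 136 float32 per contig.

    Assumption: little-endian bytes with no Boost archive header and no
    compression. A real metabat1 file may not satisfy this.
    """
    header_size = struct.calcsize("<Q")  # 8 bytes
    if len(raw) < header_size:
        raise ValueError("payload shorter than the 8-byte header")
    (min_contig,) = struct.unpack_from("<Q", raw, 0)

    body = raw[header_size:]
    row_bytes = TNF_DIM * 4
    if len(body) % row_bytes != 0:
        raise ValueError("body is not a whole number of 136-float rows")

    # One row of 136 floats per contig, in input FASTA order.
    rows = [
        list(struct.unpack_from(f"<{TNF_DIM}f", body, offset))
        for offset in range(0, len(body), row_bytes)
    ]
    return min_contig, rows
```

If the length check fails on a real file, that is a hint that the Boost headers mentioned above are present and would need to be handled first, for example by loading the file through loadTNFFromFile in C++ instead.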
Unfortunately, providing a text file for this data is out of scope for our current work on metabat2.
MetaBAT is open source, so you are free to modify it as you like. If all you want is the TNF calculations for your contigs, you can modify metabat2.cpp at line 529 and output the contents of the TNF matrix immediately after they are calculated there, in the format of your choosing.
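If patching the C++ is not convenient, a tetranucleotide frequency vector can also be computed independently. The sketch below is not MetaBAT's code: it collapses the 256 tetramers into 136 classes by merging each one with its reverse complement, then reports relative frequencies. The ordering of the 136 entries and the normalization (here, counts divided by the total) are assumptions and may differ from what MetaBAT does internally. The names canonical and tnf_vector are hypothetical.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer: str) -> str:
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    rc = kmer.translate(COMPLEMENT)[::-1]
    return min(kmer, rc)


# 256 tetramers collapse to 136 classes under reverse complement
# (16 palindromes plus 240 / 2 non-palindromic pairs).
CANONICAL_TETRAMERS = sorted(
    {canonical("".join(p)) for p in product("ACGT", repeat=4)}
)


def tnf_vector(seq: str):
    """Relative frequency of each canonical tetramer in seq.

    Windows containing anything other than A, C, G, T are skipped.
    Assumption: simple sum-to-one normalization, which may not match MetaBAT.
    """
    counts = dict.fromkeys(CANONICAL_TETRAMERS, 0)
    seq = seq.upper()
    for i in range(len(seq) - 3):
        key = canonical(seq[i:i + 4])
        if key in counts:
            counts[key] += 1
    total = sum(counts.values())
    return [counts[k] / total if total else 0.0 for k in CANONICAL_TETRAMERS]
```

Calling tnf_vector on each contig sequence and writing the rows out with a tab separator would give a readable matrix, though the values should not be expected to match MetaBAT's output exactly.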
-Rob