Missing species in ref DB / Software for generating ref db?

Issue #17 new
Marc Hoeppner created an issue


Sorry, this is only partly an "issue", but I could not find the forums or mailing list mentioned on the group's webpage. Anyhoo...

Digging through the reference db bundled with MetaPhlAn 2.0, I noticed that the bacterium "Streptococcus pneumoniae" is not represented at all. Normally I wouldn't be bothered, but as a fairly prominent species with clinical relevance and many, many sequenced genomes, I would have expected to find it in the data set.
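For anyone wanting to repeat this kind of check, here is a minimal sketch. It assumes the taxonomy strings have already been extracted from the reference db into a plain list, and that they follow the pipe-separated clade convention with an "s__" prefix at species level. The function name and the example entries are hypothetical and are not taken from the actual database.

```python
def find_species(taxonomy_strings, species):
    """Return the taxonomy strings that contain the given species clade."""
    # Assumed convention: clades are joined by '|', the species level is
    # prefixed with 's__', and spaces in the name are replaced by underscores.
    target = "s__" + species.strip().replace(" ", "_")
    hits = []
    for tax in taxonomy_strings:
        # Compare whole clades so that a longer name sharing the same
        # prefix is not counted as a match.
        if target in tax.split("|"):
            hits.append(tax)
    return hits


# Hypothetical example entries for illustration only.
example = [
    "k__Bacteria|p__Firmicutes|g__Streptococcus|s__Streptococcus_mitis",
    "k__Bacteria|p__Firmicutes|g__Streptococcus|s__Streptococcus_pneumoniae",
]

hits = find_species(example, "Streptococcus pneumoniae")
print(len(hits), "matching entries")
```

An empty result for a species would be the situation described above. Matching on whole clades rather than substrings avoids counting similarly named species by accident.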

This brings me to my question: Is there any way to get access to the software / pipeline used to generate the taxonomic marker sequences? I have been thinking about this for some time and have a vague idea of how to go about it, but if there is a chance of getting my hands on something that already works, that'd be fantastic :)

Cheers, Marc

