Issue #68 resolved

bad oases header error message

Dave Carlson
created an issue

I'm attempting to run the postassemble and phylogeny pipelines on transcriptomes that were assembled with Trinity outside of Agalma. I keep running into an error message like:

"biolite.workflows.unpack_oases_header: bad oases header: comp0_c0_seq1"

This first occurred in the "exemplar" stage of the postassemble pipeline, but I was able to get around it by selecting my own exemplars and skipping that stage. Now it is occurring again when I try to load my postassembled sequences in preparation for the phylogeny pipeline. Is there something I can do to fix this or work around it?


Comments (2)

  1. Casey Dunn repo owner

    The assemble pipeline annotates the headers in a specific format. This means that externally assembled data cannot be treated in the same way as internally assembled data.

    There are a few options:

    • I would encourage you to assemble the data within agalma, as the tool is intended to be used. This has the advantage, among other things, of the speedups we have implemented in Trinity.

    • If there is some reason you must use external assemblies, and you have already selected exemplars for each gene, you could load the data as an external assembly. See the 00-catalog.sh and 06-load-Nematostella.sh files at https://bitbucket.org/caseywdunn/dunnhowisonzapata2013/src for an example of how this is done. This uses agalma's generic gene prediction import capabilities.

    • You could try to spoof our assembly headers and trick agalma into treating the assembly as one it had run itself. We can't support this, though, and it wouldn't have any advantage over the options above.
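
Before picking one of these options, it may help to confirm which sequence files still carry raw Trinity headers. The following is only a rough sketch and not part of agalma or biolite: the `classify_headers` function and the `TRINITY_ID` pattern are hypothetical, and the pattern is inferred solely from the `comp0_c0_seq1` identifier quoted in the error message, so it may not match every Trinity version or agalma's own header check.

```python
import re
from collections import Counter

# Assumption: raw Trinity identifiers look like the one in the error
# message ("comp0_c0_seq1"). This is NOT agalma's own header check.
TRINITY_ID = re.compile(r"^comp\d+_c\d+_seq\d+$")


def classify_headers(lines):
    """Count FASTA headers whose first token looks like a raw Trinity ID.

    `lines` is any iterable of text lines, e.g. an open FASTA file.
    Returns a Counter with the keys "raw_trinity" and/or "other".
    """
    counts = Counter()
    for line in lines:
        if not line.startswith(">"):
            continue  # skip sequence lines
        fields = line[1:].split()
        seq_id = fields[0] if fields else ""
        kind = "raw_trinity" if TRINITY_ID.match(seq_id) else "other"
        counts[kind] += 1
    return counts
```

Calling `classify_headers(open("assembly.fa"))` (with `assembly.fa` standing in for your own file) gives a quick count of how many headers are still in the externally assembled form that agalma's assembly-specific stages are likely to reject.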
