Add some documentation on the various options

Comments (3)

ptlcc

Hi Anders

I have updated the “KMAspecification.pdf” for the options you mention.

To answer your questions:

Using -k 11 -k_t 21 when indexing will set the k-mer size to 21 for identifying templates, and 11 when aligning.
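As a rough illustration of the indexing call described above, the sketch below assembles the command line in Python without running it. Only `-k` and `-k_t` are taken from this thread; the `kma index -i ... -o ...` form, the helper name and the file names are assumptions for illustration and should be checked against “KMAspecification.pdf”.

```python
import shlex


def build_index_command(templates, db_prefix, k_align=11, k_template=21):
    """Assemble a `kma index` argument list (hypothetical helper, not part of KMA)."""
    return [
        "kma", "index",
        "-i", templates,           # assumed: input template FASTA file
        "-o", db_prefix,           # assumed: output database prefix
        "-k", str(k_align),        # k-mer size used when aligning
        "-k_t", str(k_template),   # k-mer size used for identifying templates
    ]


# File and database names are placeholders.
cmd = build_index_command("templates.fsa", "my_db")
print(shlex.join(cmd))
```

Building the arguments as a list keeps each flag next to its value, which makes it easier to vary k, k_t and similar settings when testing combinations.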

You are right about forced pairing: the reward for pairing is specified by the option “-per” (default 7). Unite prefers keeping the reads together, but gives neither a penalty nor a reward for it.

Thread usage depends on the version and options used. With the latest version (1.3.9) and the flag “-status”, it more or less follows what you describe.

The trimming parameters are specified with: -ml (minimum length), -mp (minimum Phred score at leading and trailing bases), -eq (minimum read quality), -5p (constant trimming of bases from the 5' end) and -3p (constant trimming of bases from the 3' end). The effect of trimming depends on the type of data you have and how the samples have been treated in the lab.
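The trimming flags listed above could be collected into a command line along these lines. This is only a sketch: the flag names come from this thread, while the helper, the setting names, the base command (`kma -i ... -o ... -t_db ...`), the file names and the numeric values are all assumptions for illustration, not recommended settings.

```python
import shlex

# Flag names as listed in this thread; the dictionary keys are hypothetical labels.
TRIM_FLAGS = {
    "min_length": "-ml",        # minimum length
    "min_phred": "-mp",         # minimum Phred score at leading and trailing bases
    "min_read_quality": "-eq",  # minimum read quality
    "trim_5p": "-5p",           # constant trimming of bases from the 5' end
    "trim_3p": "-3p",           # constant trimming of bases from the 3' end
}


def trimming_args(**settings):
    """Translate named settings into KMA trimming flags (hypothetical helper)."""
    args = []
    for name, value in settings.items():
        if name not in TRIM_FLAGS:
            raise KeyError(f"unknown trimming setting: {name}")
        args += [TRIM_FLAGS[name], str(value)]
    return args


# Assumed base command; input, output and database names are placeholders.
base = ["kma", "-i", "reads.fastq", "-o", "out", "-t_db", "my_db"]
cmd = base + trimming_args(min_length=50, min_phred=20, trim_5p=5)
print(shlex.join(cmd))
```

Since the effect of trimming depends on the data, keeping the settings in one place like this makes it straightforward to rerun the same sample with different values and compare the results.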

Best,
Philip

2020-12-04T08:40:18+00:00

Anders Goncalves da Silva reporter

Thank you for the response Philip. Much appreciated! I’ll check out version 1.3.9.

I did find one combination of k, k_i, and k_t for a specific sample/DB that consistently causes a seg fault (only for this specific sample). I’ll put a package together so you can reproduce it.

2020-12-05T18:54:06+00:00

ptlcc

changed status to resolved

2021-02-05T06:43:12+00:00
