prokka_annotation
Overview
prokka_annotation is an annotation pipeline that runs on an assembly once QAssembly has successfully assembled it from short reads. The annotations are then available for download from the EnteroBase website as gzipped GFF and GenBank flat file (GBK) files. (The normal method for user download of annotations is described here.)
The current version of prokka_annotation is 1.0.
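As an illustration of working with a downloaded annotation, the following is a minimal sketch that counts feature types in a gzipped GFF file. It assumes the file follows the usual GFF3 layout produced by prokka (nine tab-separated columns, with an optional "##FASTA" section at the end); the function name and the example file name are hypothetical and not part of EnteroBase.

```python
import gzip
from collections import Counter


def count_feature_types(gff_gz_path):
    """Count GFF3 feature types (column 3) in a gzipped GFF file."""
    counts = Counter()
    with gzip.open(gff_gz_path, "rt") as handle:
        for line in handle:
            # Assumption: sequence data, if present, follows a "##FASTA" line.
            if line.startswith("##FASTA"):
                break
            # Skip header/comment lines and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            columns = line.rstrip("\n").split("\t")
            if len(columns) == 9:
                counts[columns[2]] += 1
    return counts


if __name__ == "__main__":
    # "annotation.gff.gz" is a hypothetical name for a downloaded file.
    for feature_type, n in count_feature_types("annotation.gff.gz").items():
        print(feature_type, n)
```

The "CDS" entry of the returned counter gives the number of annotated coding sequences.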
Annotation
The prokka_annotation pipeline is primarily a wrapper around the program prokka (version 1.11). It goes through the following steps:
- reads the assembled sequence
- runs prokka with the "--listdb" option to check whether the desired bacterial genus is supported
- runs prokka with the "--compliant" option for GenBank/ENA/DDBJ compliance, using "--genus" if the desired bacterial genus is supported, or a generic bacterial model if not
- performs some minor cleanup of the GFF and GBK files output by prokka
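The steps above can be sketched roughly as follows. This is not the EnteroBase implementation, only an illustration under stated assumptions: that "prokka --listdb" prints a line containing "Genera:" followed by space-separated genus names, and that the output directory is passed with "--outdir". The function names are hypothetical, any further options the real pipeline passes are not shown, and the cleanup step is omitted.

```python
import subprocess
from typing import List, Set


def parse_supported_genera(listdb_output: str) -> Set[str]:
    """Extract genus names from the text printed by `prokka --listdb`.

    Assumption: the listing contains a line with "Genera:" followed by
    space-separated genus names.
    """
    genera: Set[str] = set()
    for line in listdb_output.splitlines():
        if "Genera:" in line:
            genera.update(line.split("Genera:", 1)[1].split())
    return genera


def build_prokka_command(assembly_fasta: str, outdir: str,
                         genus: str, supported: Set[str]) -> List[str]:
    """Build the annotation command; --genus is added only if supported."""
    cmd = ["prokka", "--compliant", "--outdir", outdir]
    if genus in supported:
        cmd += ["--genus", genus]
    # Otherwise prokka falls back to its generic bacterial model.
    cmd.append(assembly_fasta)
    return cmd


def annotate(assembly_fasta: str, outdir: str, genus: str) -> None:
    """Check genus support, then run prokka on the assembly."""
    listing = subprocess.run(["prokka", "--listdb"],
                             capture_output=True, text=True, check=True)
    # Search both streams, since the listing may be written to either.
    supported = parse_supported_genera(listing.stdout + listing.stderr)
    subprocess.run(
        build_prokka_command(assembly_fasta, outdir, genus, supported),
        check=True,
    )
```

Keeping the parsing and command construction separate from the subprocess calls means the genus check can be exercised without prokka installed.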
Pan-gene sets
Genus | No. of genes |
---|---|
Salmonella | 21,065 |
Escherichia | 25,002 |
Yersinia | 19,591 |