EnteroBase Backend Pipeline: prokka_annotation

prokka_annotation

Overview

prokka_annotation is an annotation pipeline that runs on an assembly once QAssembly has successfully assembled it from short reads. The resulting annotations are available for download from the EnteroBase website as gzipped GFF and GenBank flat file (GBK) files. (The normal method for user download of annotations is described here.)
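
As an illustration of working with a downloaded annotation, the following minimal Python sketch counts feature types in a gzipped GFF file. It assumes the file is standard tab-separated GFF3 with an optional trailing "##FASTA" section, as prokka normally writes; the function name and the file path in the usage note are hypothetical and not part of EnteroBase.

```python
import gzip
from collections import Counter


def count_feature_types(gff_gz_path):
    """Count features by type (GFF column 3) in a gzipped GFF3 file."""
    counts = Counter()
    with gzip.open(gff_gz_path, "rt") as handle:
        for line in handle:
            # The embedded sequence section, if present, follows this marker.
            if line.startswith("##FASTA"):
                break
            # Skip directives, comments and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            # A GFF3 feature line has exactly nine tab-separated columns.
            if len(fields) == 9:
                counts[fields[2]] += 1
    return counts
```

For example, count_feature_types("my_assembly.gff.gz") would return a Counter keyed by feature type such as CDS or tRNA, where the path stands for whatever file was downloaded.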

The current version of prokka_annotation is 1.0.

Annotation

The prokka_annotation pipeline is primarily a wrapper around the program prokka (version 1.11). It goes through the following steps:

  • reads the assembled sequence
  • runs prokka with the "--listdb" option to check whether the desired bacterial genus is supported
  • runs prokka with the "--compliant" option for GenBank/ENA/DDBJ compliance, using "--genus" if the desired bacterial genus is supported or a generic bacterial model if not
  • performs some minor cleanup of the GFF and GBK files output by prokka
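
The genus check and the main prokka call described above could be sketched roughly as follows. This is not the pipeline's actual code. It assumes that "prokka --listdb" prints a line containing "Genera:" followed by space-separated genus names (the exact output format may differ between prokka versions), and the function names, output directory and file paths are hypothetical.

```python
import subprocess


def parse_genera(listdb_output):
    """Extract genus names from text assumed to contain a 'Genera:' line."""
    genera = set()
    for line in listdb_output.splitlines():
        if "Genera:" in line:
            genera.update(line.split("Genera:", 1)[1].split())
    return genera


def list_supported_genera():
    """Run 'prokka --listdb' and return the genera it reports."""
    result = subprocess.run(
        ["prokka", "--listdb"], capture_output=True, text=True
    )
    # Search both streams, since the listing may be written to stderr.
    return parse_genera(result.stdout + result.stderr)


def build_prokka_command(fasta_path, outdir, genus, supported_genera):
    """Build the prokka command line for one assembly."""
    cmd = ["prokka", "--compliant", "--outdir", outdir]
    # Pass --genus only when prokka reports a database for that genus;
    # otherwise prokka falls back to its generic bacterial databases.
    if genus in supported_genera:
        cmd += ["--genus", genus]
    cmd.append(fasta_path)
    return cmd
```

A call such as subprocess.run(build_prokka_command("assembly.fasta", "annot_out", "Escherichia", list_supported_genera()), check=True) would then run the annotation. Prokka also has a separate "--usegenus" flag for genus-specific databases; whether the pipeline passes it is not stated here.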

Pan-gene sets

Species        No. of genes
Salmonella     21,065
Escherichia    25,002
Yersinia       19,591
