Geneset merge script for Apollo annotations

Project description


The main perl script compares two genesets in GFF3 from the same assembly and classifies each gene model change. The original gene models are in the file "core.gff" while the updated gene models are located on the "cap.gff" file. Both GFF3 files reference to the sequences on the file "cap.fasta". It outputs a new GFF3 file with the genes that should go on the merged dataset and a file with the list of former gene ids to delete from the original geneset.


The software expects a directory for each species with 3 files "cap.gff", "cap.fasta"", "core.gff"". Each species directory name should be listed on a file (speciesList parameter). Bioperl and the gene_model_diff/modules must be included as perl libraries.

MySQL server connection details need to be provided in the config file. The modules directory have all the necessary code to store, compare, validate and retrieve gene models from a custom made database.

To run the software:

perl --config gene_model_diff.conf --speciesFile list_of_species.txt

This is an early release and the software is still under development.

Known Issues

This software only works with Bioperl 1.6 (it does not work with 1.7).