
ky_wbprojects / Updating_obo_tables_in_postgres

This obo file is used to display term info for variations, transgenes, strains, clones, rearrangements, and genes in any OA interface that contains these objects. It needs to be updated with every release of acedb. The obo file is created from AQL queries of the latest WS.

In the Phenotype OA, all object fields except strain should be autocomplete drop-down lists. The files used to populate these fields are in an obo-like format: information attached to each object shows up in the term info box when the object is selected. Keeping the files updated from acedb and showing this information in the term info box helps during curation, because it verifies the identity of the object being curated and saves the curator from having to look up and verify the information manually. Although these files are not technically 'obo files', this page uses 'obo file' for any flat file that contains a list of terms with accompanying information for display in the term info window. This is in contrast to other flat files that contain only a simple list of terms.

obo files for the phenotype OA

The following fields use an obo file; the name, source, and generating script of each obo file are noted where available.

  • Pub field -> paper.obo
  • Person field
  • Variation -> obo_oa_ontology
  • Transgene -> obo_oa_ontology
  • Rearrangement -> obo_oa_ontology
  • Caused by -> WBGene
  • Phenotype -> phenotype.obo
  • Molecule -> molecule.obo
  • Anatomy -> WBbt.obo
  • Life stage -> worm_development.obo
  • Child of -> phenotype.obo
  • Laboratory evidence
  • Entity -> chebi.obo, rex.obo, gene_ontology_ext.obo
  • Quality -> quality.obo

obo OA ontologies

The script /home/postgres/work/pgpopulation/obo_oa_ontologies/update_obo_oa_ontologies.pl runs as a cron job every day at 3am. It populates the obo tables by downloading:

http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/phenotype_ontology_obo.cgi
ftp://ftp.ebi.ac.uk/pub/databases/chebi/ontology/chebi.obo
http://www.geneontology.org/ontology/obo_format_1_2/gene_ontology_ext.obo
http://obo.cvs.sourceforge.net/*checkout*/obo/obo/ontology/anatomy/gross_anatomy/animal_gross_anatomy/worm/worm_anatomy/WBbt.obo
http://www.berkeleybop.org/ontologies/obo-all/worm_development/worm_development.obo
http://www.berkeleybop.org/ontologies/obo-all/rex/rex.obo
http://www.berkeleybop.org/ontologies/obo-all/quality/quality.obo (do we even use entity / quality in the phenotype OA?)

It also calls:
http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/generic.cgi?action=AddToVariationObo
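
The Perl script itself is not reproduced on this page. As an illustration only, the sketch below shows one way the [Term] stanzas of a downloaded OBO file could be read into tag/value records before being loaded into tables. The function name parse_obo_terms and the sample terms (prefix EX:) are made up for this example and are not taken from update_obo_oa_ontologies.pl.

```python
from typing import Dict, List, Optional


def parse_obo_terms(text: str) -> List[Dict[str, List[str]]]:
    """Parse the [Term] stanzas of OBO-format text into tag -> values dicts.

    Hypothetical helper for illustration; not the logic of the real Perl script.
    """
    terms: List[Dict[str, List[str]]] = []
    current: Optional[Dict[str, List[str]]] = None  # None while outside a [Term]
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            # A new stanza begins; only [Term] stanzas are collected.
            current = {} if line == "[Term]" else None
            if current is not None:
                terms.append(current)
        elif current is not None and ": " in line and not line.startswith("!"):
            tag, value = line.split(": ", 1)
            # Tags such as is_a or synonym may repeat, so values are kept in a list.
            current.setdefault(tag, []).append(value)
    return terms


# Made-up sample data, only to show the shape of the input.
sample = """format-version: 1.2

[Term]
id: EX:0000001
name: example term
def: "A made-up definition." []

[Typedef]
id: part_of

[Term]
id: EX:0000002
name: second example
is_a: EX:0000001 ! example term
"""

for term in parse_obo_terms(sample):
    print(term["id"][0], term["name"][0])
```

Keeping every value in a list avoids losing repeated tags; a loader could then decide per tag whether to store one value or many.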

AQL Queries for updating the variation obo

Instructions for retrieving object connections from the latest WS build

Variation-gene and variation-paper connections are used to select variations of type allele or those that affect a CDS (these can include transposon and polymorphism alleles). This keeps the variation autocomplete file at a manageable size so that it does not slow down loading and using the OA.

Variation_gene connections

''Find all variations in the allele group (excludes SNPs etc.) along with the WBGeneID and public gene name of the gene they are assigned to, if that is available.'' You will be making a file named '''Variation_gene.txt''' that is a combination of ''vargene.txt'' and ''transposons.txt''.

  • ''vargene.txt'': select a, a->gene, a->gene->public_name, a->reference from a in class variation where exists_tag a->allele
    Export as vargene.txt (set the Separator character to blank (TAB)).
  • ''transposons.txt'': select t, t->gene, t->gene->public_name, t->reference from t in class variation where exists_tag t->transposon_insertion and exists t->gene
    Export as transposons.txt, as above.
  • Make '''Variation_gene.txt''' by copying and pasting ''transposons.txt'' to the end of ''vargene.txt'' and saving the result as ''Variation_gene.txt''.
  • ''total_variations.txt'': select v, v->gene, v->gene->public_name, v->reference from v in class variation
    Export as total_variations.txt, as above.
    This file is required for building an exclusion list that filters out SNPs; that exclusion list is referred to as a junk list.
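
The copy-and-paste step can also be scripted. The sketch below is a minimal example, not the method used by the scripts on tazendra: it concatenates the two exports and then derives a junk list as the variations present in total_variations.txt but absent from Variation_gene.txt. Treating the junk list as that set difference is an assumption about how the exclusion list is built, as is the stripping of double quotes around exported names. The function names are hypothetical.

```python
from pathlib import Path
from typing import List, Set


def combine_exports(parts: List[Path], combined: Path) -> int:
    """Concatenate tab-separated exports into one file; return lines written."""
    written = 0
    with combined.open("w", encoding="utf-8") as out:
        for part in parts:
            for line in part.read_text(encoding="utf-8").splitlines():
                if line.strip():  # drop blank lines left between pasted files
                    out.write(line + "\n")
                    written += 1
    return written


def first_column(path: Path) -> Set[str]:
    """Return the set of object names found in the first tab-separated column."""
    names: Set[str] = set()
    for line in path.read_text(encoding="utf-8").splitlines():
        if line.strip():
            # Assumption: exported names may be wrapped in double quotes.
            names.add(line.split("\t", 1)[0].strip('"'))
    return names


def junk_list(total: Path, kept: Path) -> List[str]:
    """Variations in the full export that are absent from the kept file.

    Assumption: the junk list is this set difference.
    """
    return sorted(first_column(total) - first_column(kept))
```

A typical call would be combine_exports([Path("vargene.txt"), Path("transposons.txt")], Path("Variation_gene.txt")), followed by junk_list(Path("total_variations.txt"), Path("Variation_gene.txt")).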

Transgene_summary_paper connections

''List transgenes already linked to a paper''

  • '''transpapsum.txt''' select t, t->reference, t->summary from t in class transgene where exists t->reference Export as transpapsum.txt

Rearrangement_inside_gene connections

''List rearrangements with LG, 'genes inside' and ‘gene outside’ (public names only)''

  • '''rearragene.txt''' select r, r->map, r->gene_inside->public_name, r->gene_outside->public_name from r in class rearrangement
    Export as rearragene.txt
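
If the export repeats a rearrangement on several rows (one row per combination of inside and outside genes), the rows can be collapsed to one record per rearrangement before use. The sketch below assumes a four-column tab-separated file in the column order of the query above; the function name and the sample names in the usage note are invented for illustration.

```python
import csv
import io
from typing import Dict, List


def group_rearrangements(tsv_text: str) -> Dict[str, Dict[str, List[str]]]:
    """Collapse repeated rows into one record per rearrangement.

    Assumed columns: rearrangement, map (LG), gene inside, gene outside.
    """
    grouped: Dict[str, Dict[str, List[str]]] = {}
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        if not row or not row[0]:
            continue  # skip blank lines
        row = row + [""] * (4 - len(row))  # pad rows with missing trailing columns
        name, lg, inside, outside = row[:4]
        entry = grouped.setdefault(name, {"map": [], "inside": [], "outside": []})
        for key, value in (("map", lg), ("inside", inside), ("outside", outside)):
            # Keep each value once, in the order first seen.
            if value and value not in entry[key]:
                entry[key].append(value)
    return grouped
```

For example, two rows for a made-up rearrangement exDf1 with inside genes abc-1 and abc-3 would yield a single record whose "inside" list holds both names.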

Strain info

Added 5.17.11 for Daniela
select s, s->genotype, s->location from s in class strain

Clone info

Added 5.17.11 for Daniela
select a, a->Type, a->transgene, a->strain, a->general_remark, a->location, a->accession_number, a->reference from a in class clone


Repopulating .obo's

Two scripts use these files to update the .obo's for the OA. The scripts are on tazendra and read the files '''Variation_gene.txt''', '''transgene_summary_reference.txt''' and '''rearr_simple.txt''', so the exported files need to be transferred to tazendra and renamed to the names those scripts expect.

Transfer files to tazendra

scp all files to ''acedb@tazendra.caltech.edu:/home/acedb/jolene/WS_AQL_queries''
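
For anyone who prefers to script the transfer, here is a minimal sketch that builds the scp command for the destination given above. It defaults to a dry run that only returns the command; the function names are hypothetical, and the sketch assumes scp is installed and that login to the acedb account is already set up.

```python
import subprocess
from typing import List, Sequence

# Destination taken from the instructions above.
DESTINATION = "acedb@tazendra.caltech.edu:/home/acedb/jolene/WS_AQL_queries"


def build_scp_command(files: Sequence[str], destination: str = DESTINATION) -> List[str]:
    """Return the scp argument list that copies the given files to destination."""
    if not files:
        raise ValueError("no files to transfer")
    return ["scp", *files, destination]


def transfer(files: Sequence[str], dry_run: bool = True) -> List[str]:
    """Build the scp command and run it unless dry_run is set."""
    command = build_scp_command(files)
    if not dry_run:
        # check=True raises CalledProcessError if scp exits with an error.
        subprocess.run(command, check=True)
    return command
```

Calling transfer(["Variation_gene.txt"]) returns the command without running it; pass dry_run=False to perform the copy.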

Scripts for repopulating the OA obo files

  • Cron job: ''populate_newobjects_cgi_postgres_tables.pl'' updates information based on Variation_gene.txt and transgene_summary.txt. This script is required for posting new allele and transgene entries to the New objects cgi and for sending notifications to the relevant curators. (Make sure the files are named accordingly or the program won't see them.)
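
Because the cron job silently skips files it does not recognise, a quick check of the file names before it runs can save a wasted cycle. The sketch below reports which expected files are missing from a directory. The two names are the ones given for the cron job above; the function name is hypothetical, and the list should be adjusted if the scripts expect other names.

```python
from pathlib import Path
from typing import Iterable, List

# File names as listed for the cron job above; adjust if the scripts differ.
EXPECTED = ("Variation_gene.txt", "transgene_summary.txt")


def missing_inputs(directory: Path, expected: Iterable[str] = EXPECTED) -> List[str]:
    """Return the expected file names that are not present in directory."""
    return [name for name in expected if not (directory / name).is_file()]
```

An empty list from missing_inputs(Path("/home/acedb/jolene/WS_AQL_queries")) means both files are in place under the names listed above.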

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

To check whether the re-population scripts worked, look at the info field of [http://tazendra.caltech.edu/~azurebrd/var/work/phenote/ws_current.obo WS_current]. The date tells you when the file was last updated; it should match the date the script was run.


[http://www.wormbase.org/wiki/index.php/Caltech_documentation ''back'']

--[[User:Kyook|kjy]]

Category:Phenotype Curation [[Category:Phenotype]]
