ky_wbprojects / Transgene_Curation

Transgene Model
Transgene First Pass
Transgene Curation-OA
Transgene.ace dumper
Changes to Transgenes tables, OA and .ace dumper

Interaction with other curators

Transgenes are used by many other datatypes, such as Expr_pattern, Phenotype, and Gene_regulation.

For all datatypes, requests for transgene names are made to Karen.

(Obsolete) Xiaodong sends out a file at the beginning of every month requesting new transgene IDs. The transgene curator should fill in the new IDs and send the file back to her.

(Obsolete) Transgene IDs for Phenotype curation are requested via the webform: an email is sent to the transgene curator when there are transgene names pending approval. <Once approved/entered, the name will replace the placeholder name in the variation phenotype tables?>

(Obsolete) At the moment, I can log into Phenote to curate new transgenes by myself. However, in the future, we can probably combine all three data types and do it via the webform.

It would be good to have one central place for people to request Transgene names.

  • (ky) Strain:
    Strain data seems to have relevant remarks about the transgene in the strain; is there a way we can link strain info to transgene curation? Possible ways we could do this:
    1. Automate the display of strain info when curating transgenes, to sync up data and paper links.
    2. Dump all strains containing transgenes and use the dump to import (or manually update) the transgene info.
    3. It would be good to sort and display the strains that are available from the CGC.

/postgres/work/pgpopulation/textpresso/transgene/*
What: populates postgres with the results of Arun's transgene pattern matching script (track down this script???); adds new Is/In transgenes and new paper connections to pre-existing ones.
Input: C.elegans corpus
Input: /home/postgres/work/pgpopulation/textpresso/transgene/Obsolete.Tg.txt (no longer needed with switch to OA)
Note: Obsolete.Tg.txt was manually edited to store false transgene objects and paper connections that were picked up by Arun's pattern matching scripts. Because Arun's script will always pick these objects up again, they are now tagged as 'Fails' so that they are not re-entered into the transgene table.
Output: postgres transgene table
Runs: every morning, 4am; the textpresso transgene table is wiped clean during each run and repopulated
Written by: Juancarlos
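
The merge step described above (add new transgenes, add new paper connections to existing ones, skip anything tagged 'Fails') can be sketched as follows. This is only an illustration in Python, not Juancarlos's actual script: the function name `merge_matches`, the dictionary layout, and the example names and paper IDs are all assumptions made for the example.

```python
def merge_matches(existing, matches, fails):
    """Merge (transgene, paper) matches into existing records.

    existing: dict mapping transgene name -> set of paper IDs (hypothetical layout)
    matches:  iterable of (transgene name, paper ID) pairs from pattern matching
    fails:    set of transgene names tagged 'Fails' (known false positives)
    """
    # Copy so the caller's data is not modified in place.
    merged = {name: set(papers) for name, papers in existing.items()}
    for name, paper in matches:
        if name in fails:
            # A failed name is never re-entered, even though the
            # pattern matcher will keep finding it on every run.
            continue
        # New name: create the record. Known name: add the paper connection.
        merged.setdefault(name, set()).add(paper)
    return merged


if __name__ == "__main__":
    existing = {"syIs1": {"WBPaper00000001"}}
    matches = [("syIs1", "WBPaper00000002"), ("ccIs9", "WBPaper00000003"),
               ("xxIs0", "WBPaper00000004")]
    print(merge_matches(existing, matches, fails={"xxIs0"}))
```

Keeping the 'Fails' tag alongside the data, instead of in a separate hand-edited file, means the filter survives each nightly wipe-and-repopulate run.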

<Arun's script>
What: Uses pattern matching to find all Is and In transgene names and the papers that mention them
Input: C.elegans corpus
Written by: Arun
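
Arun's script has not been tracked down, so the following is only a rough Python sketch of what this kind of pattern matching could look like. The regular expression is an assumption (a lab prefix of 1 to 3 lowercase letters, then `Is` or `In`, then digits), and `find_transgenes` and the paper IDs are hypothetical names used for illustration.

```python
import re
from collections import defaultdict

# Assumed naming pattern: 1-3 lowercase letters, "Is" or "In", then digits,
# e.g. "syIs1". The real script may use a different expression.
TRANSGENE_RE = re.compile(r"\b[a-z]{1,3}(?:Is|In)\d+\b")


def find_transgenes(papers):
    """Return a dict mapping each matched transgene name to the set of papers mentioning it.

    papers: dict mapping paper ID -> full text of the paper
    """
    found = defaultdict(set)
    for paper_id, text in papers.items():
        for name in TRANSGENE_RE.findall(text):
            found[name].add(paper_id)
    return dict(found)


if __name__ == "__main__":
    corpus = {
        "paperA": "Strains carrying syIs1 and ccIs4251 were used.",
        "paperB": "syIs1 was crossed into the mutant. This is not a transgene.",
    }
    for name, ids in sorted(find_transgenes(corpus).items()):
        print(name, sorted(ids))
```

A pattern this loose will produce false positives, which is why the populate step needs a way to record and skip rejected names.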

<Arun's other script> that includes Ex and genomic expressions
What: finds all Is, In, and Ex expressions plus any possible genomic expression following the transgene regular expression name
Input: C.elegans corpus
Output: transgenes_in_regular_papers.out and transgenes_summary.out
Runs: manually
Written by: Arun

/home/citpub/Karen/TgSummary/
What: Parses transgenes_summary.out to list only those transgenes that do not exist in WS216, and separately lists those that have more than one reference. All new entries are assigned curator "Arun" and can be retrieved that way
Input 1: WSTg.ace, which lists all transgenes already in WormBase
Input 2: transgenes_summary.out, the output file from Textpresso
Output 1: NewTg.txt (all new Ex lines not in WS216)
Output 2: NewTgHighPriority.txt (all new Ex lines that have at least 2 paper entries)
Runs: manually
Written by: Wen

/home/acedb/wen/phenote_transgene/
What: dumps all data in the transgene phenote postgres tables into a .ace file (switching to the OA for WS220)
Input: transgene phenote tables
Output: /home/acedb/wen/phenote_transgene/transgene.ace.20080917
Output copy: home/citace/Data_for_citace/Data_from_Karen on citace
Runs: every Thursday at 4am
Written by: Juancarlos
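
To show the general shape of a table-to-.ace dump, here is a minimal Python sketch. It is not the actual dumper: the function name `to_ace`, the input layout, and the `Reference` and `Summary` tags in the example are assumptions, and the sketch does not escape special characters inside values. It relies only on the basic .ace layout of a `Class : "name"` header line followed by one tag line per value, with objects separated by a blank line.

```python
def to_ace(records):
    """Render transgene records as .ace text.

    records: dict mapping transgene name -> list of (tag, value) pairs
             (hypothetical layout; the real tables are organised differently)
    """
    blocks = []
    for name in sorted(records):
        # Object header line, then one tag line per (tag, value) pair.
        lines = [f'Transgene : "{name}"']
        for tag, value in records[name]:
            lines.append(f'{tag}\t"{value}"')
        blocks.append("\n".join(lines))
    # Objects are separated by a blank line; the output ends with a newline.
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    example = {
        "syIs1": [("Reference", "WBPaper00000001"), ("Summary", "example text")],
        "ccIs9": [("Reference", "WBPaper00000003")],
    }
    print(to_ace(example), end="")
```

Sorting the object names keeps successive dumps easy to diff against each other.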