
2007 Specific Aims

1 Increase Database Content

1A Curate genomic DNA sequence, associated gene structures and features

Genomic consensus | 0.15 | Manual | Semi-Auto | Current | Current
Gene model curation-large-scale | 1.50 | Semi-Auto | Semi-Auto | Current | Current
Gene model curation-user input & literature | 1.00 | Manual | Manual | Current | Current
Gene model curation-core species | 0.50 | Man/Semi | Man/Semi | Initiated | Get Current
Sequence feature curation/operons & SNPs | 1.10 | Semi-Auto | Semi-Auto | Current | Current
Scripting and tool development | 0.50 | Manual | Manual | Current | Current

1B Curate large genomic-scale data sets

Microarray, Mass Spec | 0.45 | Semi-Auto | Semi-Auto | Current | Current
Interaction Data | 0.20 | Semi-Auto | Semi-Auto | Curr/Backlog | Get Current
Metabolomics, Tiling-microarray | 0.30 | Semi-Auto | Semi-Auto | Initiated | Get Current

1C Curate information derived from the literature and user submission

Person-Paper, Person, Lineage curation | 0.65 | Manual | Manual | Curr/Backlog | Current
PostgreSQL Database, Curation forms | 1.00 | Manual | Manual | Current | Current
Paper Acquisition | 0.55 | Manual | Manual | Curr/Backlog | Get Current
First-Pass | 0.80 | Semi-Auto | Semi-Auto | Curr/Backlog | Get Current

1D Curate information about genetic variation, gene mapping and nomenclature

Genetic var. (2001- ) extraction, Ko, Mos1 | 0.20 | Semi-Auto | Semi-Auto | Current | Current
Genetic var. (2001-present) curation | 0.10 | Manual | Manual | Current | Current
Genetic var. (pre-2001) | 0.10 | Manual | Semi-Auto | Curr/Backlog | Get Current
Map data | 0.05 | Manual | Manual | Curr/Backlog | Get Current

1E Curate information about gene function

Phenotype- allele | 0.80 | Manual | Semi-Auto | Curr/Backlog | Get Current
Phenotype- RNAi (<50 genes) | 0.30 | Manual | Manual | Curr/Backlog | Get Current
Phenotype- RNAi (<50 gene interactions) | 0.10 | Manual | Manual | Initiated | Get Current
Phenotype- RNAi (>50 genes) | 0.05 | Semi-Auto | Semi-Auto | Curr/Backlog | Get Current
Phenotype- RNAi (>50 gene interactions) | 0.05 | Semi-Auto | Semi-Auto | Initiated | Get Current
Phenotype- transgene | 0.10 | Manual | Semi-Auto | Initiated | Get Current
Phenotype Ontology (PO) | 0.20 | Manual | Manual | Current | Current
Concise Descriptions- well-studied genes | 0.40 | Manual | Manual | Current | Current
Concise Descriptions- families & remaining | 0.40 | Man/Semi | Man/Semi | Curr/Backlog | Get Current
Concise Descriptions- other nematode | 0.10 | N/A | Semi-Auto | N/A | Get Current
GO- phenotype, InterPro, subcellular loc. | 0.20 | Semi-Auto | Semi-Auto | Curr/Backlog | Current
GO- processes, component & function | 1.00 | Manual | Manual | Curr/Backlog | Current

1F Curate information about gene expression and regulation

Expression Pattern-transgene, antibody | 0.90 | Manual | Manual | Current | Current
Higher-level transcriptional regulation | 0.50 | Manual | Manual | Curr/Backlog | Current
DNA-Binding PWM & sites (small scale) | 0.20 | Manual | Manual | Initiated | Get Current
DNA-Binding PWM & sites (large scale) | 0.30 | Auto | Auto | Anticipated | Get Current

1G Curate information about gene and gene product interactions

Gene-Gene Interactions (Textpresso-based) | 0.05 | Semi-Auto | Semi-Auto | Current | Current
Gene-Gene Interactions (text & tables) | 0.10 | Manual | Manual | Initiated | Get Current
Protein-Protein Interactions (small scale) | 0.25 | Manual | Semi-Auto | Initiated | Get Current

1H Curate cell-based anatomy and development information

Anatomy Ontology (AO) | 0.30 | Manual | Manual | Current | Current
Ablation, Site of Action | 0.40 | Manual | Manual | Initiated | Get Current

1I Curate and represent process and genetic pathways

Processes and Pathways | 0.45 | N/A | Semi-Auto | Anticipated | Initiate

1J Curate and represent small molecules and physiology

Small molecules and Physiology | 0.10 | Manual | Manual | Anticipated | Initiate

Ensembl automatic annotation pipeline

Ensembl automatic annotation pipeline | 2.00 | Semi-Auto | Semi-Auto | Current | Current

2 Make the website easy to use, customizable, and compatible

2A We will build a flexible and durable web application

2B We will enhance usability through highly customizable and interactive displays

2C We will maintain a flexible and secure database architecture to handle additional data and multiple genomes

2D We will continue to explore Acedb migration paths and cross-MOD interoperability


Project Plan

We seek broad coverage of all genes for a given data type

Our priorities are:

  • to keep or get current with our existing high quality datasets
  • to increase the efficiency of curation
  • to expand key aspects of our curation, computational and display pipelines to additional nematode genomic sequences
  • to expand our coverage of information related to gene regulatory networks
  • to provide convenient data mining databases for use by the bioinformatics community
  • to increase interoperability with other databases

A major new focus will be on other nematode genomes

  • Caenorhabditis species (briggsae, remanei, brenneri, japonica, and newly discovered species)
  • Pristionchus pacificus and Heterorhabditis bacteriophora
  • More than 20 more distantly related plant- and animal-parasitic nematodes, including Ancylostoma caninum, Ascaris lumbricoides, Onchocerca volvulus, Brugia malayi, Ostertagia ostertagi, Haemonchus contortus, Strongyloides ratti, Meloidogyne hapla, Trichinella spiralis, and Necator americanus

For these other species, our priorities will be gene structures and large-scale data sets, followed by individual curation efforts that can be done efficiently using our C. elegans pipeline. We will prioritize effort both by data type and by species.