GSA Project Index

Overview pages

File locations

  • All files are under /data1/Users/arunr/gsa/
| Original XML | from DJS |
| Linked XML | /worm/html | /yeast/html | /fly/html |
| Known entities | /worm/known_entities | /yeast/known_entities | /fly/known_entities |
| Entity source files | various, see WB-specific entity section | yeast source file for entities | FB entity scripts and rules |
| Lexicon files | WB lexicon, WB exclusions | SGD lexicon, SGD exclusions | FB lexicon, FB exclusions |
| Entity link tables | WB link tables | SGD link tables | FB link tables |
| Upload linked HTML file | upload WB linked file | upload SGD linked file | upload FB linked file |
| File ftp'd to DJS | need login and password |


  • All scripts are located on textpresso-dev in /data1/Users/arunr/gsa/<species>/scripts/, where <species> is one of [worm, yeast, fly].
| Script | Purpose | For database | To launch |
| GSA/DJS webservice client script | Deposits XML on textpresso server | All | N/A |
| 01downloadModEntities.pl | Gets the list of entities from the MODs | All | |
| 02formSortedLexicon.pl | Forms the lexicon | All | |
| 03link.pl | Generates the links | All | |
| removeUnwantedLinks | Subroutine for false positive detection | | |
| rerun linking script | Reruns the linking script on a given paper; used by WB after adding new objects to the author first pass form | | |
| 04 <entitylinktable> | Creates a table of the objects linked and the link; for WB, this script also checks the site to see if the page is live or not | All, see note about WB | arg <path to file> |
| | FTPs the linked file to DJS and emails confirmation to DJS and the curators. This script changes the .html file name to .xml and deposits a copy in the linked_xml directory on textpresso-dev as well as a copy on | All | To manually launch this script: ./ with arg <path to .XML> |
upload WB linked file
upload SGD linked file
upload FB linked file
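
As a rough illustration of the lexicon and linking steps (02 and 03) listed above, the sketch below builds a lexicon from entity names, drops the exclusions, and wraps matches in links. It is written in Python for readability and is not the code of the actual Perl scripts: the function names, the longest-name-first ordering, the whole-word matching rule, and the example.org URLs are all assumptions made for this example.

```python
import re


def form_sorted_lexicon(entities, exclusions):
    """Drop excluded names and sort longest-first (assumed ordering),
    so that longer names are tried before names they contain."""
    excluded = set(exclusions)
    kept = {name: url for name, url in entities.items() if name not in excluded}
    return sorted(kept.items(), key=lambda item: (-len(item[0]), item[0]))


def link_text(text, lexicon):
    """Wrap each whole-word lexicon match in an HTML anchor (single pass)."""
    if not lexicon:
        return text
    urls = dict(lexicon)
    # The alternation follows the lexicon order, so longer names win.
    names = "|".join(re.escape(name) for name, _ in lexicon)
    pattern = re.compile(r"(?<![\w-])(" + names + r")(?![\w-])")

    def make_anchor(match):
        name = match.group(1)
        return '<a href="%s">%s</a>' % (urls[name], name)

    return pattern.sub(make_anchor, text)


# Hypothetical entities and URLs, for illustration only.
entities = {
    "unc-22": "http://example.org/unc-22",
    "unc-2": "http://example.org/unc-2",
    "let": "http://example.org/let",
}
lexicon = form_sorted_lexicon(entities, exclusions=["let"])
linked = link_text("unc-22 and unc-2 let", lexicon)
```

Excluding a common word such as "let" before linking is one simple way to cut false positives; the real removeUnwantedLinks subroutine may use different rules.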

NOTE: Make sure the name of the file you are uploading is the same as the one that was sent in the initial link.
Access the DJS ftp server here:
(ask Karen for the login and password). Open the file and make sure it is the QC'd version; the suffix should be .xml.
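
The file name check in the note above can be done by hand, but a small helper makes it harder to get wrong. The sketch below is a hypothetical Python helper, not part of the existing scripts; the function names and the example file names are assumptions.

```python
from pathlib import PurePosixPath


def qc_upload_name(linked_html_name):
    """Name expected for the upload: same stem as the linked file, .xml suffix."""
    return PurePosixPath(linked_html_name).with_suffix(".xml").name


def check_upload(initial_name, upload_path):
    """Return a list of problems; an empty list means the name looks right."""
    problems = []
    upload = PurePosixPath(upload_path)
    if upload.suffix != ".xml":
        problems.append("suffix is %r, expected '.xml'" % upload.suffix)
    if upload.name != initial_name:
        problems.append(
            "name %r differs from the initially sent %r" % (upload.name, initial_name)
        )
    return problems


# Hypothetical file names, for illustration only.
ok = check_upload("paper123.xml", "/tmp/qc/paper123.xml")
bad = check_upload("paper123.xml", "/tmp/qc/paper123.html")
```

This only checks the name; confirming that the file is the QC'd version still requires opening it.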

Manual QC editing

  • Editing rules and guidelines click here
  • Editing tools
    • vi editor
    • Espresso for Mac OS by MacRabbit
    • Komodo for Mac OS

WB QC page shortcuts

Desired changes/fixes/enhancements to the pipeline

  • Entity list table
    • entity list with live/dead links measured to accompany the first linked XML
    • add values for the total number of distinct links as well as total links per paper to the entity list tables for before and after manual QC
  • move Juancarlos' html/link list viewer from dev.textpresso to textpresso-dev
  • journal first pass form
    • add a button 'no new objects to declare'
    • add phenotype box
    • add an 'other' field that automatically has a ~ ignore function
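
The per-paper link totals requested for the entity list tables could be computed along the following lines. This is a minimal Python sketch, assuming the linked file marks each link as an <a href="..."> element; the class and function names and the example.org URLs are hypothetical, and it does not check whether links are live or dead.

```python
from collections import Counter
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href of every <a> element in a linked file."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def link_counts(html_text):
    """Return the total and distinct link counts for one paper."""
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    counts = Counter(collector.hrefs)
    return {"total": sum(counts.values()), "distinct": len(counts)}


# Hypothetical before/after QC snippets, for illustration only.
before_qc = (
    '<p><a href="http://example.org/a">a</a> '
    '<a href="http://example.org/a">a</a> '
    '<a href="http://example.org/b">b</a></p>'
)
after_qc = '<p><a href="http://example.org/a">a</a> a b</p>'
summary = {"before": link_counts(before_qc), "after": link_counts(after_qc)}
```

Running the same function on the file before and after manual QC gives the two sets of values the table would need.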