get.seq with db='swissprot'

Issue #216 resolved
Lars Skjærven created an issue

Could we reconsider limiting input IDs to 6 chars for db='swissprot' in get.seq?

> get.seq("TCPA_YEAST", db="swissprot")
trying URL 'http://www.uniprot.org/uniprot/TCPA_Y.fasta'
Error in download.file(get.files[k], outfile, mode = "a") : 
  cannot open URL 'http://www.uniprot.org/uniprot/TCPA_Y.fasta'
In addition: Warning messages:
1: In get.seq("TCPA_YEAST", db = "swissprot") :
  ids should be standard 6 character SWISSPROT/UNIPROT formart: trying first 6 char...
2: In get.seq("TCPA_YEAST", db = "swissprot") :

It seems the full-length UniProt ID can be used directly:

download.file("http://www.uniprot.org/uniprot/TCPA_YEAST.fasta", tempfile())
trying URL 'http://www.uniprot.org/uniprot/TCPA_YEAST.fasta'
Content type 'text/plain' length 700 bytes
opened URL
==================================================
downloaded 700 bytes
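
To illustrate the request, here is a minimal sketch of building the download URL from the full ID instead of its first 6 characters. `uniprot_fasta_url` is a hypothetical helper, not part of bio3d or the actual `get.seq` code. The base URL is copied from the output above and may no longer be current, and the entry-name pattern is a simplified assumption.

```r
# Hypothetical helper (not part of bio3d): build UniProt FASTA URLs
# from IDs without truncating them to 6 characters.
uniprot_fasta_url <- function(ids, base = "http://www.uniprot.org/uniprot") {
  ids <- trimws(as.character(ids))

  # Assumed, simplified pattern: entry names look like "TCPA_YEAST",
  # while accessions such as "P12612" contain no underscore.
  is_entry_name <- grepl("^[A-Z0-9]{1,5}_[A-Z0-9]{1,5}$", ids)

  # Use the full ID in the URL; no substr(ids, 1, 6) step.
  urls <- paste0(base, "/", ids, ".fasta")
  names(urls) <- ifelse(is_entry_name, "entry_name", "accession_or_other")
  urls
}

urls <- uniprot_fasta_url(c("TCPA_YEAST", "P12612"))
print(urls)
```

With this approach both the entry name and the accession map to a usable URL, so the 6-character truncation would only be needed for inputs that are PDB-style IDs.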

Comments (9)

  1. Xinqiu Yao

    That's a good point. I think we should remove the length limit here; truncating the ID is only meaningful when it represents a PDB ID.

  2. Xinqiu Yao

    Sorry for the confusion. Please see the amendment in this commit.

    get.seq("TCPA_YEAST", db="swissprot")
                           1        .         .         .         .         50 
    sp|P12612|TCPA_YEAST   MSQLFNNSRSDTLFLGGEKISGDDIRNQNVLATMAVANVVKSSLGPVGLD
                           1        .         .         .         .         50 
    
                          51        .         .         .         .         100 
    sp|P12612|TCPA_YEAST   KMLVDDIGDFTVTNDGATILSLLDVQHPAGKILVELAQQQDREIGDGTTS
                          51        .         .         .         .         100 
    
                         101        .         .         .         .         150 
    sp|P12612|TCPA_YEAST   VVIIASELLKRANELVKNKIHPTTIITGFRVALREAIRFINEVLSTSVDT
                         101        .         .         .         .         150 
    
                         151        .         .         .         .         200 
    sp|P12612|TCPA_YEAST   LGKETLINIAKTSMSSKIIGADSDFFSNMVVDALLAVKTQNSKGEIKYPV
                         151        .         .         .         .         200 
    
                         201        .         .         .         .         250 
    sp|P12612|TCPA_YEAST   KAVNVLKAHGKSATESLLVPGYALNCTVASQAMPKRIAGGNVKIACLDLN
                         201        .         .         .         .         250 
    
                         251        .         .         .         .         300 
    sp|P12612|TCPA_YEAST   LQKARMAMGVQINIDDPEQLEQIRKREAGIVLERVKKIIDAGAQVVLTTK
                         251        .         .         .         .         300 
    
                         301        .         .         .         .         350 
    sp|P12612|TCPA_YEAST   GIDDLCLKEFVEAKIMGVRRCKKEDLRRIARATGATLVSSMSNLEGEETF
                         301        .         .         .         .         350 
    
                         351        .         .         .         .         400 
    sp|P12612|TCPA_YEAST   ESSYLGLCDEVVQAKFSDDECILIKGTSKHSSSSIILRGANDYSLDEMER
                         351        .         .         .         .         400 
    
                         401        .         .         .         .         450 
    sp|P12612|TCPA_YEAST   SLHDSLSVVKRTLESGNVVPGGGCVEAALNIYLDNFATTVGSREQLAIAE
                         401        .         .         .         .         450 
    
                         451        .         .         .         .         500 
    sp|P12612|TCPA_YEAST   FAAALLIIPKTLAVNAAKDSSELVAKLRSYHAASQMAKPEDVKRRSYRNY
                         451        .         .         .         .         500 
    
                         501        .         .         .         .         550 
    sp|P12612|TCPA_YEAST   GLDLIRGKIVDEIHAGVLEPTISKVKSLKSALEACVAILRIDTMITVDPE
                         501        .         .         .         .         550 
    
                         551       559 
    sp|P12612|TCPA_YEAST   PPKEDPHDH
                         551       559 
    
    Call:
      read.fasta(file = outfile)
    
    Class:
      fasta
    
    Alignment dimensions:
      1 sequence rows; 559 position columns (559 non-gap, 0 gap) 
    
    + attr: id, ali, call
    