Using EBI SOAP or REST Muscle

Issue #174 resolved
Barry Grant created an issue

Perhaps worth considering: if a user does not have a local install of MUSCLE, we could try to use the online EBI web service version. See:

http://www.ebi.ac.uk/Tools/webservices/services/msa/muscle_rest
http://www.ebi.ac.uk/Tools/webservices/services/msa/muscle_soap

This would be a low priority change. Would it be worth implementing?

Comments (11)

  1. Xinqiu Yao

    I have implemented this functionality in new_funs/seqaln.R. See this commit. For example, in an environment without MUSCLE/CLUSTALO:

    source('~/bio3d/new_funs/seqaln.R')
    seqs <- get.seq(c("4q21_A", "1ftn_A"))
    aln <- seqaln(seqs)
    Launching external program failed:
      make sure 'muscle' is in your search path
    Will try to align sequences online.
    Error in seqaln(seqs) : 
      A valid E-Mail address is required to use EMBL-EBI Web Service
    
    
    aln <- seqaln(seqs, web.args=list(email='XXX@XXX.edu'))
    Launching external program failed:
      make sure 'muscle' is in your search path
    Will try to align sequences online.
    Job successfully submited (job ID: muscle-R20160916-221649-0641-523738-pg)
      Waiting for job to finish...
    > aln
                                   1        .         .         .         .         50 
    [Truncated_Name:1]gi|1578369   --MTEYKLVVVGAGGVGKSALTIQLIQNHFVDEYDPTIEDSYRKQVVIDG
    [Truncated_Name:2]gi|1578311   MAAIRKKLVIVGDGACGKTCLLIVNSKDQFPEVYVPTVFENYVADIEVDG
                                         ***^** *^ **^ * *      * ^ * **^ ^ *   ^ ^** 
                                   1        .         .         .         .         50 
    
                                  51        .         .         .         .         100 
    [Truncated_Name:1]gi|1578369   ETCLLDILDTAGQEEYSAMRDQYMRTGEGFLCVFAINNTKSFEDI-HQYR
    [Truncated_Name:2]gi|1578311   KQVELALWDTAGQEDYDRLRPLSYPDTDVILMCFSIDSPDSLENIPEKWT
                                       * ^ ******^*  ^*       ^  *  * *    * * *   ^  
                                  51        .         .         .         .         100 
    ...
    Call:
      seqaln(aln = seqs, web.args = list(email = "xinqyao@umich.edu"))
    
    Class:
      fasta
    
    Alignment dimensions:
      2 sequence rows; 206 position columns (176 non-gap, 30 gap) 
    
    + attr: id, ali, call
    

    One drawback: the server always requires a valid e-mail address, so we require users to provide their e-mail address if no local program is found. Let me know what you think!
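
The fallback described above (use the local program if it is found, otherwise go online and insist on an e-mail address) could be sketched roughly as follows. This is only an illustration under stated assumptions: `choose_aligner` is a hypothetical helper, not a bio3d function, and it does not reproduce the actual logic in new_funs/seqaln.R.

```r
# Hypothetical helper (NOT part of bio3d): decide whether to align locally
# or fall back to the online service.
choose_aligner <- function(exefile = "muscle", email = NULL) {
  # Sys.which() returns "" when the program is not on the search path.
  path <- if (nzchar(exefile)) unname(Sys.which(exefile)) else ""

  if (nzchar(path)) {
    # A local executable was found, so no e-mail address is needed.
    return(list(mode = "local", exefile = path))
  }

  # No local program: the online service needs an e-mail address.
  if (is.null(email) || !nzchar(email)) {
    stop("A valid E-Mail address is required to use EMBL-EBI Web Service")
  }
  list(mode = "web", email = email)
}
```

Calling `choose_aligner(exefile = "", email = "user@example.org")` returns a list with `mode` set to `"web"`, while omitting `email` in the same situation stops with the error shown in the transcript above.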

  2. Lars Skjærven

    Excellent, Xinqiu. This looks awesome. Too bad about the email requirement, but a dummy email does work:

    aln <- seqaln(seqs, exefile="", web.args=list(email="test@thegrantlab.org"))
    Launching external program failed:
      make sure '' is in your search path
    Will try to align sequences online.
    Job successfully submited (job ID: muscle-R20160919-092250-0863-18915181-es)
      Waiting for job to finish...
    
  3. Xinqiu Yao

    Yes, it works with a dummy email, but that may be risky. The website states: "If you use a fake e-mail address then we will not be able to contact you and will very likely result in your jobs being killed and your IP, Organisation or entire domain being black-listed", see Terms of Use.

    Hmm, any ideas?

  4. Barry Grant reporter

    We will have to ask the user for their email address within the function, along with a message stating something like:

    "You do not have muscle installed/working locally on your machine. We can attempt to use the EBI webserver if you provide an email address (required by the EBI).

    Please note that the EBI state 'using fake e-mail address may result in your jobs being killed and your IP, Organisation or entire domain being black-listed', see their Terms of Use."

    I think this is a very useful new feature. The email address requirement is a shame but we can live with it until a better server becomes apparent.
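
One way the prompt suggested above might look is sketched below. Both `is_plausible_email` and `ask_email` are hypothetical names introduced here for illustration, not bio3d functions. The check is purely syntactic, so it cannot tell a real address from a fake one; that remains the user's responsibility under the EBI Terms of Use.

```r
# Hypothetical helpers (NOT part of bio3d).

# Rough syntactic check: something@something.something, no spaces.
is_plausible_email <- function(x) {
  is.character(x) && length(x) == 1 && !is.na(x) &&
    grepl("^[^@[:space:]]+@[^@[:space:]]+\\.[^@[:space:]]+$", x)
}

# Return a usable e-mail address, prompting only in interactive sessions.
ask_email <- function(email = NULL, ask = interactive()) {
  if (is.null(email) && ask) {
    message(
      "You do not have muscle installed/working locally on your machine.\n",
      "We can attempt to use the EBI webserver if you provide an email\n",
      "address (required by the EBI). Please see the EBI Terms of Use\n",
      "regarding fake e-mail addresses."
    )
    email <- readline("E-mail address: ")
  }
  if (!is_plausible_email(email)) {
    stop("A valid E-Mail address is required to use EMBL-EBI Web Service")
  }
  email
}
```

In a non-interactive session (or with `ask = FALSE`) the helper never prompts, so scripts fail fast with the same error message instead of hanging on input.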
