Sequence alignment of several structures

Issue #916 new
Former user created an issue

Dear members of the Grant Lab,

I'm trying to perform a structural PCA of around 8000 different structures from the same protein family, but I am having trouble in the very first steps: the alignment step within Bio3D seems to get stuck at some point. Would it be possible for me to align the sequences of these proteins outside Bio3D and then use the MSA information to do the structural alignment around a specified core? I haven't found a way to do this so far.

Thank you very much in advance for your attention.

Comments (6)

  1. Xinqiu Yao

    Yes, you can use pre-aligned sequences and load them into R to do other analyses with bio3d. The alignment should be in FASTA format. Then look at the functions read.fasta() and read.fasta.pdb(). Check the documentation and the tutorials on basic sequence/structure analysis and PCA, all available from http://thegrantlab.org/bio3d/

    Let me know if you still have problems.

  2. Fernando Augusto Teixeira Pinto Meireles

    Dear Xinqiu,

    Thank you very much for your reply! I was able to do the structural alignment using a combination of read.fasta() and read.fasta.pdb().

    I just have one more question: when I try to find a core for the alignment of my proteins, I get a message (for one of my larger MSA files) that the function was not able to find a single non-gap position in my alignment, though I’m sure that some positions are conserved in all sequences (I checked for “-” in these columns using awk and all of them are completely filled). Could I be missing something in my file, or does Bio3D have a special definition of a non-gap position?
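
One way to cross-check the awk result outside Bio3D is sketched below in Python. It is only an illustration under stated assumptions, not Bio3D's own logic: it assumes that "." counts as a gap character alongside "-", and that every sequence in the FASTA file must have the same aligned length. The helper names read_fasta and gap_free_columns are hypothetical and are not part of any library. The sketch inspects only the sequence alignment, so it cannot detect residues whose coordinates are missing from the structure files, which may be another reason a position is treated as a gap.

```python
from collections import Counter

# Assumption: both "-" and "." are treated as gap characters.
GAP_CHARS = frozenset("-.")


def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Start a new record; sequence lines are collected in a list.
            records.append((line[1:].strip(), []))
        elif records:
            records[-1][1].append(line)
    return [(header, "".join(parts)) for header, parts in records]


def gap_free_columns(seqs, gap_chars=GAP_CHARS):
    """Return the 0-based indices of columns with no gap in any sequence."""
    lengths = Counter(len(s) for s in seqs)
    if len(lengths) != 1:
        # Unequal lengths mean the file is not a valid alignment.
        raise ValueError(f"sequences differ in length: {dict(lengths)}")
    return [
        i
        for i, column in enumerate(zip(*seqs))
        if not any(c in gap_chars for c in column)
    ]


# Tiny made-up alignment, used only to show the call pattern.
example = """>a
AC-DE
>b
ACG.E
>c
AC-DE
"""
records = read_fasta(example)
cols = gap_free_columns([seq for _, seq in records])
print(cols)  # 0-based column indices: [0, 1, 4]
```

If a check of this kind reports gap-free columns while the core search still reports none, the difference is more likely to come from the coordinate data than from the FASTA file itself.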
