read.fasta.pdb ERROR
I am trying to renumber the residues of a pdb taking into account the alignment of its sequence with the uniprot sequence for this gene. I have used the following commands:
library(bio3d)
uni.seq <- get.seq("P51610")
file.name <- get.pdb("4GO6")
pdb <- read.pdb(file.name)
pdb.seq <- pdbseq(pdb, atom.select(pdb, chain="D", elety="CA"))
aln <- seqaln( seqbind( pdb.seq, uni.seq), id=c( file.name, unlist("P51610")) )
pdbs <- read.fasta.pdb(aln)
pdb_ali <- cbind( 1:ncol(pdbs$ali), t(pdbs$ali), pdbs$resno[1,])
I expect to have a table with 4 columns being:
- number of the residue respect the uniprot sequence;
- aminoacids of the pdb sequence;
- aminoacids of the uniprot sequence;
- number of the residue respect the pdb sequence --> they have to be the same as I get in the variable pdb.seq for every aminoacid.
The problem appears if I want to do it for a chain of the pdb that is not the first with the sequence used. In this case B and D are the same protein but have some differences in the aminoacids. If I do the same with the B chain, there aren't errors, but if I try to do it with chain D the comand --> read.fasta.pdb raise an error because the sequence of the alignment does not match the sequence in the pdb (it is checking chain B not D in the pdb)
ERROR:
ERROR Alignment mismatch. See alignment below for further details (row 1 of aln and sequence of './4GO6.pdb'). First mismatch residue in PDB is: ASP-1833 (B) occurring at alignment position: 1853
Comments (3)
-
-
First, this is not a bug.
The error explicitly tells you what happened: your sequence is from chain D but the pdb file you were trying to read is four chains, and thereby they mismatch.
To solve it, create pdb files based on single chains and read a chain-specific pdb file. For example,
file.name <- get.pdb("4GO6", split=TRUE) file.name [1] "./split_chain/4GO6_A.pdb" "./split_chain/4GO6_B.pdb" "./split_chain/4GO6_C.pdb" "./split_chain/4GO6_D.pdb" pdb <- read.pdb(file.name[4]) pdb.seq <- pdbseq(pdb) aln <- seqaln( seqbind( pdb.seq, uni.seq), id=c( file.name[4], unlist("P51610")) ) pdbs <- read.fasta.pdb(aln) resno <- rep(NA, ncol(pdbs$ali)) resno[!is.gap(pdbs$ali[1,])] <- trim(pdb, 'calpha')$atom$resno pdb_ali <- cbind( 1:ncol(pdbs$ali), t(pdbs$ali), resno) head(pdb_ali) ./split_chain/4GO6_D.pdb P51610 resno [1,] "1" "-" "M" NA [2,] "2" "-" "A" NA [3,] "3" "-" "S" NA [4,] "4" "-" "A" NA [5,] "5" "-" "V" NA [6,] "6" "-" "S" NA
-
- changed status to resolved
Q&A
- Log in to comment