project.pca doesn't work if structures to project are smaller than those used to make the PCA
Hello,
I have made a PCA using several PDB structures of my protein of interest and would now like to project frames from an MD simulation onto it (as done here http://thegrantlab.org/bio3d/articles/online/enma_vignettes/Bio3D_nma-dhfr-partI.html). The PCA was calculated with two domains of my protein, but one of my simulations has only one of the domains. When I try to project this simulation onto the PCA I get this error:
Error in project.pca(closed_tBamA_trj.fit[, closed_tBamA_inds$b$xyz], : Dimensionality mismatch: ncol(data)!=ncol(pca$U)
Is there any way to avoid this error, or is it impossible to project coordinates of a part of a protein onto a PCA calculated with the whole protein?
Comments (17)
-
-
But why can you make the initial PCA from a list of PDBs that come from different organisms and thus have different lengths, yet not project frames of an MD simulation onto it? Is there no way the function could do whatever the original PCA did to keep only the common parts of the structures?
-
The example in the tutorial you mentioned used the same dimension (i.e., the aligned C-alpha atoms) for both the PDBs and the MD. In your case, you want to project a single-domain MD onto a PCA done on two domains, which is impossible. You have to redo the PCA with just one domain and then project your one-domain MD trajectory. If you also have two-domain MD trajectories, you can project them as well by picking the residues from the MD that match the domain used in the PCA.
-
In other words, the coordinates used in PCA can be a subset of all data (trajectories) you want to project, but not the opposite.
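The rule above can be sketched in base R with synthetic data. This is only an illustration of the dimension check, not bio3d's own code: the `project()` helper, the matrix sizes, and the assumption that the PCA atoms are the first five atoms of the trajectory are all made up for the example.

```r
# Minimal sketch (base R only, random data): a PCA built on 5 atoms,
# and a trajectory that contains 8 atoms (a superset).
set.seed(1)
n_atoms_pca  <- 5
n_atoms_traj <- 8

ref  <- matrix(rnorm(20 * n_atoms_pca * 3), nrow = 20)   # 20 structures x 15 coords
traj <- matrix(rnorm(10 * n_atoms_traj * 3), nrow = 10)  # 10 frames x 24 coords

mean_xyz <- colMeans(ref)
U <- eigen(cov(ref))$vectors                             # 15 x 15 eigenvectors

# Hypothetical helper: refuses data whose columns do not match the PCA.
project <- function(data, U, mean_xyz) {
  if (ncol(data) != nrow(U)) stop("Dimensionality mismatch")
  sweep(data, 2, mean_xyz) %*% U
}

# Assumption: the PCA atoms correspond to atoms 1..5 of the trajectory.
atom_inds <- 1:5
xyz_inds  <- as.vector(rbind(3 * atom_inds - 2, 3 * atom_inds - 1, 3 * atom_inds))

# Works: the trajectory is cut down to the columns the PCA was built on.
scores <- project(traj[, xyz_inds, drop = FALSE], U, mean_xyz)

# Fails: the full trajectory has more columns than the PCA.
mismatch <- tryCatch({ project(traj, U, mean_xyz); FALSE },
                     error = function(e) TRUE)
```

Cutting a larger trajectory down to the PCA's columns is always possible; the reverse would require coordinates the trajectory does not contain.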
-
In my case the PCA was made from a curated list of PDBs that did not contain some accessory proteins, while the MD I wish to project onto the PCA includes these accessory proteins. The problem seems to be at the step where the trajectory is fitted to its own first frame, though.
-
It worked for one trajectory, but now I have another issue when trying to add a second trajectory. At the point where I'm aligning the first-frame PDB of the trajectory onto the curated set of PDBs, I use
apo_inds <- pdb2aln.ind(pdbs, pdb_apo, gaps.res$f.inds)
and it returns the error
In pdb2aln.ind(pdbs, pdb_apo) :Gaps are found in equivalent positions in PDB
What can I do to make it ignore the gaps? rm.gaps=TRUE does not work.
-
That tells you that one or more aligned positions (residues) in the original alignment used to calculate the PCA have no equivalent residue in the trajectory PDB.
There are two possible ways to solve it:
- Check the alignment manually (open the 'pdb2aln.fa' file using SEAVIEW, for example) and see if you can fix the gap problem by adjusting the alignment based on, e.g., structures.
Or
- Redo PCA using aligned (non-gap) positions that all have equivalent residues in the trajectory PDB.
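The second option can be sketched in base R on a toy alignment. This is an illustration only: the sequences are invented, the row named `traj` stands in for the trajectory PDB, and it assumes one C-alpha atom (three coordinates) per aligned position.

```r
# Toy alignment: two PDBs plus the trajectory PDB, "-" marks a gap.
ali <- rbind(
  pdbA = c("M", "K", "-", "L", "V"),
  pdbB = c("M", "K", "T", "L", "V"),
  traj = c("M", "K", "T", "-", "V")
)

# Keep only the positions where every row, including the trajectory,
# has a residue.
is_gap        <- ali == "-"
gap_free_cols <- which(colSums(is_gap) == 0)

# Map aligned positions to x, y, z coordinate columns (one atom per position).
xyz_cols <- as.vector(rbind(3 * gap_free_cols - 2,
                            3 * gap_free_cols - 1,
                            3 * gap_free_cols))
```

A PCA recomputed on `xyz_cols` only uses positions that exist in the trajectory, so the projection no longer hits a gap.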
-
I’m confused since the PDB in question is so similar to the other larger one which didn’t return this issue…
-
I also tried remaking the PCA with the PDB included, and aligning the PDB to that PCA still failed for the same reason.
-
If you included the PDB, then there shouldn’t be a problem. Can you provide a short example to reproduce the error?
-
sure.
pdb_apo <- read.pdb("path to file/apo.pdb") #works
dcd_apo <- read.dcd("path to file/apo.dcd") #works
pdbs <- pdbaln(files, fit=TRUE) #works
gaps.pos <- gap.inspect(pdbs$xyz) #works
gaps.res <- gap.inspect(pdbs$ali) #works
pc.xray <- pca(pdbs, core.find=TRUE) #works
apo_inds <- pdb2aln.ind(pdbs, pdb_apo, gaps.res$f.inds) #fails
Plotting PCA components 1 and 2 also works, as does projecting a different trajectory on top of it.
How can I send you the files?
-
I also tried the unwrapped version of making the PCA
-
You can attach files here or send them to me, xinqiu.yao@gmail.com. Make sure file sizes are not too large.
-
I'm just sending them now, all of the ones with ‘bilayer’ in the name were used to make the PCA
-
I'm trying various things now. Remaking the initial list of PDBs returns a new error:
Error in read.fasta.pdb(s, prefix = "", pdbext = "", pdblist = files, :
No corresponding PDB files found
But the files are the same as before; I haven't changed them.
-
- changed status to closed
Closed because a solution was provided by email but the user has not responded. Can be reopened if the issue persists.
-
- marked as task
Not a bug
It is impossible to project part of a protein to a PCA using the entire protein. The dimension must match.