cna_for_large_complex
Hi,
I am doing cna for a large multimeric complex. The problem is that it takes forever, already more than 4 days.
The ncore seems to not work. Only a single core is used.
Is it a normal behavior of the cna()? I am trying to figure out whether my installation is OK.
thanks
Comments (23)
-
reporter -
cna() does not work with multicores. It is slow for large systems, but 4 days sounds strange. What parameters do you use, such as cij.cutoff, whether using contact map, etc? Have you checked edges with pymol()? If too many edges, the calculation will be super slow because of community detection.
-
reporter Hey Xinqiu,
Here is my code:
library(bio3d) load("image.RData") cij_for_cna <- dccm(modes) net <- cna(cij_for_cna,cutoff.cij_for_cna = 0.35)
Right now I am running another calculations with previously filtered correlation matrix. This should speed up the calculations, but I am not sure about reliability of my results, because I am going to filter out with cutoff.cij=0.5, and then network analysis (cna()) with cutoff.cij=0.35. So, I am afraid of removing of to much detail. My matrix is 4684x4684.
Regarding edges in pymol. Generally pymol does not work for me. The protein is to big. I plan to use VMD, but first I have to reach the stage of accomplished cna.
Do you have any ideas?
-
If the matrix is already filtered with 0.5 cutoff, you do not need any further cutoff in cna(). And current calculation should be faster. Let me know.
You do not need to call cna() before checking edges. But if pymol doesn’t work on your machine, it is a problem. I am not sure there is a vmd version for this purpose but will have a look shortly.
-
reporter Ok. Thanks. I will let you know whether it worked for me.
-
I found there is no vmd related function for checking correlations yet. I have a in-house code to show cylinders in vmd that might do the job for you (see the attached). You need an input to define residue pairs between which you want to have an edge, and also the value of the edge to define the radius (e.g. correlation strength) and color (positive or negative correlation) of each edge. For example,
# Support you have a correlation matrix called `cij`, # and a pdb object matching the cij. source('vmd.cylinder.R') resno <- pdb$atom$resno[pdb$calpha] inds <- which(lower.tri(cij), arr.ind=TRUE) edges <- cbind(resno[inds[, 1]], resno[inds[, 2]], cij[lower.tri(cij)]) vmd.cylinder(edges, pdb, file='test', cutoff =0.5)
Then, open a file called 'test.vmd' in VMD. Blue and red cylinders show positive and negative correlations, respectively, and cylinder radius is proportional to correlation strength.
You might want to start with a high cutoff first, because otherwise it will take a long time to generate the file and to load in VMD as well.
One disadvantage of the script is that it cannot adjust cutoff directly from VMD, like what
pymol.dccm()
does in pymol, which hampers your experiment on different cutoff values a little bit.Hope it will help.
-
- attached vmd.cylinder.R
-
reporter THANK YOU.
I will try and I will let you know.
Best
-
reporter Hi,
I managed to make pymol running on my machine and the things look much better now. I could play with different cij cutoffs and see how my systems looks like. Nevertheless, thank you very much for VMD script. In a free time I will also try it.
I would have a question, regarding the filtering of dccm for cna. I checked your paper "Dynamic coupling and allosteric networks in the alpha subunit of heterotrimeric G proteins" and I could see that you first filtered cmap with dcut=10, and then you filtered correlations in your system, using this cmap and cutoff.cij >=0.6. Correct? If yes, one question: How exactly did you do it?
-
cmap(pdb,dcut=10) <-- how about values of scut and pcut ?
-
filtered <- filter.dccm(cij, cutoff.cij=0.6, cmap=cm)
Correct?
Thank you
-
-
The method we used for the paper was actually based on a simulation "ensemble", i.e. for each network we had 5 simulation replicates and network edges were determined by checking the robustness of strong correlations across replicates, as described in the 'Methods' part of the paper. Specifically,
cm <- cmap(xyz, dcut=10, scut=1, pcut=0.75) cij <- filter.dccm(cijs, cutoff.cij=0.6, cmap=cm)
where
xyz
is the coordiates of C-alpha atoms from simulation trajectories andcijs
is a 3D array containing 5 correlation matrices calculated from the 5 simulation replicates (The 3rd dimension indicates the replicate).Hope it helps.
-
reporter Very helpful. Thank you.
So, the cna was simply:
net <- cna(cijs, cutoff.cij = 0.0)
Correct?
-
Yes, but remember input should be a matrix rather than an array, i.e. “cij” instead of “cijs”
-
reporter Sure. Thank you.
I think that you can close this ticket.
Best regards
-
- changed status to resolved
-
reporter Hi,
This is a continuation, so I don't want to open a new ticket.
I filtered my cij and then I run cna(), using this cij and cmap.
My network:
COMMUNITY NODES#: 269 EDGES#: 50
max(net$communities$modularity) = 0.8812209
tree$num.of.comms[ max.mod.ind ] = [1] 269
all( net$communities$membership == tree$tree[max.mod.ind,] ) = [1] TRUE
As you could see my network is characterised by extremely high modularity. Furthermore, not all communities are connected, but this is acceptable. I know my system, so I know what I should expect. I looked at my tree (community.tree()) and I could see a clear pick in modularity, but not sure about the value. I think its between 8-9. I don't know how to export my plot(tree) to text file. I tried with write.table().
I looked at the system in vmd.cna(). When viewing in VMD I coarse grained my network by
radius = table(net$communities$membership)/35
and it looks good. Now, I want to get the same network, using plot.cna(). I mean, number, location and size of the communities, will be the same as in VMD. Is it possible?
Before plot.cna() I tried:
#! memb.k.X <- tree$tree[ tree$num.of.comms == X, ] new.net <- network.amendment(net, memb.k.X)
, where X were 8 or 9. Nevertheless, I am getting: Error in network.amendment(net, memb.k.X) : Input membership and x$community$membership must be of the same length. I get it, but how to coarse grain "net" to get 8 or 9 communities? I think that is the problem?
Also, I would have a general question. Could you advise me, which algorithm would be best or worth to try in order to get a network with smaller number of communities? As I said, the cutoff for cij is OK, so I don't think that you can get more using default algorithm. I tried combinations of different dcut and scut and the changes are marginal. The only thing, which matters here is cutoff for cij. I have also tried prune(), but did not work well for me. I went through all threads about cna on this forum, but I could not find an answer to my problems.
thanks
-
You should get exactly the same number and sizes of communities from
vmd.cna()
andplot.cna()
. However, the location might be different because while you can freely rotate the molecule in VMD, the layout inplot.cna()
is fixed by default. Tryplot.cna(..., interactive=TRUE)
and then you can manually adjust the location in the 2D plot to match that is shown in VMD.It is weird that you got an error when changing the number of communities using
network.amendment()
. Have you modifiednet
? Could you print the contents ofmemb.k.x
andnet$community$membership
and paste them here for checking?There are other ways to detect communities in
cna()
and definitely you should try to see if results get better. Seehelp(cna)
for more detail. But to simply reduce the number of communities, I would recommend usenetwork.amendment()
after resolving the problem.Let me know whenever you get any updates.
-
reporter Hi,
Thank you for your reply.
When I invoke "interactive=TRUE" I get this error message. It works normally without this flag: Error in if (is.character(col) && any(substr(col, 1, 1) == "#" & nchar(col) == : missing value where TRUE/FALSE needed
I tried on two different computers and operating system.
Currently working on your other comments.
thanks
-
Can you provide more info about your environment? (For example, copy&paste the output of
sessionInfo()
). You need latest bio3d and igraph, and also need an R supporting tcltk. A quick way to check is printingsessionInfo()
afterlibrary(bio3d)
andlibrary(igraph)
. -
reporter Sessioninfo():
**R version 3.4.1 (2017-06-30) Platform: x86_64-apple-darwin15.6.0 (64-bit) Running under: OS X El Capitan 10.11.3
Matrix products: default BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages: [1] stats graphics grDevices utils datasets methods base
other attached packages: [1] igraph_1.1.2 bio3d_2.3-3.9000
loaded via a namespace (and not attached): [1] compiler_3.4.1 magrittr_1.5 parallel_3.4.1 tools_3.4.1
[5] Rcpp_0.12.13 grid_3.4.1 pkgconfig_2.0.1 tcltk_3.4.1**
When I try:
plot.cna(net, layout=cent.full, full=TRUE, vertex.label=NA, vertex.size=5, interactive = TRUE)
Error in if (is.character(col) && any(substr(col, 1, 1) == "#" & nchar(col) == : missing value where TRUE/FALSE needed
thank you
-
Your R environment seems fine. Not sure if it is OS problems as I don't have a Mac to test. Can you find a Linux or Windows machine and try it again? I have tested both and they worked well.
Alternatively, you can send me an example and see if I can reproduce your errors. xinqiu.yao@gmail.com. Let me know.
-
reporter Hi,
Regarding the plot.cna(), restating the computer did the job. The flag interactive works. Also, the colouring, the number of communities and their size, are the same as in VMD. Nice:)
General question. Once you mentioned (one of the previous posts) that "To my own experience, a good proportion of edges (i.e. number of edges divided by total number of possible node pairs) is around 4%." By number of edges you meant: all edges of the network divided by number of all network nodes?
When using the option "interactive" I can't rotate my network using mouse. I mean, I can manually adjust location of communities, but not necessary select whole network and rotate as a one object. The only way of moving whole network is to use GUI options, e.g., move by 1, 5, 90 degrees. Is it normal behaviour? I want to make sure that "interactive" works correctly.
thank you
-
Glad you've solved the plot.cna() problem.
Regarding 4%, I meant number of edges divided by number of node pairs not just nodes. For example, for a 5-node system, the number of all possible node pairs should be 10. Note that this percentage is only a rough estimation and I cannot guarantee its applicability to many different systems. Be sure check network edges using e.g.
pymol.dccm
before running into deeper network analysis.Yes, it is normal. So far, I haven't seen an option to rotate the entire network by an arbitrary degree using mouse.
-
reporter OK. I get it.
You can close the ticket. In case of further problems, I will reopen again.
THANK YOU
- Log in to comment
I would be grateful if you can comment on it.
thanks