Calculation of 'significantly' stronger correlation couplings in CNA

Issue #768 closed
Former user created an issue

Hello Bio3D team, I have a question about calculating significantly stronger couplings in CNA, as done in the 'Comparative structural dynamic analysis of GTPases' PLoS Comput Biol paper and in the '' JBC paper.

I have n replicates (in my case 3 for each state), and I construct a CNA for each state of my protein, where the underlying DCCM is a consensus average that keeps only correlations with a value >= 0.6 in all replicates of a given state.

How should I calculate significantly stronger couplings between communities?

a) Should I consider only the communities that are common to the different states of my system (for example, communities with at least 90% residue overlap)?

b) Or should I take the residue membership of the wildtype state and impose it on the other states (so that my overall community count can be increased prior to performing the statistical test)?

c) Once I have a comparable 'consensus' community structure from either a) or b), performing a Wilcoxon test or t-test will only give me the significance of the difference between their distributions. How do I actually infer 'significantly' stronger or weaker 'community' couplings from this?

Comments (3)

  1. Xinqiu Yao

    Hi,

    First of all, 3 replicates might be too few to talk about statistical significance. I recommend increasing the number to at least 5.

    To me, both solutions a) and b) might be a little biased. For a "consensus", you might want to consider all the states, not just the wildtype. Also, you need identical community definitions for the comparison, so 90% overlap may still not be enough. In the papers, we derived the consensus communities by considering several factors, including the original community partition obtained directly from the calculations, secondary structures, conserved sequence motifs, etc. A more general method is to use the so-called "common contacts" across states and detect communities from the network defined by these contacts. (Communities can then be regarded as "invariant" subdomains, and the changes between communities are of primary interest to explore.) For example, see the paper: https://pubs.acs.org/doi/10.1021/acs.jcim.8b00250
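As a rough illustration of the "common contact" idea only, the sketch below keeps the residue pairs that are in contact in every state. It is not the procedure from the linked paper: the function name `common_contacts` and the input format (one square 0/1 contact map per state, as nested lists) are assumptions made for this example, and the community detection step on the resulting network is not shown.

```python
def common_contacts(contact_maps):
    """Return the residue pairs (i, j), i < j, that are in contact in every state.

    contact_maps: one square matrix per state (hypothetical input format),
    where a truthy entry [i][j] means residues i and j are in contact.
    """
    n = len(contact_maps[0])
    return {
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if all(cm[i][j] for cm in contact_maps)
    }
```

The returned pairs would serve as the edges of a single network shared by all states, so that one community partition can be detected once and then applied to every state.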

    No matter which method you choose, once you have a community partition, apply it to the correlation matrices of all states and all replicates. The difference for a pair of communities is calculated by summing all residue-residue correlations between the two communities, averaging that sum across replicates, and then subtracting the average of one state from that of the other. The significance test gives a p-value indicating how likely it is that such a difference occurs by chance (i.e., is not significant). Note that the samples in the significance test are the replicates, not the inter-community residue pairs. So, by applying some thresholds (e.g., p-value < 0.05 and fold change > 2), you can identify the edges whose average correlations differ significantly between states.
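The steps above can be sketched as follows, using only the Python standard library. This is a minimal sketch under stated assumptions, not the code used in the papers and not a Bio3D API: all function names are hypothetical, correlation matrices are nested lists, raw (signed) correlations are summed, the fold change is taken between the absolute state means, and an exact permutation test on the difference of means stands in for the Wilcoxon test or t-test mentioned in the question.

```python
from itertools import combinations
from statistics import mean


def community_coupling(cij, membership, a, b):
    """Sum of all residue-residue correlations between communities a and b."""
    rows = [i for i, m in enumerate(membership) if m == a]
    cols = [j for j, m in enumerate(membership) if m == b]
    return sum(cij[i][j] for i in rows for j in cols)


def permutation_pvalue(x, y):
    """Exact two-sided permutation test on the difference of means.

    The samples are per-replicate values, so every relabelling of the
    replicates between the two states is enumerated.
    """
    pooled = list(x) + list(y)
    n, m = len(x), len(y)
    observed = abs(mean(x) - mean(y))
    total = sum(pooled)
    hits = count = 0
    for idx in combinations(range(n + m), n):
        gx = sum(pooled[i] for i in idx)
        diff = abs(gx / n - (total - gx) / m)
        count += 1
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            hits += 1
    return hits / count


def compare_edge(reps_a, reps_b, membership, a, b, alpha=0.05, min_fold=2.0):
    """Compare one community-community edge between two states.

    reps_a, reps_b: lists of correlation matrices, one per replicate.
    """
    xa = [community_coupling(c, membership, a, b) for c in reps_a]
    xb = [community_coupling(c, membership, a, b) for c in reps_b]
    ma, mb = mean(xa), mean(xb)
    p = permutation_pvalue(xa, xb)
    low, high = sorted((abs(ma), abs(mb)))
    # Assumption: fold change is the ratio of absolute state means.
    fold = high / low if low > 0 else float("inf")
    return {
        "diff": ma - mb,
        "p": p,
        "fold": fold,
        "significant": p < alpha and fold > min_fold,
    }
```

With this particular exact test, the smallest two-sided p-value attainable with 3 replicates per state is 2/20 = 0.1, so the 0.05 threshold can never be reached, whereas 5 replicates per state allow values down to 2/252 (about 0.008). Calling `compare_edge` for every pair of communities and keeping the edges flagged as significant gives the set of couplings that differ between the two states; the sign of `diff` indicates in which state the coupling is stronger.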

    Hope it may help.
