LSA_output; several LS for same OTU pairs

Issue #10 invalid
Former user created an issue

Hi, I am having trouble understanding and analyzing the output of an LSA analysis. I ran 'lsa_compute' via 'par_ana' on a sequencing dataset as follows:

python ~/charade-elsa-80c7298487ce/lsa/par_ana.py Raw_abundance_OTUs_dom.txt Raw_abundance_OTUs_dom.lsa 'lsa_compute.py %s %s -e Raw_abundance_OTUs_dom.txt -r 1 -d 0 -s 7 -b 0 -n robustZ -p theo' $PWD

Here, I would like to analyze OTU co-occurrence across 7 sites (hence delay = 0).

After filtering the output files (based on p and q values), I noticed that some OTU pairs have several LS scores. For instance, I have:

X Y LS lowCI upCI Xs Ys Len Delay P PCC Ppcc SPCC Pspcc Dspcc SCC Pscc SSCC Psscc Dsscc Q Qpcc Qspcc Qscc Qsscc Xi Yi 
OTU_1380    OTU_12316   -1.083529   -1.083529   -1.083529   6   6   1   0   0.016419    -0.128134   0.80885 -0.128134   0.80885 0   -0.054772   0.917924    -0.054772   0.917924    0   0.005144    0.999432    0.999432    0.002234    0.002234    1   2826
OTU_1380    OTU_12316   1.069152    1.069152    1.069152    6   6   2   0   0.018619    0.620174    0.189003    0.620174    0.189003    0   0.602495    0.205611    0.602495    0.205611    0   0.005144    0.982799    0.982799    0.000564    0.000564    1   989
OTU_1380    OTU_12316   1.069152    1.069152    1.069152    6   6   2   0   0.018619    0.620174    0.189003    0.620174    0.189003    0   0.602495    0.205611    0.602495    0.205611    0   0.005144    0.982799    0.982799    0.000564    0.000564    1   98
OTU_1380    OTU_12316   1.069152    1.069152    1.069152    6   6   2   0   0.018619    0.620174    0.189003    0.620174    0.189003    0   0.602495    0.205611    0.602495    0.205611    0   0.005144    0.982799    0.982799    0.000564    0.000564    1   929
OTU_1380    OTU_12316   1.069152    1.069152    1.069152    6   6   2   0   0.018619    0.620174    0.189003    0.620174    0.189003    0   0.602495    0.205611    0.602495    0.205611    0   0.005144    0.982799    0.982799    0.000564    0.000564    1   920
OTU_1380    OTU_12316   1.069152    1.069152    1.069152    6   6   2   0   0.018619    0.620174    0.189003    0.620174    0.189003    0   0.602495    0.205611    0.602495    0.205611    0   0.005144    0.982799    0.982799    0.000564    0.000564    1   695
OTU_1380    OTU_12316   1.069152    1.069152    1.069152    6   6   2   0   0.018619    0.620174    0.189003    0.620174    0.189003    0   0.602495    0.205611    0.602495    0.205611    0   0.005144    0.982799    0.982799    0.000564    0.000564    1   664
OTU_1380    OTU_12316   1.069152    1.069152    1.069152    6   6   2   0   0.018619    0.620174    0.189003    0.620174    0.189003    0   0.602495    0.205611    0.602495    0.205611    0   0.005144    0.982799    0.982799    0.000564    0.000564    1   431

How should I interpret such results? In this example, all the results are ordered according to their p and q values. Here, the first line (best p and q values) has a negative LS score while the other ones are all positive. It is quite confusing. Best,

Comments (5)

  1. Charlie Xia repo owner

    Hi. Thanks for raising the issue. Can you attach a minimal input that replicates the error? In the meantime, if the duplicated lines are simply identical, you can filter them out and proceed with your project.

    Thank you.
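
The filtering suggested here can be sketched as follows. This is a minimal sketch, not part of the LSA package: it assumes the .lsa output is tab-delimited with the header shown in the issue, and it treats two rows as duplicates when they agree on every column except Yi (the only column that differs among the repeated rows in the example). The function name `drop_duplicate_rows` and the trimmed sample columns are hypothetical.

```python
import csv
import io

def drop_duplicate_rows(text, ignore=("Yi",)):
    """Keep the first row for each distinct combination of the
    columns that are not listed in `ignore` (assumed tab-delimited)."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = [name.strip() for name in next(reader)]
    keep_idx = [i for i, name in enumerate(header) if name not in ignore]
    seen = set()
    rows = []
    for row in reader:
        if not row:  # skip blank lines
            continue
        key = tuple(row[i].strip() for i in keep_idx)
        if key not in seen:
            seen.add(key)
            rows.append(row)
    return header, rows

# Abbreviated rows modelled on the example in the issue (columns trimmed).
sample = "\n".join("\t".join(r) for r in [
    ("X", "Y", "LS", "P", "Q", "Xi", "Yi"),
    ("OTU_1380", "OTU_12316", "-1.083529", "0.016419", "0.005144", "1", "2826"),
    ("OTU_1380", "OTU_12316", "1.069152", "0.018619", "0.005144", "1", "989"),
    ("OTU_1380", "OTU_12316", "1.069152", "0.018619", "0.005144", "1", "98"),
    ("OTU_1380", "OTU_12316", "1.069152", "0.018619", "0.005144", "1", "929"),
])
header, rows = drop_duplicate_rows(sample)
for row in rows:
    print("\t".join(row))
```

Rows that differ in any other column, such as the LS score, are kept, so this only removes exact repeats and does not decide between conflicting results.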

  2. jeff63

    Hi, and thank you for your reply. Here are the first lines of my input dataset:

    #OTU_ID t1      t2      t3      t4      t5      t6      t7
    OTU_12283       40      8       935     3       157     80      345
    OTU_9775        390     76      11      86      487     513     2
    OTU_751         4       38      33      18      49      3       1419
    OTU_11718       0       8       0       0       1       0       1554
    OTU_6787        12      83      607     523     3       170     158
    

    Indeed, it seems that most of the lines are duplicated (only the last output column, Yi, varies between them). My question concerns the case where I get two different results for the same OTU pair. By "different" I mean that the LS score can be either negative or positive for relatively similar p/q values. This happens only 15 times out of around 20000 edges, but I wonder which values I should trust, since the sign may change the biological meaning of such OTU pairs. In each case, I have only one line with a negative LS score against many duplicated lines with an identical positive LS score.

    Thanks again for your help.
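
The cases described here, where one OTU pair carries both a negative and a positive LS score, can be listed with a short script. This is a minimal sketch under assumptions: the rows have already been parsed into (X, Y, LS) tuples, and the function name `pairs_with_sign_conflict` and the second pair in the sample (OTU_A, OTU_B) are hypothetical. It only flags such pairs; it does not say which row to trust.

```python
from collections import defaultdict

def pairs_with_sign_conflict(rows):
    """rows: iterable of (x, y, ls) tuples.
    Return the sorted (x, y) pairs that have both a positive
    and a negative LS score among their rows."""
    signs = defaultdict(set)
    for x, y, ls in rows:
        ls = float(ls)
        if ls != 0:
            signs[(x, y)].add(ls > 0)  # True for positive, False for negative
    return sorted(pair for pair, seen in signs.items() if len(seen) == 2)

# LS values for OTU_1380/OTU_12316 are taken from the issue;
# the OTU_A/OTU_B pair is made up for illustration.
example = [
    ("OTU_1380", "OTU_12316", "-1.083529"),
    ("OTU_1380", "OTU_12316", "1.069152"),
    ("OTU_1380", "OTU_12316", "1.069152"),
    ("OTU_A", "OTU_B", "0.9"),
    ("OTU_A", "OTU_B", "0.9"),
]
conflicts = pairs_with_sign_conflict(example)
print(conflicts)
```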

  3. Charlie Xia repo owner

    lsa_compute ../test/multiLS.txt ../test/multiLS.lsa -r 1 -s 7 -d 3 -p theo -x 1000 -f none -n percentileZ -e ../test/multiLS.txt -m 0

    I tried your file with the above command and cannot replicate the error. Marking as invalid and closing.
