Thanks for the package, it’s been helpful to me as I learn to analyse TCR data from the 10X Genomics platform.
I’ve got a few questions about diversity tests at a fixed diversity order, some based on the vignette (https://alakazam.readthedocs.io/en/stable/vignettes/Diversity-Vignette/#view-diversity-tests-at-a-fixed-diversity-order) and some more general questions:
- The q=0 and q=2 plots look identical to my eye; should they be different?
- Do the three points on the plot correspond to (mean - sd, mean, mean + sd), and do the error bars around each point indicate some measure of bootstrap variation?
- Do you have any references for how the significance testing is actually implemented and/or justified? E.g., my understanding is that the bootstrapping is done to estimate the per-sample variability of the diversity curves, but it’s not clear to me how the values are actually compared between samples (e.g., t-test, Wilcoxon test, etc.).
- How do you recommend that the significance testing be extended to a situation with replicates? For example, say I have 2 conditions (infected and uninfected) and 3 biological replicates per condition (infected_1, infected_2, infected_3, uninfected_1, uninfected_2, uninfected_3; samples are not paired). Seemingly, the current implementation allows me either to do pairwise tests between all replicates by specifying group = "sample" (e.g., infected_1 vs. infected_2, …, infected_1 vs. uninfected_3, …, uninfected_2 vs. uninfected_3), or to compare the conditions by aggregating the data across the replicates by specifying the condition column instead (e.g., group = "condition"), but aggregating essentially ignores the biological variability (at least that’s my initial impression coming from a background of analysing gene expression data).
Thanks for any help or guidance you can provide.