Calculate clonal frequencies

Issue #99 resolved
Giancarlo mazzetti created an issue

Hi There,

Thank you very much for the bioinformatic pipeline. We are all very grateful for your work.

I have a small question: in my group we also work with clonal frequencies to detect highly expanded clones in the B cell repertoire.

Is there any way to calculate the clonal frequencies after the Change-O step? What I would like is a new column beside the detected CDR3 giving the frequency at which it is present in the total repertoire (for example 0.001, 0.01, 0.05, etc.).

Thank you in advance for your support.

Comments (2)

  1. Jason Vander Heiden

    Greetings @Giancarlo mazzetti,

    This can be done with alakazam::countClones. If you want to expand the results out to the full input data, you can do a left join to duplicate the abundance values across all members of a clone. For example:

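    library(alakazam)
    library(dplyr)
    # Count clone sizes and frequencies within each sample
    clones <- countClones(ExampleDb, groups="sample_id")
    # Copy the per-clone values onto every sequence of that clone
    db <- left_join(ExampleDb, clones, by=c("sample_id", "clone_id"))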
    

    This will add the same value for seq_count (number of sequences) and seq_freq (fraction of sequences within groups) to each row of the input data with the same sample_id and clone_id pair.

    Alternatively, if you have already subset the data to one row per clone, then you can just sort+cbind or do a standard inner join.
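
    The one-row-per-clone case can be sketched roughly as follows. This is only an illustration using plain dplyr on a made-up toy table, not the alakazam implementation: the values in db are hypothetical, and the frequencies are computed by hand so that the snippet runs without ExampleDb. The column names seq_count and seq_freq are chosen to mirror the countClones output.

    ```r
    library(dplyr)

    # Toy data standing in for a Change-O table (hypothetical values).
    db <- tibble(
      sample_id   = c("S1", "S1", "S1", "S1", "S2", "S2"),
      clone_id    = c("1", "1", "1", "2", "3", "4"),
      sequence_id = paste0("seq", 1:6)
    )

    # Per-sample clone sizes and frequencies, computed by hand with dplyr.
    clones <- db %>%
      count(sample_id, clone_id, name = "seq_count") %>%
      group_by(sample_id) %>%
      mutate(seq_freq = seq_count / sum(seq_count)) %>%
      ungroup()

    # Keep the first row of each clone, then do a standard inner join.
    clone_db <- db %>%
      distinct(sample_id, clone_id, .keep_all = TRUE) %>%
      inner_join(clones, by = c("sample_id", "clone_id"))
    ```

    Here clone_db has one row per sample_id and clone_id pair, with the abundance columns attached. Since every clone in the reduced table also appears in clones, an inner join and a left join should give the same result in this case.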

    More complicated statistical approaches are here:

    https://alakazam.readthedocs.io/en/stable/vignettes/Diversity-Vignette/
