Inconsistency in 1-mer substitution matrices

Issue #73 resolved
Julian Zhou created an issue

HH_S1F, obtained by normalizing HS1FDistance (which is symmetric) by rows, is symmetric.

However, HKL_S1F and MK_RS1NF, produced by createSubstitutionMatrix(...returnModel="1mer"...), and reported in Cui et al., 2016, are not symmetric.

Need to find out why and update HH_S1F accordingly.

The previous documentation for HKL_S1F and MK_RS1NF, which described them as symmetric, has been fixed.
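
A quick way to see why row normalization kept HH_S1F symmetric: dividing each row of a symmetric matrix by its own sum preserves symmetry only when all row sums are equal. The sketch below is illustrative Python, not shazam code. It assumes the "1/Symmetrized (for distance)" table from the comments stands in for HS1FDistance, and the helper names `is_symmetric` and `normalize_rows` are made up for this example.

```python
# Nucleotide order for every matrix below: A, C, G, T.
# Values copied from the "1/Symmetrized (for distance)" table in the comments;
# treating it as a stand-in for HS1FDistance is an assumption.
distance = [
    [0.000000, 2.080312, 1.000000, 1.751606],
    [2.080312, 0.000000, 1.751606, 1.000000],
    [1.000000, 1.751606, 0.000000, 2.080312],
    [1.751606, 1.000000, 2.080312, 0.000000],
]

# "Normalized" table from the comments, used here only as a non-symmetric example.
normalized_example = [
    [0.0000000, 0.2442934, 0.5075163, 0.2219227],
    [0.3057120, 0.0000000, 0.3248893, 0.4944897],
    [0.4785709, 0.3030779, 0.0000000, 0.2835875],
    [0.2157171, 0.4526287, 0.1675943, 0.0000000],
]


def is_symmetric(matrix, tol=1e-9):
    """Return True if matrix[i][j] equals matrix[j][i] within tol for all i, j."""
    n = len(matrix)
    return all(
        abs(matrix[i][j] - matrix[j][i]) <= tol
        for i in range(n)
        for j in range(i + 1, n)
    )


def normalize_rows(matrix):
    """Divide each row by its own sum so that every row sums to 1."""
    return [[value / sum(row) for value in row] for row in matrix]


# Every row holds the same three non-zero values, so all row sums agree.
row_sums = [sum(row) for row in distance]
hh_s1f_like = normalize_rows(distance)

print(is_symmetric(distance))            # True
print(is_symmetric(hh_s1f_like))         # True: equal row sums keep symmetry
print(is_symmetric(normalized_example))  # False
```

Matrices built from raw substitution counts generally have unequal row sums, so row normalization alone would not be expected to give a symmetric result.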

Comments (7)

  1. Jason Vander Heiden

    Also, distances are calculated from targeting values for 5-mer models and from substitution values for 1-mer models.

  2. Jason Vander Heiden
    1. Rebuild HH_S1F from the Yaari et al., 2013 S5F model (or Namita's email).
    2. Delete M1N. Update docs of MK_RS1NF to indicate it replaces M1N.
  3. Jason Vander Heiden

    From @guryaari:

    # HS1F (all patients combined):
          A     C     G     T
    A     0 25080 39261 17697
    C 21319     0 26449 39500
    G 81060 51891     0 26768
    T 17620 39261 22516     0
    
    # Normalized:
              A         C         G         T
    A 0.0000000 0.2442934 0.5075163 0.2219227
    C 0.3057120 0.0000000 0.3248893 0.4944897
    G 0.4785709 0.3030779 0.0000000 0.2835875
    T 0.2157171 0.4526287 0.1675943 0.0000000
    
    # Symmetrized:
              A         C         G         T
    A 0.0000000 0.2343033 0.4874240 0.2782727
    C 0.2343033 0.0000000 0.2782727 0.4874240
    G 0.4874240 0.2782727 0.0000000 0.2343033
    T 0.2782727 0.4874240 0.2343033 0.0000000
    
    # 1/Symmetrized (for distance):
             A        C        G        T
    A 0.000000 2.080312 1.000000 1.751606
    C 2.080312 0.000000 1.751606 1.000000
    G 1.000000 1.751606 0.000000 2.080312
    T 1.751606 1.000000 2.080312 0.000000
    
  4. Julian Zhou reporter

    @javh For HS1FDistance and HH_S1F, is the goal to delete and replace HS1FDistance with the updated version of HH_S1F (just like we're deleting and replacing M1NDistance with MK_RS1NF)?

  5. Julian Zhou reporter

    This commit: 1) updated HH_S1F (now non-symmetric); 2) deleted HS1FDistance and replaced it with the updated HH_S1F, noting the replacement in HH_S1F's documentation; 3) deleted M1NDistance and replaced it with MK_RS1NF, noting the replacement in MK_RS1NF's documentation.
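
For anyone cross-checking the tables quoted from @guryaari in comment 3, here is a minimal Python sketch (not shazam code; all function names are hypothetical). It rests on two observations that should be treated as assumptions rather than as the documented procedure: the "Normalized" table appears to match the transpose of the row-normalized counts, so its columns rather than its rows sum to 1, and the "1/Symmetrized" table appears to be the reciprocal rescaled so that the smallest non-zero distance is 1. The symmetrization step itself is not reproduced, since the thread does not state how it was done.

```python
# Nucleotide order for every matrix below: A, C, G, T.
# All four tables are copied from comment 3.
counts = [
    [0, 25080, 39261, 17697],
    [21319, 0, 26449, 39500],
    [81060, 51891, 0, 26768],
    [17620, 39261, 22516, 0],
]
reported_normalized = [
    [0.0000000, 0.2442934, 0.5075163, 0.2219227],
    [0.3057120, 0.0000000, 0.3248893, 0.4944897],
    [0.4785709, 0.3030779, 0.0000000, 0.2835875],
    [0.2157171, 0.4526287, 0.1675943, 0.0000000],
]
symmetrized = [
    [0.0000000, 0.2343033, 0.4874240, 0.2782727],
    [0.2343033, 0.0000000, 0.2782727, 0.4874240],
    [0.4874240, 0.2782727, 0.0000000, 0.2343033],
    [0.2782727, 0.4874240, 0.2343033, 0.0000000],
]
reported_distance = [
    [0.000000, 2.080312, 1.000000, 1.751606],
    [2.080312, 0.000000, 1.751606, 1.000000],
    [1.000000, 1.751606, 0.000000, 2.080312],
    [1.751606, 1.000000, 2.080312, 0.000000],
]


def row_normalize(matrix):
    """Divide each row by its own sum."""
    return [[value / sum(row) for value in row] for row in matrix]


def transpose(matrix):
    """Swap rows and columns."""
    return [list(column) for column in zip(*matrix)]


def reciprocal_distance(matrix):
    """Invert non-zero entries, scaled so the largest rate maps to distance 1."""
    peak = max(value for row in matrix for value in row)
    return [[peak / value if value > 0 else 0.0 for value in row] for row in matrix]


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two matrices."""
    return max(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


candidate_normalized = transpose(row_normalize(counts))
candidate_distance = reciprocal_distance(symmetrized)

# Both differences should be at the level of the printed rounding.
print(max_abs_diff(candidate_normalized, reported_normalized))
print(max_abs_diff(candidate_distance, reported_distance))
```

If the transpose reading is right, whether a given 1-mer matrix is meant to be read row-wise or column-wise is worth stating explicitly in its documentation.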
