Mutual information doesn't seem to work for Markov chains

Issue #63 resolved
Samuel Cheng created an issue

Please see the simple example below.

=====

import lea
from lea import markov

weather = markov.chain_from_matrix(('sunny','rainy'),
('sunny',(0.95, 0.05)),
('rainy',(0.8, 0.2)))

today=weather.next_state('sunny') # weather today, yesterday was sunny
tomorrow=today.next_state() # weather tomorrow
lea.mutual_information(today,tomorrow)

====

The computed mutual information is nearly zero, but today's weather and tomorrow's weather should be highly correlated.

Comments (13)

  1. Pierre Denis repo owner

    Hi Samuel,

    Interesting issue! I agree that this is not the expected result. The root cause is that, in the lea.markov design, the calculated distributions have no interdependency. In the test above, today and tomorrow are independent distributions, so their mutual information is zero (or something very close to zero, due to float rounding errors). Admittedly, this is undocumented and counterintuitive compared with Lea's other constructs.
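
    To illustrate the point, here is a minimal sketch using plain lea.pmf objects (not the Markov chain objects): two independently constructed distributions give a mutual information of (nearly) zero, even when their marginals match those of today and tomorrow.

    import lea

    # marginal of today, one step after a certain 'sunny' state
    today_i = lea.pmf({'sunny': 0.95, 'rainy': 0.05})
    # marginal of tomorrow (0.95*0.95 + 0.05*0.80 = 0.9425 for 'sunny')
    tomorrow_i = lea.pmf({'sunny': 0.9425, 'rainy': 0.0575})

    # the two objects share no dependency, so the MI is zero up to rounding
    print(lea.mutual_information(today_i, tomorrow_i))
    # -> 0.0 (or a tiny value, due to float rounding)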

    Here is a provisional workaround.

    >>> tomorrow2 = weather._next_state_tlea.given(weather.state==today)
    >>> tomorrow2
    rainy : 0.0575
    sunny : 0.9425
    

    This is actually the same distribution as tomorrow, but with an internal representation that keeps the dependency on today. Now you get:

    >>> lea.mutual_information(today, tomorrow2)
    0.00926634104120394
    

    This is still low but much higher than the previous result. The value may seem surprising, but it can be correct given that there is little uncertainty about a sunny day tomorrow, whatever today's weather is. Actually, the expanded formula is the following:

    >>> today.entropy + tomorrow2.entropy - lea.joint(today,tomorrow2).entropy
    0.00926634104120394
    

    Could you please verify this value on your side, by other means?
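
    For reference, here is a small pure-Python sketch (not using Lea at all) that recomputes the same value directly from the transition probabilities, using MI = sum over (x,y) of p(x,y) * log2( p(x,y) / (p(x)*p(y)) ):

    from math import log2

    # P(today), one step after a certain 'sunny' state
    p_today = {'sunny': 0.95, 'rainy': 0.05}
    # transition probabilities P(tomorrow | today)
    p_next = {'sunny': {'sunny': 0.95, 'rainy': 0.05},
              'rainy': {'sunny': 0.80, 'rainy': 0.20}}

    # joint distribution P(today, tomorrow) and marginal P(tomorrow)
    p_joint = {(t0, t1): p_today[t0] * p_next[t0][t1]
               for t0 in p_today for t1 in p_today}
    p_tomorrow = {t1: sum(p_joint[t0, t1] for t0 in p_today) for t1 in p_today}

    mi = sum(p * log2(p / (p_today[t0] * p_tomorrow[t1]))
             for (t0, t1), p in p_joint.items())
    print(mi)   # -> 0.00926634... (consistent with the value above)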

    I need some time to think about a sensible fix in Lea. This is not that easy because it involves some design choices in the lea.markov module. By the way, the tomorrow2 object above is not aware of any Markov chain and, in particular, it does not support the next_state method.

  2. Samuel Cheng reporter

    Pierre, thanks for the quick reply!

    It seems the number is somewhat off, though. I tried to create a steady-state weather distribution by

    today=weather.next_state('sunny',1000)
    

    Then I explicitly computed tomorrow's weather as

    tomorrow = today.switch({'sunny':lea.pmf({'sunny': 0.95, 'rainy': 0.05}),'rainy':lea.pmf({'sunny': 0.8, 'rainy': 0.2})})
    

    Then

    >>> lea.mutual_information(today,tomorrow)
    0.010740523089006304
    

    I also tried to compute the conditional entropy explicitly and to compute the mutual information I(W1;W0) as H(W1) - H(W1|W0). The result seems to be consistent with the above.

    import lea
    from lea import P
    
    today=weather.next_state('sunny',1000)
    
    # H(W1|W0)=P('sunny')H(W1|W0='sunny')+P('rainy')H(W1|W0='rainy')
    cond_entropy = P(today=='sunny')*lea.event(0.95).entropy \
                    + P(today=='rainy')*lea.event(0.2).entropy
    
    # I(W1;W0)=H(W1)-H(W1|W0)
    print(f'Mutual information = {today.entropy - cond_entropy}')
    

  3. Pierre Denis repo owner

    Great! You found another workaround and also a different way to calculate the MI (I was not aware of this formula). By the way, in case you did not know it, Lea provides a conditional entropy method out of the box.

    >>> today.entropy - today.cond_entropy(tomorrow)
    0.010740523089006304
    

    All these results are consistent, which is good news! As stated in my reply above, I'm currently thinking about a definitive fix, without side effects or backward incompatibilities. Stay tuned...

  4. Pierre Denis repo owner

    This is now fixed… after quite deep changes in the lea.markov module (this was expected).

    weather = markov.chain_from_matrix(('sunny','rainy'),
                              ('sunny',(  0.95 ,  0.05 )),
                              ('rainy',(  0.80 ,  0.20 )))
    today = weather.next_state('sunny',1000)
    tomorrow = today.next_state()
    lea.mutual_information(today,tomorrow)
    # -> 0.010740523089006304
    

    The two variables now keep their dependency. This can also be seen, for example, by calculating the joint distribution of transitions:

    today + " -> " + tomorrow
    #-> rainy -> rainy : 0.011764705882352943
    #   rainy -> sunny : 0.04705882352941177
    #   sunny -> rainy : 0.04705882352941177
    #   sunny -> sunny : 0.8941176470588236
    

    (this expression gives a different result in Lea 3.4.0 or below).
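
    As a sanity check, here is a small pure-Python sketch (not using Lea, and assuming that the chain has effectively reached its stationary distribution after 1000 steps) that recomputes both the joint transition probabilities and the MI value from the transition matrix:

    from math import log2

    # transition probabilities P(tomorrow | today)
    p_next = {'sunny': {'sunny': 0.95, 'rainy': 0.05},
              'rainy': {'sunny': 0.80, 'rainy': 0.20}}

    # stationary distribution: pi['sunny'] * 0.05 = pi['rainy'] * 0.80
    pi = {'sunny': 0.80 / 0.85, 'rainy': 0.05 / 0.85}   # i.e. 16/17 and 1/17

    # joint distribution of transitions P(today, tomorrow)
    p_joint = {(t0, t1): pi[t0] * p_next[t0][t1] for t0 in pi for t1 in pi}
    print(p_joint)   # matches the four probabilities listed above

    # mutual information between today and tomorrow at steady state
    p_tomorrow = {t1: sum(p_joint[t0, t1] for t0 in pi) for t1 in pi}
    mi = sum(p * log2(p / (pi[t0] * p_tomorrow[t1]))
             for (t0, t1), p in p_joint.items())
    print(mi)   # -> about 0.01074, consistent with lea.mutual_information above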

    This fix will be included in Lea 3.4.1, to be released very soon.

  5. Pierre Denis repo owner

    Note, however, that the fix introduces a side effect: for some next_state calls, if n is too large, a RecursionError may be raised.

    weather.next_state(weather.state,1000)
    #-> lea.lea.Error: RecursionError raised - HINT: decrease the value of n in next_state() call or add argument keeps_dependency=False, provided that keeping dependency with initial state is not required
    

    There are two algorithms for next_state: the default one keeps dependencies when required, which is more demanding in resources. The previous example with today started from the certain initial state 'sunny', which allows a simpler algorithm to be used because no dependency is involved with a certain event. Here, the initial state is weather.state, which may require keeping the dependency (as when calculating tomorrow above).

    If only the distribution has to be calculated, without keeping track of the dependency (i.e. no need to calculate mutual information, joint distributions, conditional probabilities, etc.), the workaround is to add the argument keeps_dependency=False:

    weather.next_state(weather.state,1000,keeps_dependency=False)
    sunny : 0.9411764705882353
    rainy : 0.058823529411764705
    

    or simply:

    weather.next_state(weather.state,1000,False)
    

    This gives the correct result… but calling mutual_information afterwards with this distribution will return 0 (or a value close to it), as before the bug fix.
