Check for Independence in Joint Distributions; Two independent variables are not independent?

Comments (4)

Pierre Denis repo owner
Hi, sorry for the late answer. For some unknown reason, I was not notified by e-mail about this issue.

Actually, the formula P(AB) = P(A)*P(B) compares probability values, where A and B are events. Your last statement compares probability distributions, which is something else. Furthermore, the two distributions you compare have no support values in common:
- lea.joint(die_1, die_2) is the probability distribution over the 36 combinations (1,1), (1,2), …, (6,6), each having probability 1/36
- die_1 * die_2 is the probability distribution of the product of the two dice, with values ranging from 1x1=1 to 6x6=36 and a non-uniform probability distribution

Since the values of the two distributions differ in type (tuples vs. integers), they can never be equal, hence the answer False : 1.0
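
To make the difference between the two supports concrete, here is a minimal sketch in plain Python. It does not use Lea at all; the names `joint` and `prod_dist` are only illustrative, and the two distributions are rebuilt by hand with exact fractions.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)
pairs = list(product(faces, faces))  # the 36 equally likely outcomes

# Joint distribution: the support is the set of 36 tuples (a, b).
joint = {pair: Fraction(1, 36) for pair in pairs}

# Product distribution: the support is the set of integers a * b.
counts = Counter(a * b for a, b in pairs)
prod_dist = {value: Fraction(n, 36) for value, n in counts.items()}

print(len(joint), len(prod_dist))    # 36 18
print(set(joint) & set(prod_dist))   # set()
print(prod_dist[6], prod_dist[36])   # 1/9 1/36
```

The two supports are disjoint (tuples on one side, integers on the other), so an equality test between the two distributions can only be false.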

Now, you have several ways to get evidence of independence. Here are some checks based on specific values (you can change these values as you wish):
```
>>> P = lea.P
>>> P((die_1==5)&(die_2==6)) == P(die_1==5) * P(die_2==6)
True
>>> P((die_1==4).given(die_2==3)) == P(die_1==4)
True
```
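
The same factorization can be checked for every pair of values rather than for a few hand-picked ones. The sketch below is plain Python without Lea; `is_independent` is a hypothetical helper written for this illustration, and it assumes the joint distribution is given as a dict mapping `(a, b)` tuples to exact probabilities.

```python
from fractions import Fraction
from itertools import product

def is_independent(joint):
    """Return True if P(a, b) == P(a) * P(b) for every pair of values."""
    p_a, p_b = {}, {}
    # Marginal distributions, obtained by summing the joint probabilities.
    for (a, b), p in joint.items():
        p_a[a] = p_a.get(a, 0) + p
        p_b[b] = p_b.get(b, 0) + p
    # Pairs missing from the joint distribution have probability 0.
    return all(joint.get((a, b), 0) == p_a[a] * p_b[b]
               for a, b in product(p_a, p_b))

faces = range(1, 7)
two_dice = {(a, b): Fraction(1, 36) for a, b in product(faces, faces)}
print(is_independent(two_dice))  # True
```

Using `Fraction` keeps the comparison exact, so no tolerance is needed.
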
If you want to avoid such tedious tests, you can also use information theory (see wiki page 3):
```
>>> lea.mutual_information(die_1,die_2)
2.6645352591003757e-15
>>> lea.joint_entropy(die_1,die_2) - die_1.entropy - die_2.entropy
-2.6645352591003757e-15
```
These small values are to be interpreted as 0.0 (rounding errors are unavoidable with float representation). As an exercise, if you want to see what happens with dependent variables, you could redefine die_2 as
```
die_2 = 7 - die_1
```
Re-evaluating the previous expressions should then give different results, since the two dice are now (strongly) interdependent.
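
For reference, the contrast between the two cases can be reproduced without Lea. The sketch below computes mutual information in bits directly from its definition; `mutual_information_bits` is a hypothetical helper written for this illustration, not Lea's `lea.mutual_information`, and it assumes the joint distribution is a dict mapping `(a, b)` tuples to exact probabilities.

```python
from fractions import Fraction
from itertools import product
from math import log2

def mutual_information_bits(joint):
    """Sum of p(a,b) * log2(p(a,b) / (p(a) * p(b))) over the support."""
    p_a, p_b = {}, {}
    for (a, b), p in joint.items():
        p_a[a] = p_a.get(a, 0) + p
        p_b[b] = p_b.get(b, 0) + p
    return sum(float(p) * log2(p / (p_a[a] * p_b[b]))
               for (a, b), p in joint.items() if p > 0)

faces = range(1, 7)
# Two independent dice: 36 pairs, each with probability 1/36.
independent = {(a, b): Fraction(1, 36) for a, b in product(faces, faces)}
# Dependent case from the exercise: the second die is 7 minus the first.
mirrored = {(a, 7 - a): Fraction(1, 6) for a in faces}

print(f"{mutual_information_bits(independent):.3f}")  # 0.000
print(f"{mutual_information_bits(mirrored):.3f}")     # 2.585
```

In the dependent case the mutual information equals log2(6), the full entropy of one die, because knowing one die determines the other. In this sketch the ratio inside the logarithm is an exact fraction, so the independent case comes out as exactly zero rather than a value of order 1e-15.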

Hope this helps, despite the delay! Do not hesitate to submit more complex cases.
- 2019-06-19T15:05:09+00:00
Pierre Denis repo owner
- changed status to open
- 2019-06-19T15:08:05+00:00
Pierre Denis repo owner
- assigned issue to Pierre Denis
- 2019-06-19T15:09:06+00:00
Pierre Denis repo owner
- changed status to invalid
- 2019-07-01T20:51:15+00:00