Intervals and handling of infinite support (e.g. Poisson, geometric distributions)

Issue #17 open
Pierre Denis repo owner created an issue

Lea is unable to model a probability distribution with an infinite set of values, such as the Poisson and geometric distributions. The current Lea.poisson method returns an approximation in which the probabilities are overestimated so that they sum to 1.
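To make the overestimation concrete, here is a minimal sketch of one way such an approximation can arise: cut the Poisson probabilities off at some maximum value, then rescale them to sum to 1. This is an assumption for illustration, not Lea's actual code; the function name truncated_poisson and the cutoff of 10 are hypothetical.

```python
from math import exp, factorial

def truncated_poisson(mean, max_value):
    """Illustrative only (not Lea's code): Poisson probabilities for
    0..max_value, returned both rescaled to sum to 1 and unscaled."""
    raw = [exp(-mean) * mean ** k / factorial(k) for k in range(max_value + 1)]
    total = sum(raw)  # below 1, because the tail beyond max_value is missing
    rescaled = [p / total for p in raw]
    return rescaled, raw

approx, true_pmf = truncated_poisson(2, 10)

# Every rescaled probability exceeds the true one, since total < 1.
print(all(a > t for a, t in zip(approx, true_pmf)))
```

The mass of the missing tail is spread over the retained values, which is why no single probability in the approximation is exact.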

A better solution would be to accept, besides "normal" scalar values, special value objects representing intervals of values, possibly unbounded, i.e. with an infinite upper or lower bound. For example, here is how a geometric distribution with parameter 1/4 could be modelled:

   1 : 68719476736/274877906944 
   2 : 51539607552/274877906944 
   3 : 38654705664/274877906944 
   4 : 28991029248/274877906944 
   5 : 21743271936/274877906944 
   6 : 16307453952/274877906944 
   7 : 12230590464/274877906944 
   8 :  9172942848/274877906944 
   9 :  6879707136/274877906944 
  10 :  5159780352/274877906944 
  11 :  3869835264/274877906944 
  12 :  2902376448/274877906944 
  13 :  2176782336/274877906944 
  14 :  1632586752/274877906944 
  15 :  1224440064/274877906944 
  16 :   918330048/274877906944 
  17 :   688747536/274877906944 
  18 :   516560652/274877906944 
  19 :   387420489/274877906944 
(20+):  1162261467/274877906944

The last entry represents the interval from 20 to +infinity, with the total probability of all values in that interval. The following expressions would then return exact results:

>>> g.p(19)
387420489/274877906944
>>> (g >= 19).p(True)
387420489/68719476736
>>> (g >= 20).p(True)
1162261467/274877906944
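The table and the three exact results above can be reproduced with the standard fractions module. This is only a sketch of the arithmetic, not of the proposed Lea feature; the helper name geometric_with_tail is hypothetical.

```python
from fractions import Fraction

def geometric_with_tail(p, last_exact):
    """Exact probabilities for 1..last_exact, plus the total probability
    of the open-ended interval starting at last_exact + 1."""
    q = 1 - p
    exact = {k: q ** (k - 1) * p for k in range(1, last_exact + 1)}
    tail = q ** last_exact  # P(X >= last_exact + 1)
    return exact, tail

exact, tail = geometric_with_tail(Fraction(1, 4), 19)

print(exact[19])         # P(g == 19): 387420489/274877906944
print(exact[19] + tail)  # P(g >= 19): 387420489/68719476736
print(tail)              # P(g >= 20): 1162261467/274877906944
```

The exact entries and the tail entry together sum to exactly 1, so nothing needs to be rescaled.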

The difficulty is to cope with values lying inside the interval. The easy way would be to raise an exception ("unknown probability"). A more sensible approach would be to raise an exception with a message giving the range in which the probability lies.

>>> g.p(20)
ERROR: unknown probability in the range [0, 1162261467/274877906944]
>>> g.p(21)
ERROR: unknown probability in the range [0, 1162261467/274877906944]
>>> (g > 20).p(True)
ERROR: unknown probability in the range [0, 1162261467/274877906944]
>>> (g <= 20).p(True)
ERROR: unknown probability in the range [273715645477/274877906944, 1]

The implementation of this approach, however, is expected to be difficult.
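For the restricted case of range queries on this one geometric table, the bounds can be sketched as follows. Everything here is an assumption for illustration: the exception class UnknownProbability and the function prob_between are hypothetical names, not part of Lea, and the sketch does not address the general problem of arbitrary expressions on such distributions.

```python
from fractions import Fraction

class UnknownProbability(Exception):
    """Hypothetical error carrying the range in which the probability lies."""
    def __init__(self, low, high):
        super().__init__(f"unknown probability in the range [{low}, {high}]")
        self.low, self.high = low, high

# Same table as in the issue: exact entries for 1..19, one entry for (20+).
P, Q, TAIL_START = Fraction(1, 4), Fraction(3, 4), 20
EXACT = {k: Q ** (k - 1) * P for k in range(1, TAIL_START)}
TAIL = Q ** (TAIL_START - 1)

def prob_between(low, high=None):
    """P(low <= X <= high); high=None means no upper bound."""
    known = sum((pr for v, pr in EXACT.items()
                 if v >= low and (high is None or v <= high)), Fraction(0))
    if high is not None and high < TAIL_START:
        return known          # the query never touches the tail interval
    if high is None and low <= TAIL_START:
        return known + TAIL   # the query contains the whole tail interval
    # The query covers only part of the tail: only bounds are known.
    raise UnknownProbability(known, known + TAIL)

print(prob_between(19))       # exact, like (g >= 19).p(True)
try:
    prob_between(1, 20)       # like (g <= 20).p(True)
except UnknownProbability as err:
    print("ERROR:", err)
```

The lower bound assumes none of the tail's probability falls inside the query, and the upper bound assumes all of it does.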
