Intervals and support for infinite sets of values (e.g. Poisson, geometric distributions)
Lea is unable to model a probability distribution having an infinite set of values, such as the Poisson and geometric distributions. The current Lea.poisson method returns an approximation, in which the probabilities are overestimated so that they sum to 1.
A better solution would be to accept, besides "normal" scalar values, special value objects representing intervals of values, possibly unbounded, i.e. with an infinite upper or lower bound. For example, here is how a geometric distribution with success probability 1/4 could be modelled:
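To illustrate the overestimation, here is a minimal sketch in plain Python, independent of Lea. How Lea.poisson actually truncates and rescales is not stated in this issue, so the cut-off `n_max` and the helper `truncated_poisson` are assumptions made for illustration only.

```python
from math import exp, factorial

def truncated_poisson(mean, n_max):
    """Poisson pmf cut off at n_max (hypothetical cut-off), then rescaled to sum to 1."""
    raw = [exp(-mean) * mean ** k / factorial(k) for k in range(n_max + 1)]
    total = sum(raw)  # below 1, because the mass beyond n_max is missing
    scaled = [p / total for p in raw]
    return raw, scaled

raw, scaled = truncated_poisson(2.0, 10)
# Dividing by a total below 1 pushes every probability above its true value.
overestimated = all(s > r for r, s in zip(raw, scaled))
```

The rescaled list sums to 1 (up to floating-point rounding), but no entry is the true Poisson probability any more, which is the shortcoming described above.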
1 : 68719476736/274877906944
2 : 51539607552/274877906944
3 : 38654705664/274877906944
4 : 28991029248/274877906944
5 : 21743271936/274877906944
6 : 16307453952/274877906944
7 : 12230590464/274877906944
8 : 9172942848/274877906944
9 : 6879707136/274877906944
10 : 5159780352/274877906944
11 : 3869835264/274877906944
12 : 2902376448/274877906944
13 : 2176782336/274877906944
14 : 1632586752/274877906944
15 : 1224440064/274877906944
16 : 918330048/274877906944
17 : 688747536/274877906944
18 : 516560652/274877906944
19 : 387420489/274877906944
(20+): 1162261467/274877906944
The last entry stands for the interval from 20 to +infinity, together with the total probability of that interval. The following expressions would then return exact results:
>>> g.p(19)
387420489/274877906944
>>> (g >= 19).p(True)
387420489/68719476736
>>> (g >= 20).p(True)
1162261467/274877906944
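The figures above can be checked with exact rational arithmetic. The following is a sketch in plain Python using the standard `fractions` module, not Lea code; the helper name `geometric_with_tail` and its `cutoff` parameter are hypothetical.

```python
from fractions import Fraction

def geometric_with_tail(p, cutoff):
    """Exact pmf for the values 1..cutoff-1, plus the mass of [cutoff, +inf)."""
    q = 1 - p
    pmf = {k: q ** (k - 1) * p for k in range(1, cutoff)}
    tail = q ** (cutoff - 1)  # closed form of P(X >= cutoff)
    return pmf, tail

pmf, tail = geometric_with_tail(Fraction(1, 4), 20)

p_19 = pmf[19]               # corresponds to g.p(19)
p_ge_19 = pmf[19] + tail     # corresponds to (g >= 19).p(True)
p_ge_20 = tail               # corresponds to (g >= 20).p(True)
total = sum(pmf.values()) + tail
```

Because the tail mass has the closed form (3/4)^19, the nineteen scalar entries and the single interval entry sum to exactly 1 without any rescaling.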
The difficulty is to cope with values lying inside the interval. The easy way would be to raise an exception ("unknown probability"). A more sensible approach would be to raise an exception whose message gives the range within which the probability lies:
>>> g.p(20)
ERROR: unknown probability in the range [0, 1162261467/274877906944]
>>> g.p(21)
ERROR: unknown probability in the range [0, 1162261467/274877906944]
>>> (g > 20).p(True)
ERROR: unknown probability in the range [0, 1162261467/274877906944]
>>> (g <= 20).p(True)
ERROR: unknown probability in the range [273715645477/274877906944, 1]
However, this is expected to be difficult to implement.
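For simple comparisons against a constant, the ranges shown above can be derived as in the sketch below. It is plain Python, not Lea code, and it covers only three kinds of query; all names (`bounds`, `p_eq`, `p_ge`, `p_le`) are hypothetical. As a simplification, it returns a (low, high) pair instead of raising an exception; an implementation following the proposal could raise whenever the two ends differ.

```python
from fractions import Fraction

CUTOFF = 20
Q = Fraction(3, 4)
PMF = {k: Q ** (k - 1) * (1 - Q) for k in range(1, CUTOFF)}
TAIL = Q ** (CUTOFF - 1)  # mass of the interval [20, +inf)

def bounds(known, tail_all_true, tail_all_false):
    """(low, high) of an event, given how it relates to the tail interval."""
    if tail_all_true:
        return known + TAIL, known + TAIL
    if tail_all_false:
        return known, known
    return known, known + TAIL  # the interval is only partly covered

def p_eq(v):
    if v < CUTOFF:
        return bounds(PMF.get(v, Fraction(0)), False, True)
    return bounds(Fraction(0), False, False)

def p_ge(v):
    known = sum((p for k, p in PMF.items() if k >= v), Fraction(0))
    return bounds(known, v <= CUTOFF, False)

def p_le(v):
    known = sum((p for k, p in PMF.items() if k <= v), Fraction(0))
    return bounds(known, False, v < CUTOFF)
```

Here `p_eq(20)` and `p_ge(21)` both give the range from 0 to 1162261467/274877906944, and `p_le(20)` gives the range from 273715645477/274877906944 to 1. The hard part, which this sketch avoids, is deciding for an arbitrary expression whether it holds for all, none, or only some of the values in the interval.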
Comments (5)
-
reporter -
reporter - changed status to open
-
reporter First implementation of Interval - to be continued (refs #17)
→ <<cset 91561e298ee7>>
-
reporter -
reporter First implementation of Interval - to be continued (refs #17)
→ <<cset 44fe19548edb>>