Lea Tutorial [1/3]
Table of Contents
- Introduction
- Defining discrete probability distributions
- Probability distribution indicators (mean, var, std, mode, entropy methods)
- Random sampling (random, random_draw methods)
- Combining probability distributions
- Referential consistency
- Conditional probabilities (given method)
- Getting raw distribution data (support, ps, pmf_tuple, pmf_dict, cdf_tuple, cdf_dict methods)
- What's next?
Introduction
The present tutorial explains how to use the Lea package, through a sequence of simple examples executed in the Python console.
Note: You may also practice the present tutorial online in the Jupyter Notebook Tutorial #1 (no installation required)!
What is Lea?
Lea is a Python library for dealing with discrete probability distributions in a simple way.
Lea allows you to model a broad range of random phenomena: gambling, weather, finance, etc. It lets you define random variables with given probability distributions on any Python objects; these random variables can then be combined together using usual arithmetic, comparison and logical operators, as well as joints, conditional probabilities and even user-defined functions. Lea can then calculate the probability distributions of these derived random variables; it also provides the usual probability indicators (e.g. the mean and the variance) and measures from information theory (e.g. the entropy). Generation of random samples can be performed on any probability distribution, even non-uniform ones. Advanced functions let you build up probabilistic models using joint probability distributions (JPD), conditional probability tables (CPT), Bayes networks and Markov chains. Lea is definitely a toolkit for probabilistic programming.
Lea is designed to be easy to use, putting the focus on the expressiveness of the models and hiding as much as possible the underlying mathematics. It can be used as a toolkit for teaching probability theory and probabilistic programming.
Lea offers different ways to represent probabilities, including floating-point numbers, fractions and even symbolic variables (thanks to the SymPy library). You can easily switch from one representation to another. Whatever the probability representation, the same calculation algorithm is used, producing results with the probability type you've chosen.
The probability calculation in Lea is performed through an original algorithm, the Statues algorithm, which is based on generators and variable binding. This algorithm performs exact calculation, and its implementation has been tuned to optimize CPU usage. However, should the combinatorics become prohibitively large, an approximate algorithm based on Monte-Carlo sampling is also available.
Lea is a pure Python library. Most of the functions do not require installation of other packages. For advanced functions however, you may take advantage of the following optional libraries (these are automatically detected and integrated when you use Lea): matplotlib for plotting histograms, SymPy for symbolic calculations and pandas for building joint probability distributions. Note that these three libraries are part of the Anaconda distribution; so, if you install Lea within your Anaconda environment, then you have access to all Lea features without worrying about installation.
Lea 4 runs on Python 3.8+. For earlier Python versions, you may use Lea 3 (Python 2.6+).
Lea is available under the LGPL license.
Prerequisites
In order to execute the statements given in the present tutorial, you need
- Python 3.8+
- Lea 4 module installed - see Lea installation page. Lea 4 is the baseline for the present tutorial. Most examples are valid in Lea 3.
Lea is a pure Python module. For most of the functions, there is no other module to install. As already explained above, you may optionally install matplotlib, SymPy and pandas to get access to a couple of advanced features. These libraries can be installed at any time, independently of each other, without reinstalling Lea.
How to use this tutorial?
Note: you can skip this section if you are already experienced in Python!
REPL mode
The best way to learn Lea is to use it interactively, through the REPL mode ("Read-Eval-Print Loop"). You start a Python console and type the following statement:
>>> import lea
The >>> is just the Python prompt. From here, all the examples provided in the present tutorial can be executed. What follows the >>> prompt are the Python statements or expressions that you type; then, on the line(s) below, the results of the execution are displayed (if any).
See this as your new probability pocket calculator!
Program mode
When you are confident with Lea and you are willing to do more serious things, all you have to do is import the lea module in your program. In that case, when you need to display results, just pass the Lea instances and expressions to the standard print(…) function (using print is not necessary in the REPL mode, hence it is omitted in the following).
Hints and tips
When you are used to Lea functions, you could also use the alternative form:
>>> from lea import *
This relieves you from typing the lea. prefix in all the examples! This is actually a handy feature of Python. The first variant has been preferred however in the present tutorial, so you can clearly distinguish Lea functions from Python builtin functions.
Should you want to follow the present tutorial with probabilities displayed as fractions instead of floating-point numbers, then simply execute the following statement, once and for all:
>>> lea.set_prob_type('r')
Probabilities will then be displayed as fractions (e.g. '1/4'). More details on this topic are given in the section "Changing probability representation".
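For instance, once this statement has been executed, the unfair coin defined later in this tutorial would display as follows (a sketch, assuming Lea's fraction rendering; all the examples below stick to the default floating-point display):
>>> lea.vals('Head', 'Tail', 'Tail')
Head : 1/3
Tail : 2/3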
Defining discrete probability distributions
The first thing to do when working with Lea is to define a probability distribution that models a given phenomenon. In Lea, probability distributions are discrete and finite; internally, any probability distribution X is defined by a collection of associations v : P(X=v), which defines, for any possible value v of X, the probability that X=v. Such a collection is named a probability mass function or, for short, a pmf. So, in Lea, defining a probability distribution means defining a pmf.
There are essentially three ways to define such pmf:
- providing a collection of values assumed to be equiprobable,
- providing an explicit pmf,
- using predefined standard functions (binomial, poisson, etc.).
Whatever the chosen method, the result is a new probability distribution, defined by a probability mass function associating each possible value to the probability that it occurs.
We explain hereafter how these three approaches can be followed in Lea.
Definition by values (vals, interval functions)
The simplest way to define a Lea distribution is by providing a collection of values. The vals(…)
function does that job. As an example, the following statement defines the result of coin flipping, assuming fair chances to get head or tail:
>>> flip = lea.vals('Head','Tail')
The pmf distribution can be displayed simply by entering its name. This displays the different values, in sorted order, with their respective probabilities:
>>> flip
Head : 0.5
Tail : 0.5
In a program, you would pass the distribution to the print function to get the same output:
print(flip)
If the same value occurs several times in the arguments, then the probability of this value is proportional to its number of occurrences. For instance, an unfair coin can be defined by the following statement:
>>> biased_flip = lea.vals('Head', 'Tail', 'Tail')
>>> biased_flip
Head : 0.3333333333333333
Tail : 0.6666666666666666
You can associate probabilities to Python objects of any type, provided that the type is hashable (which is the case for most Python types). Consider for example a fair die with six faces. It is natural to model this die with integers from 1 to 6:
>>> die = lea.vals(1, 2, 3, 4, 5, 6)
>>> die
1 : 0.16666666666666666
2 : 0.16666666666666666
3 : 0.16666666666666666
4 : 0.16666666666666666
5 : 0.16666666666666666
6 : 0.16666666666666666
The same distribution can be built from any iterable, using Python's * unpacking syntax:
>>> my_tuple = (1, 2, 3, 4, 5, 6)
>>> die = lea.vals(*my_tuple)
>>> die = lea.vals(*range(1, 7))
>>> die = lea.vals(*[v+1 for v in range(6)])
>>> die = lea.vals(*(6-v for v in range(6)))
By default, the values are displayed in sorted order. If you prefer to keep them in the order in which they are given, use the ordered keyword argument:
>>> lea.vals(1, 6, 2, 5, 3, 4, ordered=True)
1 : 0.16666666666666666
6 : 0.16666666666666666
2 : 0.16666666666666666
5 : 0.16666666666666666
3 : 0.16666666666666666
4 : 0.16666666666666666
The ordered=True option just affects the display, not the mathematical meaning of the probability definition. Note also that this option requires that all given values occur only once (otherwise, an exception is raised).
For the specific case of sequences of equiprobable integers, which is typical for dice, Lea provides a convenience method, interval(a, b). It generates the integers from a to b inclusive:
>>> die = lea.interval(1, 6)
To get the probability of a given value v, use the p(v) method:
>>> biased_flip.p('Head')
0.3333333333333333
>>> die.p(5)
0.16666666666666666
Definition by probability mass function (pmf function)
Another way to define Lea distributions, which is the most general, is to use the pmf(…) function, where the argument is a probability mass function made up of pairs (v, P(v)). This approach is especially handy for non-uniform distributions. The simplest way to proceed is to pass the pmf as a dictionary. For instance, the following statement defines the very same unfair coin as above:
>>> biased_flip = lea.pmf({'Head': 1/3, 'Tail': 2/3})
>>> biased_flip
Head : 0.3333333333333333
Tail : 0.6666666666666666
The pmf function divides all the given probabilities by their sum before storing them. This process, called normalization, guarantees that the sum of all probabilities is equal to 1 (or as close as possible to 1 in case of rounding errors). Because of the automatic normalization, the same biased flip could also be defined with weights or counters, i.e. numbers that are proportional to the desired probabilities:
>>> biased_flip = lea.pmf( {'Head': 1, 'Tail': 2} )
Instead of a dictionary, you may also pass a sequence of (value, probability) pairs to pmf(…):
>>> biased_flip = lea.pmf( (('Head', 1/3), ('Tail', 2/3)) )
>>> biased_flip = lea.pmf( (('Head', 1), ('Tail', 1), ('Tail', 1)) )
>>> biased_flip = lea.pmf( (('Head', 3), ('Tail', 1), ('Tail', 5)) )
Note that the probabilities of duplicated values are added together; hence, all the biased_flip definitions above produce the very same pmf.
Note that if you have installed matplotlib, you can display a histogram of the pmf of any probability distribution by invoking the plot method:
>>> measure = lea.pmf({0: 0.05, 1: 0.15, 2: 0.25, 3: 0.4, 4: 0.15})
>>> measure.plot()
For more details, read the section on histograms.
More on normalization
We have explained above that all the given probabilities are automatically normalized, by dividing them by their sum. There is however an alternative way to ensure the normalization of the pmf: you can ask the pmf function to calculate the probability of one given value by subtraction, so that the sum of probabilities equals 1. To do this, simply put a None probability for one given value: it will be replaced automatically by 1 minus the sum of the other probabilities:
>>> lea.pmf({0: 0.05, 1: 0.15, 2: 0.25, 3: 0.4, 4: None})
0 : 0.05
1 : 0.15
2 : 0.25
3 : 0.4
4 : 0.1499999999999999
This gives the same distribution as measure defined above, if you ignore rounding errors. Note that this process does not perform any division; it keeps the given probabilities as-is, except the one set to None. We'll see that this feature is especially convenient when you deal with Lea's symbolic computation.
As a final note, if you have good reasons to keep your probabilities unaltered, it's possible to avoid any normalization: simply specify the argument normalization=False when invoking Lea constructors. Take care, however: this lets you build objects that are not genuine probability distributions. For instance, you could use this feature on the vals function to build a "counter table":
>>> lea.vals(*"ABRACADABRA", normalization=False)
A : 5
B : 2
C : 1
D : 1
R : 2
Probability of a single event (event, P functions)
Based on what has been seen above, you may ask: this probability distribution thing is nice but how can I model a simple probability for a single event? For instance, how can I model that the probability of rain tomorrow is 27 %? The answer is to consider the event "it will rain tomorrow" as a boolean random variable, i.e. with given probabilities for True and False values:
>>> rain = lea.pmf({True: 0.27, False: 0.73})
>>> rain
False : 0.73
True : 0.27
Since this construct is quite common, Lea provides a dedicated shortcut, the event function:
>>> rain = lea.event(0.27)
Conversely, the probability that the event is true can be extracted with the P function:
>>> lea.P(rain)
0.27
or, equivalently, with the P method:
>>> rain.P()
0.27
Both are equivalent to rain.p(True).
In summary, event and P are convenience functions for handling the probability of a single event, hiding the Boolean probability distribution that is stored internally by Lea. We'll see later that the P function is very useful when combining probability distributions with comparison operators, like ==, <=, etc. In the following, we shall assume that you have imported the P function:
>>> from lea import P
Note that the P function supports multiple arguments. In such a case, the arguments are combined with the logical AND operator. An example is provided in the section "Comparison and logical operations".
In addition to these generic constructors, Lea provides convenience functions for a couple of standard probability distributions, viz. Bernoulli, binomial and Poisson, as well as gambling gizmos.
In the examples above, we have seen probability distributions with Python strings, integers and Booleans as the values having some chance to occur. Actually, there is no limitation on the types of object that can be handled, provided that they have a hashing method: floats, complexes, tuples, vectors, dates/times, user-defined objects, etc. Python's tuple type is especially useful for defining joint probability distributions. Of course, depending on the type of objects you put in the distributions, some operations can become impossible to achieve; if you try to execute such an operation, then an exception shall be raised. For example, a mean value can be calculated on a die distribution; it is impossible however to get a mean on the flip distribution ('Head'/'Tail'), since there is no such concept of mean among character strings!
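For instance, here is a minimal sketch of a distribution over tuple values (the coordinates are arbitrary):
>>> point = lea.vals((0, 0), (0, 1), (1, 0), (1, 1))
>>> point.p((0, 1))
0.25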
Now, that you know how to build probability distributions, you will learn in the following what you can do with these distributions. This includes:
- calculating standard indicators,
- deriving new probability distributions,
- evaluating probabilities of conditions,
- evaluating conditional probabilities,
- generating random samples.
All the topics above are covered in the present tutorial page. Other topics including:
- joint probability tables,
- Bayesian networks,
- Monte-Carlo sampling,
- information theory,
- changing probability representation,
- symbolic probabilistic calculation
are covered in the second part of the tutorial.
Probability distribution indicators (mean, var, std, mode, entropy methods)
The following standard indicators are available on Lea probability distributions: mean, variance, standard deviation, mode and entropy. The first three are demonstrated in the following examples:
WARNING: in Lea 3, these indicators are accessed without the method call syntax ().
>>> die = lea.vals(1, 2, 3, 4, 5, 6)
>>> die.mean()
3.5
>>> die.var()
2.9166666666666665
>>> die.std()
1.707825127659933
The mode method returns a tuple giving the value(s) having the highest probability:
>>> die.mode()
(1, 2, 3, 4, 5, 6)
>>> heights = lea.pmf(((0.5, 1), (1.0, 2), (1.5, 4), (2.0, 5), (2.5, 5), (3.0, 2), (3.5, 1)))
>>> heights.mode()
(2.0, 2.5)
The entropy provides a measure of the degree of randomness of the distribution. For a given set of values, it is maximum for a uniform probability distribution (i.e. all values have equal probabilities); in that case, it is equal to the logarithm in base 2 of the number of values, the result being expressed in bits. The entropy method is illustrated in the following statements:
>>> lea.event(0.5).entropy()
1.0
>>> die = lea.vals(*(1, 2, 3, 4, 5, 6))
>>> die.entropy()
2.584962500721156
>>> biased_die = lea.vals(*(1, 2, 3, 4, 5, 6, 6))
>>> biased_die.entropy()
2.521640636343318
You can check that the entropy of the loaded die (biased_die) is lower than the fair die's (die): the chances that the loaded die gives a 6 are twice the chances of the other numbers, so it is "less random" than the fair die.
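As a quick sanity check of the log2 rule stated above, the fair die's entropy indeed equals the base-2 logarithm of its number of values:
>>> import math
>>> math.log2(6)
2.584962500721156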
Note that the entropy is null for distributions having a certain, unique, value (i.e. with probability = 1).
>>> lea.event(1).entropy()   # certain True
0.0
>>> lea.event(0).entropy()   # certain False
0.0
>>> lea.vals('truth').entropy()   # certain 'truth'
0.0
If you are interested in this topic, Lea provides other useful measures related to information theory, some of which involve conditional probabilities. These are presented in the section dedicated to information theory in Lea advanced tutorial.
Random sampling (random, random_draw methods)
Since Lea copes with randomness, you can legitimately expect means to generate random samples! The random method serves this purpose.
Here is how to draw one random value from the fair die defined above:
>>> die.random()
3
You may see it this way: "append .random() after a probability distribution and you get the associated random variable!"
Note: since we cope with randomness, the actual values you get here (as well as in the other examples of the present section) may be different! To have repeatable random samples, you may call Python's random.seed function (see example a bit later...).
To generate a sample with a given number of random values, you provide the required sample's size:
>>> die.random(20)
(5, 6, 3, 3, 3, 4, 4, 5, 1, 2, 1, 1, 3, 4, 4, 2, 2, 2, 1, 5)
>>> biased_die.random(20)
(6, 1, 6, 4, 5, 1, 2, 4, 6, 6, 3, 6, 5, 4, 6, 1, 6, 6, 4, 3)
Interestingly, the vals function can be used as a frequency counter. To illustrate this, let us generate a sample of 42,000 random values with the fair die and count the frequency of each value:
>>> sample1 = die.random(42000)
>>> lea.vals(*sample1)
1 : 0.1622
2 : 0.1762
3 : 0.1614
4 : 0.163
5 : 0.1671
6 : 0.1701
Note the * used to expand the sample tuple into individual arguments.
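If you prefer raw counts over relative frequencies, the standard library's collections.Counter does the job as well (the counts below are illustrative, since the sample is random):
>>> from collections import Counter
>>> Counter(die.random(42000))
Counter({2: 7400, 6: 7144, 5: 7018, 4: 6846, 1: 6812, 3: 6780})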
Now, let us proceed similarly with the unfair die.
>>> sample2 = biased_die.random(42000)
>>> lea.vals(*sample2)
1 : 0.1429
2 : 0.1402
3 : 0.1452
4 : 0.1392
5 : 0.146
6 : 0.2865
The random sampling works with any distribution, whatever the type of values. Here is an example with tossing of an unfair coin.
>>> biased_flip = lea.pmf({'Head': 1/3, 'Tail': 2/3})
>>> biased_flip.random(10)
('Tail', 'Tail', 'Tail', 'Tail', 'Tail', 'Head', 'Tail', 'Head', 'Tail', 'Tail')
>>> lea.vals(*(biased_flip.random(100000)))
Head : 0.333200
Tail : 0.666800
Some processes require that a given value is not drawn twice in a random sample. These are named "draws without replacement", which include many lottery games. The random_draw
method is dedicated to such processes. As an example, imagine a fair lottery game with 45 balls, from which 6 balls shall be drawn.
>>> lottery = lea.interval(1, 45)
>>> lottery.random_draw(6)
(13, 24, 25, 10, 1, 26)
Drawing all 45 balls, or calling random_draw() without argument, returns a random permutation of all the values:
>>> lottery.random_draw(45)
(44, 42, 36, 4, 10, 2, 14, 19, 22, 39, 12, 41, 18, 7, 31, 5, 38, 45, 21, 8, 13, 1, 3, 23, 32, 35, 20, 40, 11, 29, 27, 15, 30, 37, 16, 28, 43, 17, 6, 34, 25, 24, 9, 33, 26)
>>> lottery.random_draw()
(40, 41, 36, 27, 15, 26, 31, 24, 9, 7, 21, 16, 13, 2, 10, 34, 35, 30, 29, 33, 3, 23, 42, 32, 37, 45, 22, 20, 5, 25, 17, 43, 28, 4, 18, 39, 14, 1, 11, 8, 19, 12, 6, 38, 44)
Note that Lea's random and random_draw methods internally use the standard Python random module. This may be useful to know: should you wish to produce the very same pseudo-random sequence for a demo, a course, a test suite, etc., the Python random.seed(…) function can be helpful:
>>> import random
>>> random.seed(0)
>>> die.random(10)
(4, 4, 1, 3, 5, 4, 4, 3, 4, 3)
>>> random.seed(1)
>>> die.random(10)
(2, 5, 1, 3, 1, 4, 4, 4, 6, 4)
>>> random.seed(0)
>>> die.random(10)
(4, 4, 1, 3, 5, 4, 4, 3, 4, 3)
>>> random.seed(1)
>>> die.random(10)
(2, 5, 1, 3, 1, 4, 4, 4, 6, 4)
Combining probability distributions
The examples seen so far work with explicitly defined probability distributions. In many situations however, an uncertain phenomenon may result from the combination of other uncertain phenomena, which are easier to model and which have known probability distributions. Examples are:
- adding the results of N thrown dice,
- counting the number of successes of N trials,
- evaluating the truth of a given condition made on uncertain events,
- evaluating the joint occurrences of a given set of uncertain events,
- calculating a given function with random variables as arguments.
Without dedicated tools, these problems require careful calculations, made manually or by custom programs; this is usually tedious and error-prone. Fortunately, Lea can make these calculations for you, taking care of possible dependencies.
The following sections shall first present the most basic techniques, which involve independent events (this applies in particular for gambling and stateless systems). Then, following sections will involve interdependent events by introducing variable binding and conditional probabilities. Note that more complex situations requiring joint probability tables, conditional probability tables, Bayesian networks and Markov chains are covered in the advanced tutorial of Lea.
Arithmetic operations
One of the most common ways to combine probability distributions together is to use the usual arithmetic operations. In Lea, all arithmetic operators are overloaded: when you add, subtract, etc. probability distributions, you get a new probability distribution.
The simplest cases consist of operations with constant operands, as in the following examples:
- Subtracting 3 from the result of a die:
>>> die1 = lea.vals(1, 2, 3, 4, 5, 6)
>>> die1 - 3
-2 : 0.16666666666666669
-1 : 0.16666666666666669
0 : 0.16666666666666669
1 : 0.16666666666666669
2 : 0.16666666666666669
3 : 0.16666666666666669
- Dividing the result of a die by 3, truncated to integers:
>>> die1 // 3
0 : 0.33333333333333337
1 : 0.5000000000000001
2 : 0.16666666666666669
- Taking the residue of a die result modulo 2 (i.e. the parity bit):
>>> die1 % 2
0 : 0.5
1 : 0.5
These last two examples show that the number of values of the resulting distribution can be lower than the number of values of the initial distribution. This is without surprise, since several values can be mapped to the same resulting value.
Note that Lea just propagates the given operator on the values, without putting any control on it (this actually follows the "duck typing" philosophy of Python). This means, for example, that you could play with the + and * operators on strings:
>>> flip = lea.vals('Head','Tail')
>>> 'This is ' + flip
This is Head : 0.5
This is Tail : 0.5
>>> die1 * 'X'
X : 0.16666666666666669
XX : 0.16666666666666669
XXX : 0.16666666666666669
XXXX : 0.16666666666666669
XXXXX : 0.16666666666666669
XXXXXX : 0.16666666666666669
Of course, the operation must make sense for every possible value; otherwise, an exception is raised during the evaluation:
>>> 1 / (die1-1)   # -> ZeroDivisionError: division by zero
So far, we have combined probability distributions with constants. More interesting things happen when you combine probability distributions together. To get the distribution resulting from the sum of two dice, you may be tempted to do this:
>>> die1 + die1
2 : 0.16666666666666669
4 : 0.16666666666666669
6 : 0.16666666666666669
8 : 0.16666666666666669
10 : 0.16666666666666669
12 : 0.16666666666666669
This result is actually the same as 2*die1, which is obviously not the distribution you expect. The reason shall be explained later, in the section about referential consistency. To get the right result, you have to create a new die instance, either by issuing a new vals(…) expression or, simpler, by "cloning" the first die using the new method:
>>> die2 = die1.new()
Note that a die2 = die1 assignment would not be adequate, because die2 would then refer to the same object as die1, hence the same event (remember that, in Python, an assignment is just a binding between a name and an object). After using .new(), die2 represents a second die having the same probability distribution as die1 but modeling an event independent of the throw of the first die. Then, the following statement produces the distribution of the sum of the two dice:
>>> die1 + die2
2 : 0.02777777777777777
3 : 0.05555555555555554
4 : 0.08333333333333331
5 : 0.11111111111111108
6 : 0.13888888888888887
7 : 0.16666666666666666
8 : 0.13888888888888887
9 : 0.11111111111111108
10 : 0.08333333333333331
11 : 0.05555555555555554
12 : 0.02777777777777777
You can then easily do the same to subtract, multiply, … dice together. You could also add probability distributions of strings, in order to see the possible combinations:
>>> flip2 = flip.new()
>>> flip + '-' + flip2
Head-Head : 0.25
Head-Tail : 0.25
Tail-Head : 0.25
Tail-Tail : 0.25
Numeric and string distributions can even be mixed in a single expression:
>>> (1+die1//3) * flip
Head : 0.16666666666666666
HeadHead : 0.25
HeadHeadHead : 0.08333333333333333
Tail : 0.16666666666666666
TailTail : 0.25
TailTailTail : 0.08333333333333333
If you want to create several independent random variables, each having the same probability distribution, then you can specify the required number of instances to the new method: this will return a tuple with the requested objects:
>>> (die2, die3, die4) = die1.new(3)
Important note: all the examples seen in the present page produce exact probability results. However, several problems are known to be intractable with such an exact approach. For those problems, you may have to consider the alternative approaches presented in Approximate inference by Monte-Carlo approaches. Also, calling the random(...) method on a Lea instance makes an implicit exact calculation of the probability distribution before the actual random sampling is done; if this is intractable, the random_mc(...) method provides a workable alternative.
Repeating operations (times method)
Since repeating the same random process multiple times is a common need, Lea provides a convenience method to do that job easily.
Consider the following problem: what is the probability of getting a total of 16 when throwing 5 dice?
From the techniques seen so far, you could write:
>>> dice_5 = die1 + die1.new() + die1.new() + die1.new() + die1.new()
>>> dice_5.p(16)
0.09452160493827125
A shorter way to define dice_5 is to use the Python sum function, passing a 5-tuple of new dice created in one shot:
>>> dice_5 = sum(die1.new(5))
The most efficient way, however, is to use the times method:
>>> dice_5 = die1.times(5)
The times method uses a dynamic programming technique, a dichotomic algorithm requiring far fewer calculations than the former naive expressions. To illustrate the point, note that Lea has no particular problem calculating the probability distribution of 100 thrown dice:
>>> die1.times(100)
100 : 1.5306467074865188e-78
101 : 1.530646707486519e-76
102 : 7.72976587280692e-75
…
598 : 7.72976587280692e-75
599 : 1.530646707486519e-76
600 : 1.5306467074865188e-78
In contrast, the naive sum seen above is already prohibitively long for adding 10 dice.
By default, the times(n) method performs additions. Any other binary operator or two-argument function can be specified as a second argument. For instance, here is how you can get the probability that the product of 4 thrown dice is 5:
>>> from operator import mul
>>> die1.times(4, mul).p(5)
0.0030864197530864187
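Since any associative two-argument function fits, operations other than additions can be repeated as well; as a sketch, here is the probability that the highest of 4 thrown dice is 6 (the exact value is 1 - (5/6)**4 = 671/1296; round is used here to hide float noise in the last digits):
>>> round(die1.times(4, max).p(6), 6)
0.517747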
WARNING: The times method ensures that the repeated events are independent. You must keep that in mind to avoid wrong interpretations, in particular when calculating conditional probabilities—see the warning note at the end of the section below about the given method.
Comparison and logical operations
Usual comparison operators may be used on any Lea instances, resulting in Boolean distributions giving the probabilities of each truth value.
>>> die1 == 4
False : 0.8333333333333334
True : 0.16666666666666669
>>> die1 <= 4
False : 0.3333333333333333
True : 0.6666666666666666
To extract the probability of a condition being true, use the P function seen earlier:
>>> P(die1 <= 4)
0.6666666666666666
Comparisons may also involve two probability distributions:
>>> P(die1 < die2)
0.4166666666666667
>>> P(die1 <= die2)
0.5833333333333334
>>> P(die1 == die2)
0.16666666666666666
>>> P(die1 != die2)
0.8333333333333334
The first result (P(die1 < die2)) shows that, after throwing two dice, there is a probability of 0.416666… that one given die has a value strictly less than the other's. This assumes that you are able to distinguish between the two dice, for example by their colors (e.g. green die + blue die). If you make no difference between the dice, then the probability that one of them is strictly less than the other is 0.833333…, as shown in the last example (P(die1 != die2)).
Logical operators (NOT, AND, OR, XOR) can be used to combine Boolean probability distributions together. These are detailed in the following table, where a and b stand for any expressions returning Lea Boolean distributions:

Lea expression | meaning |
---|---|
~a | not a |
a & b | a and b |
a ⎮ b | a or b |
a ^ b | a xor b |
Examples:
>>> P(~(die1 == 3))
0.8333333333333334
>>> P((die1 == 4) & (die2 == 2))
0.02777777777777777
>>> P((die1 <= 3) & (die2 > 3))
0.25
WARNINGS:
- Due to Python precedence rules, it is required to embed expressions connected by logical operators in parentheses. If parentheses are missing, an exception shall be raised. For the AND operator, this can be mitigated by calling P with multiple arguments: for instance, the previous expression is equivalent to:
>>> P(die1 <= 3, die2 > 3)
- Python's chained comparisons between Lea instances (e.g. a < b < c) DO NOT WORK! Actually, for several technical reasons related to Python design, such expressions do not (and cannot) return any sensible result. They can be replaced by a logical AND (e.g. (a < b) & (b < c)). You can find an example of this at the end of the section on referential consistency.
For testing a distribution against a set of given values, the is_any_of(…) method should be helpful as an alternative to ORing equalities. As an example, here is how to calculate the probability of a "craps" (i.e. a sum of 2, 3 or 12 with two dice):
>>> P((die1+die2).is_any_of(2, 3, 12))
0.11111111111111108
Conversely, is_none_of(…) calculates the inverse distribution of is_any_of(…):
>>> P((die1+die2).is_none_of(2, 3, 12))
0.888888888888889
Applying functions (map method, func_wrapper, min_of, max_of functions)
If you need special processing beyond arithmetic and logical operators, you can apply a given function to the values of a probability distribution by using the map method; this builds a new probability distribution on the function results. As a dummy example, this is how to transform the integer values of a die into floating-point numbers by invoking the standard Python float function:
>>> die1.map(float)
1.0 : 0.16666666666666669
2.0 : 0.16666666666666669
3.0 : 0.16666666666666669
4.0 : 0.16666666666666669
5.0 : 0.16666666666666669
6.0 : 0.16666666666666669
User-defined functions work the same way:
>>> def parity(x):
...     return "odd" if x%2 == 1 else "even"
>>> die1.map(parity)
even : 0.5
odd : 0.5
map calls can be chained; for instance, applying len on the previous result:
>>> die1.map(parity).map(len)
3 : 0.5
4 : 0.5
If you need to call functions with multiple arguments, then you can simply add the arguments, starting from the second one, as map's arguments. For example, suppose you want to determine the probability distribution resulting from keeping the die with the smallest value among two thrown dice. You could use the standard min function of Python:
>>> die1.map(min, die2)
1 : 0.3055555555555556
2 : 0.25
3 : 0.19444444444444442
4 : 0.13888888888888887
5 : 0.08333333333333331
6 : 0.02777777777777777
Note that the naive expression min(die1, die2) does NOT work as expected: Python's min function tries to compare die1 with die2, as if these were numbers; such a comparison does not return a Boolean but a Lea instance with probabilities of True and False, so the result is meaningless. Opportunely, Lea provides a remedy to this shortcoming: the func_wrapper function allows transforming the min function into a Lea-aware function. Here is how it works:
>>> min_lea = lea.func_wrapper(min)
Now, you can call min_lea with Lea instances as arguments, as if it were the min function itself:
>>> min_lea(die1, die2)
1 : 0.3055555555555556
2 : 0.25
3 : 0.19444444444444442
4 : 0.13888888888888887
5 : 0.08333333333333331
6 : 0.02777777777777777
This does the same job as the map method above, but with a more readable notation. As with the min function, you can pass any number of arguments; you can also freely mix Lea instances with constant values:
>>> min_lea(die1, 4, 2, 3, die2)
1 : 0.3055555555555556
2 : 0.6944444444444445
Like the min function, min_lea works on types other than numbers:
>>> min_lea(flip, flip2)
Head : 0.75
Tail : 0.25
More generally, min_lea can be used on any probability distributions having values that can be compared together. In particular, you could use it with probability distributions on datetime objects, which is useful for planning models containing uncertainties (see example).
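Here is a minimal sketch with date values (the dates are made up for illustration; the exact display of date objects may vary):
>>> from datetime import date
>>> start1 = lea.vals(date(2024, 1, 10), date(2024, 1, 15))
>>> start2 = lea.vals(date(2024, 1, 12), date(2024, 1, 14))
>>> min_lea(start1, start2)
2024-01-10 : 0.5
2024-01-12 : 0.25
2024-01-14 : 0.25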
Because the need for min and max functions is common, Lea provides the prebuilt functions lea.min_of and lea.max_of. Note that these functions scale poorly as the number of arguments grows; however, they provide an optional argument that can dramatically speed up the calculation if you accept some constraints.
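For instance, lea.min_of applied on our two dice should display the same distribution as die1.map(min, die2) seen above (a sketch with default settings):
>>> lea.min_of(die1, die2)
1 : 0.3055555555555556
2 : 0.25
3 : 0.19444444444444442
4 : 0.13888888888888887
5 : 0.08333333333333331
6 : 0.02777777777777777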
Referential consistency
Before introducing conditional probabilities, it's now required to speak about the key concept of referential consistency in Lea. It's important to understand this concept before going further in the tutorials. Despite the fact that it could be a bit surprising at first sight, referential consistency is actually a very powerful idea.
Let us re-consider a simple fair die:
>>> die1 = lea.interval(1, 6)
We have seen that die1 + die1 does not produce the probability distribution of the sum of two dice; what it returns is actually the same as 2 * die1. You may observe also the following results:
>>> die1 - die1
0 : 1.0
>>> 2*die1 - die1 - die1
0 : 1.0
>>> 3*die1 - die1 == 2*die1
True : 1.0
The rule is: as soon as a given variable (e.g. die1) is bound to a value (e.g. 1), this binding is the same for any occurrence of this variable. If you are knowledgeable in Probabilistic Programming research, you may remark that referential consistency is somehow related to stochastic memoization, as found for example in the Church language.
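A direct corollary of this binding rule is that a variable always equals itself, with certainty:
>>> P(die1 == die1)
1.0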
Without giving too many details, let us say that Lea performs "lazy evaluation": when it evaluates a given expression (like the ones given above), it first builds a data structure that models this expression, without making any calculation nor combinatorial expansion. This structure is actually a directed acyclic graph (DAG), where each node is either an elementary probability distribution or an operation to be performed on other nodes (addition, multiplication, etc.). If the given expression is used in other expressions, new graphs (also DAGs) are built in turn, referring to the already built graphs. Then, when you ask to display results, the real calculation of the pmf takes place, using the built graph to browse and process the values with their probabilities. Referential consistency is enforced by taking care that each variable occurrence is represented by only one node in the graph, with possibly multiple nodes referring to it.
These explanations should clarify the examples given above, even if the two stages (graph building and actual pmf calculation) are implicitly chained together, hence hidden. To better exhibit the mechanism, let us define three new variables for a model involving two dice:
>>> die2 = die1.new()
>>> dice = die1 + die2
Remember that the only ways to define new elementary probability distributions are to invoke a constructor function (pmf, vals, interval, …) or to copy an existing one using the new method. Now, let us define a checking expression combining the variables above:
>>> checkup = dice**2 - (die1**2 + 2*die1*die2 + die2**2)
This statement just builds a new graph, linking the die1, die2, dice and checkup nodes together. Then, asking to display checkup performs the actual pmf calculation:
>>> checkup
0 : 1.0
This result is correct: since dice is defined as die1 + die2, the expression is certainly null, whatever the values of die1 and die2. Thanks to referential consistency, you may build many other examples that are consistent with algebra rules, logical rules, theorems, etc. In essence, each variable "remembers" how it has been built, keeping references to the variables it depends on. Only elementary variables, such as the ones built with the lea.pmf function, are independent.
Note that the object returned by x.new() is NOT a clone of x. If x depends on other variables a, b, c, …, then x.new() first evaluates the pmf of x; it then stores this pmf as if it were an event independent of x, a, b, c, …. You can experience this if you invoke, for instance, dice.new(). For cloning Lea instances while sharing their dependencies, you may use the clone method.
Of course, all the contrived examples seen above are of no real use: they simply show that Lea is consistent with known mathematical rules. Referential consistency brings added value for specific problems, like evaluating the probability that a given random variable is within a given range: we have warned before that a chained comparison like P(4 <= dice < 10) does not work and shall be replaced by a conjunction of two inequalities:
>>> P((4 <= dice) & (dice < 10))
0.75
or, using the multiple-arguments form of P:
>>> P(4 <= dice, dice < 10)
0.75
Referential consistency guarantees that the two dice occurrences refer to the same random event, i.e. that only two dice are thrown, not four. Note that the same probability could be calculated in many other ways:
>>> P(4 <= die1+die2, die1+die2 < 10)
0.75
>>> P(4 <= dice, die1 < 10-die2)
0.75
Conditional probabilities (given method)
Lea provides several mechanisms to cope with conditional probabilities. We shall cover here the simplest one: calculating conditional probabilities involving dependent random events. The problem of setting explicit conditional probabilities and calculating new probabilities thereof is more involved; it is covered in the "Bayesian reasoning" section of the advanced tutorial.
So, we study here the calculation of conditional probabilities, that is, revising a prior probability distribution x based on a given evidence e. The syntax to calculate the revised probability distribution is:
x.given(e)
where the evidence e
- shall be a condition expressed as a Boolean probability distribution,
- shall have a non-null probability to be true (i.e. e expresses something feasible).
Should these requirements not be fulfilled, an exception will be raised.
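For instance, an evidence that can never be true violates the second requirement (a sketch; the exact exception type and message depend on the Lea version):
>>> die1.given(die1 >= 7)   # raises an exception: the evidence is infeasible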
As a first example, imagine that your friend Nancy calls you by phone; she tells you
"Hi, you! I just threw two dice. Can you guess the result?" She tells you also "Here is a tip : the result is not higher than 6!".
OK, calm down… Let us first calculate the prior probabilities for the two dice. Nothing new here:
>>> die1 = lea.interval(1, 6)
>>> die2 = die1.new()
>>> dice = die1 + die2
Now, let us revise the distribution of dice to include the fact given by Nancy that the result is not higher than 6:
>>> revised_dice = dice.given(dice <= 6)
>>> revised_dice
2 : 0.06666666666666667
3 : 0.13333333333333333
4 : 0.19999999999999998
5 : 0.26666666666666666
6 : 0.3333333333333333
It is important to note that there is no limitation on the prior distribution; in particular, it could itself be a condition. For example, here is how to calculate the probability that the dice result is at least 4, provided that we know that it is less than or equal to 6:
>>> (dice >= 4).given(dice <= 6)
False : 0.19999999999999996
True : 0.8
The same result can be obtained from the revised distribution defined above:
>>> revised_dice >= 4
False : 0.19999999999999996
True : 0.8
This conditional probability shall not be confused with the probability that the dice result lies between 4 and 6, where nothing is known beforehand:
>>> (4 <= dice) & (dice <= 6)
False : 0.6666666666666667
True : 0.33333333333333337
The given construct is closely related to the definition of conditional probability: in particular, if x is a Boolean probability distribution, then the following equality should stand:
P(x.given(e)) == P(x & e) / P(e)
You can verify it on the current example:
>>> P((dice >= 4).given(dice <= 6))
0.8
>>> P((dice >= 4) & (dice <= 6)) / P(dice <= 6)
0.8
Note however that this formula is NOT how given is implemented: given is not a simple convenience method; it uses a more efficient algorithm that eliminates the evaluation of x for all cases where e is not satisfied. To convince yourself of this statement, you could try this:
>>> P((die1 / (die2-1) == 2).given(die2 != 1))
1/10
whereas the formula above fails:
>>> P(((die1 / (die2-1) == 2) & (die2 != 1))) / P(die2 != 1)
ZeroDivisionError: division by zero
The latter statement raises a ZeroDivisionError because the expression die1 / (die2-1) is evaluated even for the cases where die2 == 1.
The technique seen above can be extended to cope with dependent random variables. We can evaluate the probability of the dice sum, assuming we got, by any means, some certain information on each individual die.
Examples:
- assuming we know that the first die is 3:
>>> dice.given(die1 == 3)
4 : 0.16666666666666663
5 : 0.16666666666666663
6 : 0.16666666666666663
7 : 0.16666666666666663
8 : 0.16666666666666663
9 : 0.16666666666666663
- assuming we know that at least one of the dice is 3:
>>> dice.given((die1 == 3) | (die2 == 3))
4 : 0.1818181818181818
5 : 0.1818181818181818
6 : 0.0909090909090909
7 : 0.1818181818181818
8 : 0.1818181818181818
9 : 0.1818181818181818
- assuming we know that the first die is strictly less than the second die:
>>> dice.given(die1 < die2)
3 : 0.06666666666666665
4 : 0.06666666666666665
5 : 0.1333333333333333
6 : 0.1333333333333333
7 : 0.19999999999999996
8 : 0.1333333333333333
9 : 0.1333333333333333
10 : 0.06666666666666665
11 : 0.06666666666666665
- assuming we know that the first die is even and the second one is odd:
>>> dice.given((die1%2 == 0) & (die2%2 == 1))
3 : 0.1111111111111111
5 : 0.2222222222222222
7 : 0.3333333333333333
9 : 0.2222222222222222
11 : 0.1111111111111111
There is a useful alternative to ANDing conditions together. Actually, the given method accepts multiple 'evidence' arguments, which are ANDed together. For instance,
>>> dice.given(die1%2 == 0, die2%2 == 1)
produces the same result as above. Using this construct presents several advantages over ANDing conditions:
- The expression is simpler since parentheses can be removed.
- A set of conditions can be passed easily (with * prefix), which expresses more naturally a set of evidences.
- Performance can be improved greatly compared to ANDing evidences, because the given method is optimized with short-circuit evaluation, i.e. the exploration can be pruned as soon as a condition evaluates to false. Incidentally, it's advisable to put the least probable conditions first.
The sole functional difference between the two approaches is that an exception raised by the first technique may not occur with the second one, since some value combinations can be dropped by the short-circuit evaluation (this effect is the same as with Python's and). This is illustrated below:
>>> die1.given((2<=die1) & (6//(die1-1)==2))
ZeroDivisionError: integer division or modulo by zero
>>> die1.given(2<=die1, 6//(die1-1)==2)
4 : 1.0
These conditional probability calculations work because the dice variable "remembers" that it was built up from die1 and die2. In the vocabulary of probability theory, we could say that the random variable modeled by dice is dependent on the random variables modeled by die1 and die2 (these last two being independent of each other). The lazy evaluation technique explained above allows taking into account the latest given information.
Now, the fact that two random variables are dependent is a symmetric relationship: the probability of one die should be revised given information on the sum of dice. Assuming that we know that this sum is 4, without further information on the values of any individual die, we get the following probability distribution on the first die:
>>> die1.given(dice == 4)
1 : 0.3333333333333333
2 : 0.3333333333333333
3 : 0.3333333333333333
The given method shall be used extensively in the "Bayesian reasoning" section of tutorial [2/3].
For factorizing out evidences, avoiding tedious repetitions of given, you may use evidence contexts, covered in tutorial [3/3].
Note also that the given method has a more general flavor, allowing you to specify conditions that are not 100% sure - see the given_prob method in tutorial [3/3].
WARNING: The times method shall be used with care when combined with given. It is tempting to write
>>> die1.times(2).given(die1<=2)
hoping to get the distribution of the sum of two dice that are each at most 2. Actually, this returns the same distribution as the unconditioned
>>> die1.times(2)
because the events repeated by times are independent and hidden from you, so these cannot be conditioned whatsoever. One way to fix things is simply swapping the order of the calls:
>>> die1.given(die1<=2).times(2)
2 : 0.25
3 : 0.5
4 : 0.25
Getting raw distribution data (support, ps, pmf_tuple, pmf_dict, cdf_tuple, cdf_dict methods)
You may need at some stage to export calculated probability data to Python standard types. You may want for example to write these data on files, to process them in your own functions or to export them to other libraries. Lea provides several methods to get the distribution values and probabilities at once.
Consider for example the following height distribution of a sample of smurfs:
>>> heights = lea.pmf(((0.5, 1), (1.0, 2), (1.5, 4), (2.0, 5), (2.5, 5), (3.0, 2), (3.5, 1)))
>>> heights
0.5 : 0.05
1.0 : 0.1
1.5 : 0.2
2.0 : 0.25
2.5 : 0.25
3.0 : 0.1
3.5 : 0.05
WARNING: in Lea 3, these methods are accessed without the method call syntax ().
- support: a tuple containing only the values of the domain
>>> heights.support()
(0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5)
- ps: a tuple containing only the probabilities
>>> heights.ps()
(0.05, 0.1, 0.2, 0.25, 0.25, 0.1, 0.05)
- pmf_tuple: the probability mass function as a tuple of pairs (v, P(x = v))
>>> heights.pmf_tuple()
((0.5, 0.05), (1.0, 0.1), (1.5, 0.2), (2.0, 0.25), (2.5, 0.25), (3.0, 0.1), (3.5, 0.05))
- pmf_dict: the probability mass function as an OrderedDict { v: P(x = v) }
>>> heights.pmf_dict()
OrderedDict([(0.5, 0.05), (1.0, 0.1), (1.5, 0.2), (2.0, 0.25), (2.5, 0.25), (3.0, 0.1), (3.5, 0.05)])
- cdf_tuple: the cumulative distribution function as a tuple of pairs (v, P(x <= v))
>>> heights.cdf_tuple()
((0.5, 0.05), (1.0, 0.15), (1.5, 0.35), (2.0, 0.60), (2.5, 0.85), (3.0, 0.95), (3.5, 1.00))
- cdf_dict: the cumulative distribution function as an OrderedDict { v: P(x <= v) }
>>> heights.cdf_dict()
OrderedDict([(0.5, 0.05), (1.0, 0.15), (1.5, 0.35), (2.0, 0.60), (2.5, 0.85), (3.0, 0.95), (3.5, 1.00)])
Note that the last probability value returned by cdf_…() is always 1.0 (or something close to 1.0, in case of rounding error!). Note also that all pmf and cdf values above can be retrieved individually by using comparison operators:
>>> P(heights == 2.0)
0.25
>>> P(heights <= 2.0)
0.6
Important: the returned data in tuples and OrderedDict are provided in an order consistent with the order of values returned by support. For instance, the first value of ps corresponds to the first value of support, etc. What is this order? If you've passed ordered=True in the constructor function, then the order used at construction is kept; otherwise, the order depends on the values:
- if the values are sortable, then the values are sorted in increasing order (e.g. integers, floats, strings, …),
- if the values are not sortable, then an arbitrary order, consistent from call to call, is used (e.g. complexes, matrices, …)
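As an example of the file-export use case mentioned at the beginning of this section, here is a minimal sketch that writes the pmf to a CSV file using the standard library (the filename is arbitrary):
>>> import csv
>>> with open('heights.csv', 'w', newline='') as f:
...     csv.writer(f).writerows(heights.pmf_tuple())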
JSON serialization
An interesting application of the pmf_tuple method seen above consists in serializing a pmf into a string using the JSON format (an idea of Gianpaolo Galli). Here is how to do that:
>>> import json
>>> heights_json = json.dumps(heights.pmf_tuple())
The heights_json variable contains a string in JSON format, which is easily readable:
>>> heights_json
'[[0.5, 0.05], [1.0, 0.1], [1.5, 0.2], [2.0, 0.25], [2.5, 0.25], [3.0, 0.1], [3.5, 0.05]]'
Conversely, such a JSON string can be parsed back and turned into a Lea instance using the lea.pmf function. This is done in a simple one-liner:
>>> heights2 = lea.pmf(json.loads(heights_json))
What's next?
Thank you for reading the present tutorial page! We hope that you found it clear and informative enough.
You should be able now to define your own probability distributions and combine them to model many phenomena with uncertainties. You can find more examples in the Examples page.
However, Lea can do much, much more than the topics explained on the present page! If you are looking for advanced topics, jump to
- tutorial [2/3] - standard distributions, changing probability representation, joint distributions, Bayesian networks, Markov chains, … (also available online in the Jupyter Notebook Tutorial #2)
- tutorial [3/3] - plotting, drawing without replacement, machine learning, information theory, MC estimation, symbolic calculation, …
For enhancements or bugs, please create issues on the Lea Bitbucket page.
Feel free also to send an e-mail to pie.denis@skynet.be (English or French spoken). Use cases of Lea in your projects/research are especially appreciated! A "references" page listing the projects using Lea could greatly improve the reputation of the library.