# Lea - Discrete probability distributions in Python

## What is Lea?

Lea is a Python library for working with discrete probability distributions in an intuitive way.

## Features (Lea 3)

- **discrete probability distributions** - support: any object!
- **random sampling**
- **probabilistic arithmetic**: arithmetic, comparison, logical operators and functions
- **probabilistic programming (PP)**, Bayesian reasoning, CPT, BN, JPD, MC sampling, Markov chains, …
- **standard indicators** + **information theory**
- **multiple probability representations**: float, decimal, fraction, …
- **symbolic computation**, using the SymPy library
- **exact probabilistic inference** based on Python generators
- **comprehensive tutorials** (Wiki)
- **Python 2.6+ / Python 3** supported
- **lightweight**, *pure* Python module
- **open-source** - LGPL license

## Some samples…

Let's start by modeling a biased coin and drawing a random sample of 10 throws:

    import lea
    flip1 = lea.pmf({ 'Head': 0.75, 'Tail': 0.25 })
    print(flip1)
    # -> Head : 0.75
    #    Tail : 0.25
    print(flip1.random(10))
    # -> ('Head', 'Tail', 'Tail', 'Head', 'Head', 'Head', 'Head', 'Head', 'Head', 'Head')

You can then throw another coin, which has the same bias, and get the probabilities of combinations:

    from lea import P   # P(...) gives the probability that a condition is true
    flip2 = flip1.new()
    flips = lea.joint(flip1,flip2)
    print(flips)
    # -> ('Head', 'Head') : 0.5625
    #    ('Head', 'Tail') : 0.1875
    #    ('Tail', 'Head') : 0.1875
    #    ('Tail', 'Tail') : 0.0625
    print(flips.count('Head'))
    # -> 0 : 0.0625
    #    1 : 0.375
    #    2 : 0.5625
    print(P(flips == ('Head', 'Tail')))
    # -> 0.1875
    print(P(flip1 == flip2))
    # -> 0.625
    print(P(flip1 != flip2))
    # -> 0.375
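To see where these numbers come from, here is a minimal sketch in plain Python that does not use Lea at all. It assumes the two throws are independent and enumerates the four outcomes by hand; the names `coin`, `joint`, `head_count` and `p_same` are only illustrative.

```python
from collections import defaultdict
from itertools import product

# Same biased coin as in the Lea example.
coin = {'Head': 0.75, 'Tail': 0.25}

# Two independent throws: the probability of a pair is the product
# of the probabilities of its two faces.
joint = {(a, b): coin[a] * coin[b] for a, b in product(coin, repeat=2)}

# Distribution of the number of heads: sum the probabilities of all
# pairs that contain the same number of 'Head'.
head_count = defaultdict(float)
for pair, prob in joint.items():
    head_count[pair.count('Head')] += prob

# Probability that both throws show the same face.
p_same = sum(prob for (a, b), prob in joint.items() if a == b)

print(joint[('Head', 'Tail')])  # 0.1875
print(head_count[1])            # 0.375
print(p_same)                   # 0.625
```

Lea builds and evaluates this kind of enumeration for you, so you only write the expressions on the random variables.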

You can also calculate conditional probabilities, based on given information or assumptions:

    print(flips.given(flip2 == 'Tail'))
    # -> ('Head', 'Tail') : 0.75
    #    ('Tail', 'Tail') : 0.25
    print(P((flips == ('Tail', 'Tail')).given(flip2 == 'Tail')))
    # -> 0.25
    print(flip1.given(flips == ('Head', 'Tail')))
    # -> Head : 1.0
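Conditioning can be pictured as "filter, then renormalize". The following sketch, again in plain Python and independent of Lea, applies that idea to the two-coin joint distribution; the helper `condition` is a hypothetical name, not part of the Lea API.

```python
from itertools import product

coin = {'Head': 0.75, 'Tail': 0.25}
joint = {(a, b): coin[a] * coin[b] for a, b in product(coin, repeat=2)}

def condition(dist, predicate):
    """Keep the outcomes satisfying predicate, then rescale to sum to 1."""
    kept = {outcome: prob for outcome, prob in dist.items() if predicate(outcome)}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("cannot condition on an event of probability zero")
    return {outcome: prob / total for outcome, prob in kept.items()}

# Condition on the second throw being 'Tail'.
given_tail = condition(joint, lambda pair: pair[1] == 'Tail')
print(given_tail)
# {('Head', 'Tail'): 0.75, ('Tail', 'Tail'): 0.25}
```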

With these examples, you can notice that Lea performs *lazy evaluation*, so that `flip1`, `flip2` and `flips` form a network of variables that "remember" their causal dependencies (this is referred to in the literature as a *probabilistic graphical model* or a *generative model*). Based on this feature, Lea can build more complex relationships between random variables and perform advanced inference such as Bayesian reasoning. For instance, the classical "Rain-Sprinkler-Grass" Bayesian network (Wikipedia) is modeled in a few lines:

    rain = lea.event(0.20)
    sprinkler = lea.if_(rain, lea.event(0.01),
                              lea.event(0.40))
    grass_wet = lea.joint(sprinkler,rain).switch({ (False,False): False,
                                                   (False,True ): lea.event(0.80),
                                                   (True ,False): lea.event(0.90),
                                                   (True ,True ): lea.event(0.99)})

Then, this Bayesian network can be queried in different ways, including forward or backward reasoning, based on given observations or logical combinations of observations:

    print(P(rain.given(grass_wet)))
    # -> 0.35768767563227616
    print(P(grass_wet.given(rain)))
    # -> 0.8019000000000001
    print(P(grass_wet.given(sprinkler & ~rain)))
    # -> 0.9000000000000001
    print(P(grass_wet.given(~sprinkler & ~rain)))
    # -> 0.0
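As a cross-check, the same network can be evaluated by brute-force enumeration of its eight possible worlds. This is only a sketch in plain Python, using the probabilities given in the model; `bernoulli`, `joint` and `prob` are illustrative helpers, not Lea functions, and Lea itself does not necessarily proceed this way.

```python
from itertools import product

P_RAIN = 0.20
# P(sprinkler is on | rain)
P_SPRINKLER = {True: 0.01, False: 0.40}
# P(grass is wet | sprinkler, rain)
P_WET = {(False, False): 0.0, (False, True): 0.80,
         (True, False): 0.90, (True, True): 0.99}

def bernoulli(p, value):
    """Probability that a boolean variable with P(True) = p equals value."""
    return p if value else 1.0 - p

# Full joint distribution over the worlds (rain, sprinkler, grass_wet).
joint = {}
for rain, sprinkler, wet in product([False, True], repeat=3):
    joint[(rain, sprinkler, wet)] = (
        bernoulli(P_RAIN, rain)
        * bernoulli(P_SPRINKLER[rain], sprinkler)
        * bernoulli(P_WET[(sprinkler, rain)], wet))

def prob(event, given=lambda world: True):
    """P(event | given), computed by summing over the matching worlds."""
    num = sum(p for w, p in joint.items() if given(w) and event(w))
    den = sum(p for w, p in joint.items() if given(w))
    return num / den

# Backward reasoning: rain given wet grass.
p_rain_given_wet = prob(lambda w: w[0], given=lambda w: w[2])
# Forward reasoning: wet grass given rain.
p_wet_given_rain = prob(lambda w: w[2], given=lambda w: w[0])

print(round(p_rain_given_wet, 4))  # 0.3577
print(round(p_wet_given_rain, 4))  # 0.8019
```

Enumeration like this grows exponentially with the number of variables, which is why a dedicated inference engine is preferable for larger models.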

The floating-point number type is a standard, although limited, way to represent probabilities. Lea 3 offers alternative representations, which can be more expressive in some domains and which are very easy to set up. For example, you could use fractions:

    flip1_frac = lea.pmf({ 'Head': '75/100', 'Tail': '25/100' })
    flip2_frac = flip1_frac.new()
    flips_frac = lea.joint(flip1_frac,flip2_frac)
    print(flips_frac)
    # -> ('Head', 'Head') : 9/16
    #    ('Head', 'Tail') : 3/16
    #    ('Tail', 'Head') : 3/16
    #    ('Tail', 'Tail') : 1/16
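The benefit of fractions can be illustrated with Python's standard `fractions` module alone. This sketch does not show how Lea handles fractions internally; it only reproduces the arithmetic of the example with exact rational numbers.

```python
from fractions import Fraction
from itertools import product

# Fraction accepts the same 'numerator/denominator' strings and
# reduces them automatically (75/100 becomes 3/4).
coin = {'Head': Fraction('75/100'), 'Tail': Fraction('25/100')}

joint = {(a, b): coin[a] * coin[b] for a, b in product(coin, repeat=2)}
for pair, prob in joint.items():
    print(pair, ':', prob)
# ('Head', 'Head') : 9/16
# ('Head', 'Tail') : 3/16
# ('Tail', 'Head') : 3/16
# ('Tail', 'Tail') : 1/16

# With fractions the total is exactly 1, with no rounding error.
total = sum(joint.values())
print(total == 1)  # True
```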

You could also put variable names, which enables *symbolic computation* of probabilities (requires the SymPy library):

    flip1_sym = lea.pmf({ 'Head': 'p', 'Tail': None })
    flip2_sym = lea.pmf({ 'Head': 'q', 'Tail': None })
    print(flip1_sym)
    # -> Head : p
    #    Tail : -p + 1
    print(P(flip1_sym == flip2_sym))
    # -> 2*p*q - p - q + 1
    flips_sym = lea.joint(flip1_sym,flip2_sym)
    print(flips_sym)
    # -> ('Head', 'Head') : p*q
    #    ('Head', 'Tail') : -p*(q - 1)
    #    ('Tail', 'Head') : -q*(p - 1)
    #    ('Tail', 'Tail') : (p - 1)*(q - 1)
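The symbolic result for `P(flip1_sym == flip2_sym)` can be checked without SymPy. The sketch below compares the first-principles formula (both heads or both tails) with the expanded expression printed above, using exact fractions on a small grid of values; the function names are made up for this illustration.

```python
from fractions import Fraction

def p_equal(p, q):
    """P(same face) from first principles: both heads or both tails."""
    return p * q + (1 - p) * (1 - q)

def p_equal_expanded(p, q):
    """The expanded expression shown in the symbolic example."""
    return 2 * p * q - p - q + 1

# Exact comparison on a grid of rational probabilities 0, 1/10, ..., 1.
grid = [Fraction(k, 10) for k in range(11)]
formulas_agree = all(p_equal(p, q) == p_equal_expanded(p, q)
                     for p in grid for q in grid)
print(formulas_agree)  # True

# With p = q = 3/4 we get back the 0.625 of the first coin example.
print(p_equal(Fraction(3, 4), Fraction(3, 4)))  # 5/8
```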

# To learn more...

The above examples show only a very, very small subset of Lea 3 capabilities. To learn more, you can read:

- Lea 3 Tutorial [1/3] - basics: building/displaying pmf, arithmetic, random sampling, conditional probabilities, …
- Lea 3 Tutorial [2/3] - standard distributions, joint distributions, Bayesian networks, Markov chains, changing probability representation, …
- Lea 3 Tutorial [3/3] - plotting, drawing without replacement, machine learning, information theory, MC estimation, symbolic computation, …
- Lea 3 Examples

Note that the Lea 2 tutorials are still available, although they are no longer maintained. You can also get the Lea 2 presentation materials (note, however, that the syntax of Lea 3 is *not backward compatible*):

- Lea, a probability engine in Python - presented at FOSDEM 15/Python devroom
- Probabilistic Programming with Lea - presented at PyCon Ireland 15

## On the algorithm …

The very beating heart of Lea resides in the *Statues* algorithm, which is a new exact probabilistic marginalization algorithm used for almost all probability calculations of Lea. If you want to understand how this algorithm works, then you may read a short introduction or have a look at MicroLea, an independent Python implementation that is much shorter and much simpler than Lea. For a more academic description, the paper "Probabilistic inference using generators - the Statues algorithm" presents the algorithm in a general and language-independent manner.
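To give a feeling for what "exact inference based on generators" can look like, here is a toy sketch. It is *not* the Statues algorithm and not Lea's implementation: the classes `Var` and `Func` and the function `pmf` are hypothetical names invented for this illustration, and the approach is a naive enumeration. The one idea it tries to show is that a variable occurring several times in an expression must keep the same value during enumeration, which the sketch achieves by binding the value while the generator is suspended.

```python
from fractions import Fraction

class Var:
    """Elementary random variable defined by an explicit pmf (dict)."""
    def __init__(self, pmf):
        self.pmf = pmf
        self.bound = False
        self.value = None

    def gen(self):
        if self.bound:
            # Already being enumerated higher up: reuse the bound value.
            yield self.value, 1
            return
        self.bound = True
        try:
            for value, prob in self.pmf.items():
                self.value = value
                yield value, prob
        finally:
            self.bound = False

class Func:
    """Random variable obtained by applying func to other variables."""
    def __init__(self, func, *args):
        self.func = func
        self.args = args

    def gen(self):
        def combos(args):
            # Yield every consistent tuple of argument values with its probability.
            if not args:
                yield (), 1
                return
            for value, prob in args[0].gen():
                for values, probs in combos(args[1:]):
                    yield (value,) + values, prob * probs
        for values, prob in combos(self.args):
            yield self.func(*values), prob

def pmf(node):
    """Collect the generated (value, probability) pairs into a dict."""
    result = {}
    for value, prob in node.gen():
        result[value] = result.get(value, 0) + prob
    return result

die = Var({k: Fraction(1, 6) for k in range(1, 7)})
other_die = Var(die.pmf)  # same pmf, but an independent variable

same = pmf(Func(lambda a, b: a - b, die, die))        # die - die
diff = pmf(Func(lambda a, b: a - b, die, other_die))  # two distinct dice

print(same)     # {0: Fraction(1, 1)}
print(diff[0])  # 1/6
```

Because `die` is bound during the outer loop, `die - die` is always 0, whereas the difference of two independent dice is 0 with probability 1/6. For the real algorithm, refer to the introduction, MicroLea and the paper mentioned above.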

# Bugs / enhancements / feedback / references …

If you have enhancements to propose or if you discover bugs, you are kindly invited to create an issue on the Lea Bitbucket page. All issues will be answered!

Don't hesitate to send your comments, questions, … to pie.denis@skynet.be, in English or French. You are welcome / *bienvenus* !

Also, if you use Lea in your development or research, please tell us about it, so that your experience can be shared and the project can gain recognition. Thanks!