Lea 3 Examples
Table of Contents
- Introduction
- Sicherman dice
- Nontransitive dice
- Name generator
- The Monty Hall problem
- Job scheduling - 1 (integration with datetime)
- Job scheduling - 2 (Bayesian reasoning)
- The 3 coins problem
- Lottery
- Character frequency counter
- The lifetime problem
- 2D random walk
- RPG combat
- Bayesian network - the Rain-Sprinkler-Grass example
- A murder case
- The problem of Chevalier de Méré
- Rock-paper-scissors game
- von Neumann extractor
- Central limit law
- Bullshit generator
Introduction
The following page provides several examples using Lea 3.
Unless otherwise specified, all the examples just require the Lea 3 module on a standard Python 2.6+ or 3 installation. They can be executed in a Python console or saved in a .py file to be executed. It is assumed that the following import has been done:
import lea
Most of the examples presented here just require basic knowledge of the Python language. If some Lea constructs are unclear, then you should refer to the tutorial pages (1, 2 or 3).
Sicherman dice
I discovered the strange Sicherman dice in Martin Gardner's column in Scientific American.
Here is how you can verify that they result in the same probability distribution as two normal dice.
lea.set_prob_type('r')
die_s1 = lea.vals(1,2,2,3,3,4)
die_s2 = lea.vals(1,3,4,5,6,8)
print (die_s1 + die_s2)
# ->  2 : 1/36
#     3 : 2/36
#     4 : 3/36
#     5 : 4/36
#     6 : 5/36
#     7 : 6/36
#     8 : 5/36
#     9 : 4/36
#    10 : 3/36
#    11 : 2/36
#    12 : 1/36
Nontransitive dice
Nontransitive dice are an intriguing mathematical curiosity. It is another counterintuitive fact in the domain of probabilities.
In the following example, we consider the 3 special dice A, B and C (see face labels below).
lea.set_prob_type('r')
die_a = lea.vals(3,3,5,5,7,7)
die_b = lea.vals(2,2,4,4,9,9)
die_c = lea.vals(1,1,6,6,8,8)
These dice have the following nontransitive property:
- die A beats die B, with probability 5/9
- die B beats die C, with probability 5/9
- die C beats die A, with probability 5/9
print (P(die_a > die_b))
# -> 5/9
print (P(die_b > die_c))
# -> 5/9
print (P(die_c > die_a))
# -> 5/9
Other examples (Grime's dice), with thorough explanation and video, can be found here.
Name generator
Just for fun, let us generate random (hype?) company/product names...
prefix = lea.vals('Alpha', 'Binary', 'Bit', 'Blue', 'Brain', 'Cool', 'Cosmo', 'Cross', 'Cyber', 'Deep', 'Delta', 'Dual', 'Easy', 'Eye', 'Face', 'Fast', 'Free', 'Good', 'High', 'Hot', 'Inner', 'Key', 'Life', 'Light', 'Lotus', 'Lucky', 'Magic', 'Master', 'Mech', 'Metal', 'Micro', 'Mini', 'Mister', 'Nano', 'Net', 'Plane', 'Pop', 'Power', 'Purple', 'Py', 'Red', 'Small', 'Soft', 'Source', 'Space', 'Spicy', 'Star', 'Steel', 'Sunny', 'Tele', 'Thin', 'Tiny', 'Token', 'Top', 'Turbo', 'Uber', 'Watch', 'Well', 'Wild', 'X-', 'X-', 'Zero') suffix = lea.vals('-Z', 'X', 'ball', 'beat', 'bird', 'bit', 'boo', 'book', 'bot', 'box', 'cat', 'cell', 'cloud', 'club', 'core', 'data', 'doh', 'draw', 'drive', 'finger', 'fish', 'flow', 'fly', 'foo', 'food', 'force', 'forge', 'frame', 'free', 'fun', 'head', 'house', 'hub', 'lea', 'line', 'lith', 'look', 'love', 'magic', 'mango', 'mount', 'mountain', 'node', 'note', 'one', 'phone', 'pix', 'plane', 'planet', 'ray', 'ring', 'robot', 'room', 'scape', 'side', 'sky', 'soft', 'source', 'space', 'star', 'strike', 'sun', 'tag', 'talk', 'tals', 'token', 'vibe', 'view', 'vision', 'ware', 'word', 'work', 'world', 'zen') name = prefix + suffix
print (name.random_mc(100)) # -> 'Crossframe', 'Magicmountain', 'Bittag', 'Hotfoo', 'Spacebit', 'Thinfinger', 'Facemango', 'Eyeside', 'Powerball', 'Powerbeat', 'Innermount', 'Mistertag', 'Hotfinger', 'Tokenfly', 'Starside', 'Dualsoft', 'Turbolove', 'Lifepix', 'X-planet', 'Alphabox', 'Spacevibe', 'Lightlith', 'Binarystrike', 'Magicscape', 'Highforce', 'Thinscape', 'Wildpix', 'Innerrobot', 'Hotcat', 'Spicydrive', 'Dualbeat', 'Deepmount', 'Watchnote', 'Tokenmango', 'Microdraw', 'Bitworld', 'Crossdrive', 'Coolfish', 'Nettag', 'Softfood', 'X-star', 'Uberfood', 'Masterscape', 'Ubersoft', 'Keyplane', 'Keystrike', 'Goodfood', 'Bluefun', 'Eyevision', 'Popview', 'Cosmosource', 'Spacespace', 'Lotusray', 'Coolsource', 'Purplemango', 'Dualdrive', 'Hotcell', 'Watchcat', 'Steelbeat', 'Coolcat', 'Lifehub', 'Blueware', 'Topsky', 'Planepix', 'Thincell', 'Fastlove', 'CyberX', 'Steelbird', 'Lighthub', 'Starphone', 'Cyberfun', 'Mechware', 'Luckydoh', 'Keylove', 'Uberhouse', 'Mechcore', 'Keystrike', 'Masterrobot', 'Microlook', 'Easyhub', 'Brainvibe', 'Sunnyboo', 'Crosssky', 'Cyberplane', 'Dualwork', 'Lifemount', 'Deepfoo', 'Keyring', 'Innerclub', 'Metaltalk', 'Innerbox', 'Starwork', 'Thinview', 'Tokenbeat', 'Starsky', 'Lotusdraw', 'Softbox', 'Zerobeat', 'Tinytag', 'Facebox')
By the way, do you know why we preferred random_mc over random? The answer is here.
A slight variation consists in capitalizing the first character of suffix.
cap_name = prefix + suffix.title() print (cap_name.random_mc(100)) # -> ('FreeBall', 'EasyMango', 'SpicyBot', 'LuckyFinger', 'X-Box', 'SoftLea', 'MechStrike', 'SunnyLine', 'InnerX', 'PowerSpace', 'PowerHead', 'WatchStar', 'GoodRing', 'SpicyTals', 'TokenFrame', 'SmallMagic', 'ThinSky', 'NanoRing', 'X-Planet', 'CosmoView', 'MiniToken', 'UberCat', 'WatchSide', 'BrainPhone', 'InnerCore', 'NetSoft', 'LuckyMango', 'TokenLove', 'CosmoVibe', 'BitCell', 'FreeHouse', 'HighLove', 'WildView', 'NetWord', 'SteelCell', 'TurboFly', 'CyberHub', 'AlphaBall', 'X-Ware', 'MasterPix', 'MechClub', 'MechWork', 'FastPlanet', 'ThinTals', 'BlueFly', 'SpaceBird', 'SmallFlow', 'BinaryStrike', 'BrainHead', 'NetCell', 'WildRoom', 'DualBoo', 'TeleLine', 'DeltaSoft', 'LotusFrame', 'FastSpace', 'AlphaNote', 'UberNote', 'SmallClub', 'PowerZen', 'MisterSpace', 'SunnyRing', 'BitLook', 'WellBoo', 'NetCell', 'MicroLine', 'StarTals', 'NetFrame', 'WellPlane', 'TeleBird', 'SunnyMount', 'MechHouse', 'MetalFly', 'HotPlanet', 'EyeView', 'BlueMagic', 'Face-Z', 'MetalBook', 'PopVision', 'EyeLea', 'RedBird', 'LotusOne', 'FreeSource', 'FastLith', 'FaceCloud', 'FreeClub', 'MasterLook', 'InnerBook', 'AlphaPlane', 'MasterToken', 'SpaceClub', 'KeyWare', 'BinaryBot', 'SoftFrame', 'HighTals', 'DualMount', 'MetalRoom', 'PyClub', 'EyeFlow', 'InnerBox')
If you enjoy this kind of random verbalization, you should have a look at a far more elaborate example, namely the bullshit generator.
The Monty Hall problem
The Monty Hall problem is one of the greatest examples of a counter-intuitive probability result. Let's start by defining door_with_prize as the unknown door hiding the prize and door_chosen_1st as the initial door chosen by the player.
lea.set_prob_type('r')
doors = ('Door 1', 'Door 2', 'Door 3')
door_with_prize = lea.vals(*doors)
door_chosen_1st = lea.vals(*doors)
print (P(door_chosen_1st==door_with_prize)) # -> 1/3
Now, Monty Hall opens an empty door, different from the one chosen by the player. The door he opens, door_open_empty, can be modeled as a CPT on the joint distribution (door_with_prize, door_chosen_1st), using the switch method:
door_open_empty = lea.joint(door_with_prize, door_chosen_1st).switch(
      { ( 'Door 1' , 'Door 1') : lea.vals('Door 2','Door 3'),
        ( 'Door 1' , 'Door 2') : 'Door 3',
        ( 'Door 1' , 'Door 3') : 'Door 2',
        ( 'Door 2' , 'Door 2') : lea.vals('Door 1','Door 3'),
        ( 'Door 2' , 'Door 1') : 'Door 3',
        ( 'Door 2' , 'Door 3') : 'Door 1',
        ( 'Door 3' , 'Door 3') : lea.vals('Door 1','Door 2'),
        ( 'Door 3' , 'Door 1') : 'Door 2',
        ( 'Door 3' , 'Door 2') : 'Door 1' })
Now, we are ready to evaluate the probability of winning by changing the initial choice. Suppose for example that the player has chosen Door 1 and Monty Hall has opened Door 2:
print (door_with_prize.given(door_chosen_1st=='Door 1', door_open_empty=='Door 2'))
# -> Door 1 : 1/3
#    Door 3 : 2/3
As of Lea 3.1, there is a better approach for creating door_open_empty, which avoids the burden of creating a CPT. It uses the switch_func method, to which we pass a function that expresses the way Monty Hall selects the door to open:
def selected_door(eliminated_doors):
    return lea.vals(*(d for d in doors if d not in eliminated_doors))

door_open_empty = lea.joint(door_with_prize, door_chosen_1st).switch_func(selected_door)
You could also use the selected_door function to create the dictionary passed to the switch method (see the sketch below). Note however that switch_func is called on the fly, without expanding the CPT; so it is more memory-efficient than the switch method (this may be interesting for larger problems).
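As a small sketch of the remark above (an illustration added here, not part of the original example), the dictionary passed to switch could itself be built from selected_door, giving the same CPT as the one written explicitly:
# sketch: build the switch dictionary programmatically from selected_door;
# unlike switch_func, this expands the full CPT in memory
cpt_dict = { (dwp, dc1): selected_door((dwp, dc1))
             for dwp in doors for dc1 in doors }
door_open_empty2 = lea.joint(door_with_prize, door_chosen_1st).switch(cpt_dict)
print (door_with_prize.given(door_chosen_1st=='Door 1', door_open_empty2=='Door 2'))
# -> Door 1 : 1/3
#    Door 3 : 2/3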
Job scheduling - 1 (integration with datetime)
Suppose you have to do 2 tasks T1 and T2, which have uncertain durations. After careful evaluation, you estimate the following probability distributions for the duration of each task:
- T1 : 2 days (20%), 3 days (60%), 4 days (20%)
- T2 : 3 days (20%), 4 days (80%)
To handle time calculation, we shall use the handy datetime built-in Python module. For convenience, we define the constant DAY representing the duration of one day:
import datetime
lea.set_prob_type('f')
DAY = datetime.timedelta(1)
t1 = lea.pmf({2: 0.2, 3: 0.6, 4: 0.2}) * DAY
t2 = lea.pmf({3: 0.2, 4: 0.8}) * DAY
print (t1)
# -> 2 days, 0:00:00 : 0.2
#    3 days, 0:00:00 : 0.6
#    4 days, 0:00:00 : 0.2
print (t2)
# -> 3 days, 0:00:00 : 0.2
#    4 days, 0:00:00 : 0.8
Assuming that T1 starts on September 28, 2020 and that T2 starts as soon as T1 is finished, the end time is obtained by adding both durations to the start time:
start_time = datetime.datetime(2020,9,28)
end_time = start_time + t1 + t2
print (end_time)
# -> 2020-10-03 00:00:00 : 0.04000000000000001
#    2020-10-04 00:00:00 : 0.28
#    2020-10-05 00:00:00 : 0.52
#    2020-10-06 00:00:00 : 0.16000000000000003
print (end_time.mean) # -> 2020-10-04 19:12:00
Note that the mean method works here by calculating the mean of the differences between each value and the first value; this mean difference is then added to the first value. With numbers, this gives the same result as the simple algorithm, with the advantage of avoiding any risk of overflow; with datetime objects, it actually sums timedelta objects (i.e. durations), thus avoiding a raised exception (see the sketch below).
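To illustrate this, here is a minimal sketch (not Lea's actual code) reproducing the same mean by hand; the (value, probability) pairs below are simply copied from the end_time distribution shown above:
import datetime
# (value, probability) pairs copied from the end_time distribution above
pairs = [(datetime.datetime(2020,10,3), 0.04),
         (datetime.datetime(2020,10,4), 0.28),
         (datetime.datetime(2020,10,5), 0.52),
         (datetime.datetime(2020,10,6), 0.16)]
first = pairs[0][0]
# weighted mean of the differences to the first value (these are timedelta objects)
mean_delta = sum(((v - first) * p for (v,p) in pairs), datetime.timedelta(0))
print (first + mean_delta)
# -> 2020-10-04 19:12:00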
Here is how to produce figures on the overall duration of the tasks:
print (end_time - start_time)
# -> 5 days, 0:00:00 : 0.04000000000000001
#    6 days, 0:00:00 : 0.28
#    7 days, 0:00:00 : 0.52
#    8 days, 0:00:00 : 0.16000000000000003
print ((end_time - start_time).mean)
# -> 6 days, 19:12:00
Suppose now that the two tasks can be executed in parallel, starting at the same time: the end time then depends on the longer of the two durations. Note that Python's built-in max(...) function does not know how to handle Lea instances (actually, it produces a wrong result); you have to use the lea.max_of(...) function:
end_time_p = start_time + lea.max_of(t1,t2)
print (end_time_p)
# -> 2020-10-01 00:00:00 : 0.16
#    2020-10-02 00:00:00 : 0.8400000000000001
print (end_time_p.mean)
# -> 2020-10-01 20:09:36
Job scheduling - 2 (Bayesian reasoning)
We present here another job scheduling problem with tasks having uncertain durations. In contrast to Job scheduling #1, the focus is now on Bayesian reasoning.
There are 3 tasks to schedule: A, B and C. There is only one precedence constraint: task B shall not be started before the end of task A; we assume that there are enough resources to execute two tasks in parallel.
The durations of tasks A and B are characterized by known pmfs (see below); the duration of task C depends on three possible scenarios, viz. CONSERVATIVE / EVOLUTIVE / DISRUPTIVE, each having a known probability of happening; the duration of task C is then modeled as a CPT giving a specific pmf for each scenario. We then want to calculate the shortest makespan and the total effort spent to complete the 3 tasks.
d_A = lea.pmf({ 3: '0.10', 4: '0.80', 5: '0.10' })
d_B = lea.pmf({ 2: '0.50', 3: '0.50' })
s = lea.pmf({'CONSERVATIVE': '0.60', 'EVOLUTIVE': '0.30', 'DISRUPTIVE': '0.10'})
d_C = s.switch({ 'CONSERVATIVE': lea.pmf({ 2: '0.70', 3: '0.30' }),
                 'EVOLUTIVE'   : lea.pmf({ 3: '0.50', 4: '0.50' }),
                 'DISRUPTIVE'  : lea.pmf({ 7: '0.20', 8: '0.70', 9: '0.10' }) })
makespan = lea.max_of(d_A+d_B,d_C)
efforts = d_A + d_B + d_C
The makespan is obtained by applying the lea.max_of function on the two possible paths (A followed by B, or C alone). From this probabilistic job scheduling model, one can now make several queries to calculate the probability distributions of makespan and efforts, possibly integrating new information, assumptions or constraints.
print (makespan)
# -> 5 : 0.045
#    6 : 0.405
#    7 : 0.424
#    8 : 0.116
#    9 : 0.01
print (makespan.given(s == 'CONSERVATIVE'))
# -> 5 : 0.05
#    6 : 0.45
#    7 : 0.45
#    8 : 0.05
print (makespan.given(s != 'CONSERVATIVE'))
# -> 5 : 0.0375
#    6 : 0.3375
#    7 : 0.385
#    8 : 0.215
#    9 : 0.025
print (efforts.given((s == 'DISRUPTIVE') & (efforts <= 14)))
# -> 12 : 0.01834862385321100917431192661
#    13 : 0.2293577981651376146788990826
#    14 : 0.7522935779816513761467889908
print (makespan.given(efforts == 8))
# -> 5 : 0.08029197080291970802919708029
#    6 : 0.9197080291970802919708029197
print (s.given((makespan <= 7) & (efforts <= 9)))
# -> CONSERVATIVE : 0.8556430446194225721784776903
#    EVOLUTIVE    : 0.1443569553805774278215223097
print (s.given(makespan == 9))
# -> DISRUPTIVE : 1
print (lea.joint(d_A,d_B,d_C).given((makespan == 6) & (s == 'CONSERVATIVE')))
# -> (3, 3, 2) : 0.07777777777777777777777777778
#    (3, 3, 3) : 0.03333333333333333333333333333
#    (4, 2, 2) : 0.6222222222222222222222222222
#    (4, 2, 3) : 0.2666666666666666666666666667
print (lea.joint(d_A,d_B,d_C).given((makespan == 5) & (s == 'CONSERVATIVE')))
# -> (3, 2, 2) : 0.7
#    (3, 2, 3) : 0.3
The 3 coins problem
(taken from Visual Explanations of Probabilistic Reasoning)
Three coins are flipped. Given that two of the coins have come up heads, what is the probability that the third has come up tails? Many people respond that the probability is 50% but...
lea.set_prob_type('r')
flip = lea.vals('Head','Tail')
flip3 = flip.draw(3,replacement=True)
nb_heads = flip3.count('Head')
print (Pf((nb_heads == 2).given(nb_heads >= 2)))
# -> 0.75
In order to convince yourself that this result is right, you can make several queries to get clues on intermediate steps.
print (flip3)
# -> ('Head', 'Head', 'Head') : 1/8
#    ('Head', 'Head', 'Tail') : 1/8
#    ('Head', 'Tail', 'Head') : 1/8
#    ('Head', 'Tail', 'Tail') : 1/8
#    ('Tail', 'Head', 'Head') : 1/8
#    ('Tail', 'Head', 'Tail') : 1/8
#    ('Tail', 'Tail', 'Head') : 1/8
#    ('Tail', 'Tail', 'Tail') : 1/8
print (lea.joint(nb_heads,flip3))
# -> (0, ('Tail', 'Tail', 'Tail')) : 1/8
#    (1, ('Head', 'Tail', 'Tail')) : 1/8
#    (1, ('Tail', 'Head', 'Tail')) : 1/8
#    (1, ('Tail', 'Tail', 'Head')) : 1/8
#    (2, ('Head', 'Head', 'Tail')) : 1/8
#    (2, ('Head', 'Tail', 'Head')) : 1/8
#    (2, ('Tail', 'Head', 'Head')) : 1/8
#    (3, ('Head', 'Head', 'Head')) : 1/8
print (lea.joint(nb_heads,flip3).given(nb_heads >= 2))
# -> (2, ('Head', 'Head', 'Tail')) : 1/4
#    (2, ('Head', 'Tail', 'Head')) : 1/4
#    (2, ('Tail', 'Head', 'Head')) : 1/4
#    (3, ('Head', 'Head', 'Head')) : 1/4
print (nb_heads.given(nb_heads >= 2))
# -> 2 : 3/4
#    3 : 1/4
print (P((nb_heads == 2).given(nb_heads >= 2)))
# -> 3/4
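If you are still not convinced, here is a small cross-check by random sampling (a sketch added here for illustration; the exact figure varies from run to run):
# draw 10,000 samples of the conditional distribution and count the 'True' outcomes
sample = (nb_heads == 2).given(nb_heads >= 2).random(10000)
print (sample.count(True) / 10000.0)
# -> about 0.75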
Lottery
Here is how to draw 6 numbers among 46, without replacement:
lottery = lea.interval(1,46)
lottery.random_draw(6)
# -> (13, 24, 25, 10, 1, 26)
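As a side note (an addition for illustration, not part of the original snippet), the probability that a given 6-number grid matches such a draw exactly is one over the number of combinations:
from math import comb          # Python 3.8+
print (1 / comb(46, 6))
# -> about 1.07e-07  (1 chance out of 9,366,819)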
Character frequency counter
In the following example, we count the number of occurrences of each letter in a famous Latin sentence. This shows how a frequency analysis can be done.
lea.set_prob_type('r')
freq_latin = lea.vals(*"ALEA JACTA EST")
print (freq_latin)
# ->   : 2/14
#    A : 4/14
#    C : 1/14
#    E : 2/14
#    J : 1/14
#    L : 1/14
#    S : 1/14
#    T : 2/14
print (''.join(freq_latin.random(20))) # -> 'AJ LTC TAJA ATLEALJ'
The lifetime problem
Suppose we have a device that has a 1% chance of breaking every time it is switched on. Let u represent the event that it is not broken after one switch-on:
lea.set_prob_type('f')
u = lea.event(0.99)
print (P(u))
# -> 0.99
The probability that it is still working after two switch-ons is:
P(u & u.new())
# -> 0.9801
And after 10 switch-ons, using the times method with the logical and operator:
import operator
print (P(u.times(10,operator.and_)))
# -> 0.9043820750088044
This is consistent with the direct calculation:
print (0.99 ** 10)
# -> 0.9043820750088044
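Building on this (a small sketch added for illustration), the same construct gives the survival probability after any number of switch-ons:
import operator
def p_still_working(n):
    # probability that the device survives n switch-ons
    return P(u.times(n, operator.and_))
print (p_still_working(100))
# -> about 0.366 (i.e. 0.99**100)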
2D random walk
We can use Python complex numbers to model a body (e.g. a drunk rabbit) that moves randomly in a plane: in the complex plane, the North direction is the imaginary 1j value, the East direction is the value 1, etc. Let us assume that the body moves at each time step East, North, West or South, with equal probabilities:
lea.set_prob_type('r')
b = lea.vals(1,1j,-1,-1j)
print (b)
# -> 1   : 1/4
#    -1j : 1/4
#    1j  : 1/4
#    -1  : 1/4
Here is the distribution of the positions after 3 time steps, relative to the starting point:
print (b.times(3))
# -> 1        : 9/64
#    1j       : 9/64
#    (-1+2j)  : 3/64
#    (2+1j)   : 3/64
#    (1+2j)   : 3/64
#    3j       : 1/64
#    (-2-1j)  : 3/64
#    3        : 1/64
#    (-2+1j)  : 3/64
#    (-0-3j)  : 1/64
#    (-1-2j)  : 3/64
#    -1j      : 9/64
#    (1-2j)   : 3/64
#    (2-1j)   : 3/64
#    -3       : 1/64
#    -1       : 9/64
print (P(abs(b.times(3)) <= 2.5)) # -> 15/16
print (abs(b.times(3)))
# -> 1                : 9/16
#    2.23606797749979 : 6/16
#    3.0              : 1/16
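Going one step further (a sketch under the same model, added for illustration; the exact figure varies from run to run), the mean distance from the starting point after 10 steps can be estimated by Monte-Carlo sampling:
# estimate the mean distance after 10 time steps from 10,000 random samples
distances = abs(b.times(10)).random_mc(10000)
print (sum(distances) / len(distances))
# -> about 2.8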
RPG combat
Imagine a Role-Playing Game combat situation opposing you, a warrior, against a giant troll. We shall use here a simplified version of the "d20 System" (Open Game License). To know whether you can hit the troll, you have to calculate your "attack roll" (AR) and the "armour class" (AC) of the troll; if your AR is greater than or equal to the AC of the troll, then you hit him and cause him some "damage".
Let us first define the dice we need (using RPG standards: "D6", "2D6" and "D20"):
lea.set_prob_type('r')
D6 = lea.interval(1,6)
T2D6 = D6.times(2)
D20 = lea.interval(1,20)
Assume that the troll's armour class is 24 and that your attack roll is a D20 roll plus a bonus of 8:
target_armor_class = 24
attack_roll = D20 + 8
hit0 = attack_roll >= target_armor_class
print (P(hit0))
# -> 1/4
The d20 System adds a special rule: a roll of 1 is always a miss and a roll of 20 is always a hit, whatever the armour class. This can be modeled with the switch method, using hit0 as the default case:
hit = D20.switch( {  1: False,
                    20: True },
                  hit0 )
print (P(hit)) # -> 1/4
You can see that this special rule has no effect here on the probability compared to hit0, but it could have on a more unbalanced combat.
Note that you could model the same rule using the cpt method:
hit = lea.cpt( ( D20 == 1 , False ),
               ( D20 == 20, True  ),
               ( None     , hit0  ))
or, more simply, using Boolean operators:
hit = (hit0 & (D20 != 1)) | (D20 == 20)
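As a quick sanity check (a sketch added for illustration), the three formulations above define the same probability:
hit_a = D20.switch({ 1: False, 20: True }, hit0)
hit_b = lea.cpt((D20 == 1, False), (D20 == 20, True), (None, hit0))
hit_c = (hit0 & (D20 != 1)) | (D20 == 20)
print (P(hit_a), P(hit_b), P(hit_c))
# -> 1/4 1/4 1/4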
Now, if you can hit the troll, your magic sword shall cause him a damage of 2D6 + a bonus of 3:
damage_if_hit = T2D6 + 3
print (damage_if_hit)
# ->  5 : 1/36
#     6 : 2/36
#     7 : 3/36
#     8 : 4/36
#     9 : 5/36
#    10 : 6/36
#    11 : 5/36
#    12 : 4/36
#    13 : 3/36
#    14 : 2/36
#    15 : 1/36
print (damage_if_hit.mean_f)
# -> 10.0
If you want to know the overall damage distribution, considering that the hit itself is uncertain, you could define a CPT as above or simply use the lea.if_ convenience function:
damage = lea.if_( hit, damage_if_hit, 0)
print (damage)
# ->  0 : 108/144
#     5 : 1/144
#     6 : 2/144
#     7 : 3/144
#     8 : 4/144
#     9 : 5/144
#    10 : 6/144
#    11 : 5/144
#    12 : 4/144
#    13 : 3/144
#    14 : 2/144
#    15 : 1/144
print (damage.mean_f)
# -> 2.5
Here is the distribution of the total damage for two such attacks:
print (damage.times(2))
# ->  0 : 11664/20736
#     5 : 216/20736
#     6 : 432/20736
#     7 : 648/20736
#     8 : 864/20736
#     9 : 1080/20736
#    10 : 1297/20736
#    11 : 1084/20736
#    12 : 874/20736
#    13 : 668/20736
#    14 : 467/20736
#    15 : 272/20736
#    16 : 80/20736
#    17 : 104/20736
#    18 : 125/20736
#    19 : 140/20736
#    20 : 146/20736
#    21 : 140/20736
#    22 : 125/20736
#    23 : 104/20736
#    24 : 80/20736
#    25 : 56/20736
#    26 : 35/20736
#    27 : 20/20736
#    28 : 10/20736
#    29 : 4/20736
#    30 : 1/20736
The same distribution can be displayed with floating-point probabilities:
print (damage.times(2).new('f'))
# -> 0 : 0.5624999999999999
#    5 : 0.010416666666666664
#    6 : 0.02083333333333333
# …
If the troll has only 4 health points left, the probability of finishing him off with these two attacks is:
health_points = 4
print (P(health_points - damage.times(2) <= 0))
# -> 7/16
Finally, here is a sample of 20 random damage values (single attacks):
print (damage.random(20))
# -> (0, 0, 0, 0, 11, 0, 9, 7, 0, 0, 0, 10, 0, 0, 13, 0, 0, 9, 0, 0)
Bayesian network - the Rain-Sprinkler-Grass example
We refer here to the Bayesian network example given on Wikipedia. We show hereafter one possible translation of this model into Lea:
# -- rsg_net.py --------
import lea
rain = lea.event(0.20)
sprinkler = lea.if_(rain, lea.event(0.01),
                          lea.event(0.40))
grassWet = lea.joint(sprinkler,rain).switch({ (False,False): False,
                                              (False,True ): lea.event(0.80),
                                              (True ,False): lea.event(0.90),
                                              (True ,True ): lea.event(0.99)})
from rsg_net import *
print (P(grassWet & sprinkler & rain))
# -> 0.0019799999999999996
print (P(rain.given(grassWet)))
# -> 0.35768767563227616
print (P(grassWet.given(rain)))
# -> 0.8019000000000001
print (P(grassWet.given(sprinkler & ~rain)))
# -> 0.9000000000000001
print (P(grassWet.given(~sprinkler & ~rain)))
# -> 0.0
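One more query on the same network (added here as an illustration): the probability that the sprinkler was on, given that the grass is wet:
print (P(sprinkler.given(grassWet)))
# -> about 0.6467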
A murder case
This is a simple fictional example of using Lea's Bayesian reasoning for forensic science.
Here are the facts, which are certain:
- Dr. Black was murdered last night.
- There are three suspects: Colonel Mustard, Mrs. White and Mrs. Peacock. The killer is one of these and only one.
Let us assume the following respective prior probabilities of guilt: Colonel Mustard 60%, Mrs. White 30%, Mrs. Peacock 10%. Let us then define the killer variable.
killer = lea.pmf({ "Colonel Mustard": 0.60,
                   "Mrs. White"     : 0.30,
                   "Mrs. Peacock"   : 0.10 })
print (killer.as_pct())
# -> Colonel Mustard : 60.0 %
#    Mrs. Peacock    : 10.0 %
#    Mrs. White      : 30.0 %
The inquiry provides the following uncertain rules:
- if Mrs. White is the killer, then she'll be absent with probability 90%, else she'll be absent with probability 20%;
- if Mrs. Peacock is innocent, then she knows who the killer is with probability 75%;
- if Mrs. Peacock is the killer or if she knows who the killer is, then she'll be absent with probability 99%, else she'll be absent with probability 10%.
mrs_white_is_absent = \
    lea.if_( killer == "Mrs. White",
             lea.event(0.90),
             lea.event(0.20))
mrs_peacock_knows_who_is_killer = \
    lea.if_( killer != "Mrs. Peacock",
             lea.event(0.75),
             True )
mrs_peacock_is_absent = \
    lea.if_( (killer == "Mrs. Peacock") | mrs_peacock_knows_who_is_killer,
             lea.event(0.99),
             lea.event(0.10))
Note how we state, with the True value, that Mrs. Peacock CERTAINLY knows who the killer is if she is HERSELF the killer (a sanity assumption, missing from the given rules)!
Let's imagine that we now get new pieces of evidence, one by one. Let us examine how the culpability probability distribution evolves as the evidence is added.
EVIDENCE 1: Mrs. Peacock is absent.
evidences = [mrs_peacock_is_absent]
print (killer.given(*evidences).as_pct())
# -> Colonel Mustard : 58.3 %
#    Mrs. Peacock    : 12.5 %
#    Mrs. White      : 29.2 %
EVIDENCE 2: Mrs. White is present.
mrs_white_is_present = ~mrs_white_is_absent
evidences.append(mrs_white_is_present)
print (killer.given(*evidences).as_pct())
# -> Colonel Mustard : 78.3 %
#    Mrs. Peacock    : 16.8 %
#    Mrs. White      :  4.9 %
EVIDENCE 3: The killer is a woman.
killer_is_woman = killer[:4] == "Mrs."
evidences.append(killer_is_woman)
print (killer.given(*evidences).as_pct())
# -> Mrs. Peacock : 77.5 %
#    Mrs. White   : 22.5 %
This example may sound simplistic, far from true cases. Be aware however that even simpler reasoning has been mistakenly applied to famous cases in the past, leading to wrong results and tragic consequences (see the Dreyfus affair and, more recently, the Sally Clark case).
The problem of Chevalier de Méré
The problem of points is at the very root of probability theory, with the famous developments of Blaise Pascal starting from a question posed by Chevalier de Méré in 1654. This problem and its resolution are discussed in detail in the paper of P. Gorroochurn.
Let's assume that
- chances of both players to win at any round are equal (1/2)
- the first player to win a total of 6 rounds gets the prize
- at the time the game stops, player one has won 5 rounds and player two has won 3 rounds
How should the prize be divided between the players?
Here is, translated into Lea, the recursive method invented by Pascal, for two players with given probabilities of winning a round.
def player_1_wins(prob_player_1_wins_one_round, player_1_rounds_short, player_2_rounds_short):
    if player_1_rounds_short == 0:
        return True
    if player_2_rounds_short == 0:
        return False
    return lea.if_(lea.event(prob_player_1_wins_one_round),
                   player_1_wins(prob_player_1_wins_one_round,
                                 player_1_rounds_short-1, player_2_rounds_short),
                   player_1_wins(prob_player_1_wins_one_round,
                                 player_1_rounds_short, player_2_rounds_short-1))
For the situation described above (player one is 1 round short of the prize, player two is 3 rounds short):
print (player_1_wins('1/2',1,3))
# -> False : 1/8
#    True  : 7/8
so the prize should be divided in a 7:1 ratio in favor of player one.
If you have installed the SymPy library, you can obtain a general formula based on probability p for player 1 to win at any round:
print (player_1_wins('p',1,3))
# -> False : -(p - 1)**3
#    True  : p*(p**2 - 3*p + 3)
To generalize to n players, each having a given probability of winning each round, the translation into Lea is a bit more involved:
def winner(prob_player_wins_one_round, player_rounds_short):
    for (player,rounds_short) in player_rounds_short.items():
        if rounds_short == 0:
            return player
    def winner_if_player_wins_one_round(player):
        new_player_rounds_short = player_rounds_short.copy()
        new_player_rounds_short[player] -= 1
        return winner(prob_player_wins_one_round,new_player_rounds_short)
    return lea.pmf(prob_player_wins_one_round) \
              .switch_func(winner_if_player_wins_one_round).calc()
print (winner({'A': '1/2', 'B': '1/2'}, {'A': 1, 'B': 3}))
# -> A : 7/8
#    B : 1/8
print (winner({'A': '1/6', 'B': '1/3', 'C': '1/2'}, {'A': 2, 'B': 3, 'C': 3}))
# -> A : 29/144
#    B : 36/144
#    C : 79/144
Rock-paper-scissors game
Let's play the rock-paper-scissors game! First, let us define a class that allows us to represent the three objects and the nontransitive win relationship:
class RPS(object):

    _cur_val = 0

    def __init__(self,name):
        object.__init__(self)
        self.name = name
        self.val = RPS._cur_val
        RPS._cur_val += 1

    def __str__(self):
        return self.name

    __repr__ = __str__

    def __hash__(self):
        return hash(self.name)

    def _cmp(self,other):
        delta = other.val - self.val
        if delta == 0:
            return 0
        if delta == 1 or delta == -2:
            return -1
        return +1

    def __eq__(self,other):
        return self._cmp(other) == 0

    def __ne__(self, other):
        return self._cmp(other) != 0

    def __gt__(self, other):
        return self._cmp(other) == 1

    def __ge__(self, other):
        return self._cmp(other) >= 0

    def __lt__(self,other):
        return self._cmp(other) == -1

    def __le__(self, other):
        return self._cmp(other) <= 0

r = RPS('rock')
p = RPS('paper')
s = RPS('scissors')
print ((r,p,s))
# -> (rock, paper, scissors)
print (r < p)
# -> True
print (p < s)
# -> True
print (s < r)
# -> True
Now, let us define three possible behaviors for a player:
- balanced: the 3 objects have equal probabilities
- morerock: the rock is twice as likely to be chosen as each of the others
- morepaper: the paper is twice as likely to be chosen as each of the others
lea.set_prob_type('r')
balanced  = lea.pmf(((r,1), (p,1), (s,1)), ordered=True)
morerock  = lea.pmf(((r,2), (p,1), (s,1)), ordered=True)
morepaper = lea.pmf(((r,1), (p,2), (s,1)), ordered=True)
print (balanced)
# -> rock     : 1/3
#    paper    : 1/3
#    scissors : 1/3
print (morerock)
# -> rock     : 2/4
#    paper    : 1/4
#    scissors : 1/4
print (morepaper)
# -> rock     : 1/4
#    paper    : 2/4
#    scissors : 1/4
Note the use of the ordered argument, which keeps the displayed values in the same order as the input (the default sorting would be a bit fuzzy, considering the nontransitive order defined!).
Now, we are able to compare these behaviors to get the probabilities of winning. The easiest approach is to use the usual comparison operators. For example, the chances of morepaper winning over morerock are given by
print (P(morepaper > morerock)) # -> 3/8
while the probability of a tie is given by
print (P(morepaper == morerock))
# -> 5/16
Another approach is to wrap the _cmp method as a Lea function, so as to get the loss/tie/win distribution in a single query:
cmp = RPS._cmp
lea_cmp = lea.func_wrapper(cmp)
odds = lea_cmp(morepaper,morerock)
print (odds)
# -> -1 : 5/16
#     0 : 5/16
#     1 : 6/16
The skeptical reader can use the joint method to verify this result, by enumerating each possible draw of the game and adding the atomic probabilities (provided that such a reader is not skeptical about the joint method!):
print (lea.joint(morepaper,morerock))
# -> (scissors, paper   ) : 1/16
#    (scissors, scissors) : 1/16
#    (scissors, rock    ) : 2/16
#    (rock    , rock    ) : 2/16
#    (rock    , paper   ) : 1/16
#    (rock    , scissors) : 1/16
#    (paper   , scissors) : 2/16
#    (paper   , rock    ) : 4/16
#    (paper   , paper   ) : 2/16
You could also include odds in the joint probability distribution (in first position, so that the ordering makes a smart grouping):
print (lea.joint(odds,morepaper,morerock))
# -> (-1, rock    , paper   ) : 1/16
#    (-1, paper   , scissors) : 2/16
#    (-1, scissors, rock    ) : 2/16
#    ( 0, scissors, scissors) : 1/16
#    ( 0, rock    , rock    ) : 2/16
#    ( 0, paper   , paper   ) : 2/16
#    ( 1, paper   , rock    ) : 4/16
#    ( 1, scissors, paper   ) : 1/16
#    ( 1, rock    , scissors) : 1/16
Since odds provides -1, 0, +1 values, it is easy to get the distribution of losses, ties and wins after a given number of games by using the times(n) method. For example, here are the scores after 3 games, where the value indicates the difference (won games - lost games) from the "morepaper" player's point of view:
print (odds.times(3))
# -> -3 :  125/4096
#    -2 :  375/4096
#    -1 :  825/4096
#     0 : 1025/4096
#     1 :  990/4096
#     2 :  540/4096
#     3 :  216/4096
Could the "morepaper" player do better against "morerock"? Yes! In particular, he could:
- choose paper with 8 times as many chances as the others:
evenmorepaper = lea.pmf(((r,1), (p,8), (s,1)), ordered=True)
print (lea_cmp(evenmorepaper,morerock))
# -> -1 : 11/40
#     0 : 11/40
#     1 : 18/40
- choose always paper:
alwayspaper = lea.vals(p)
print (lea_cmp(alwayspaper,morerock))
# -> -1 : 1/4
#     0 : 1/4
#     1 : 2/4
Now, what about the balanced behavior?
- against morerock:
print (lea_cmp(balanced,morerock))
# -> -1 : 1/3
#     0 : 1/3
#     1 : 1/3
- against a fixed choice (e.g. paper):
print (lea_cmp(balanced,alwayspaper))
# -> -1 : 1/3
#     0 : 1/3
#     1 : 1/3
- against itself:
print (lea_cmp(balanced,balanced.new()))
# -> -1 : 1/3
#     0 : 1/3
#     1 : 1/3
Note: you may have noticed the call to .new() in the last case. Why? The reason is simply the referential consistency of Lea: the following construct
print (lea_cmp(balanced,balanced))
# -> 0 : 1
states that a player playing against himself always makes a tie, since both arguments refer to the very same random choice; balanced.new() creates an independent choice with the same distribution.
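The same referential-consistency effect can be seen with a plain equality test (a small sketch added for illustration):
print (P(balanced == balanced))        # -> 1    (one single random choice: always a tie)
print (P(balanced == balanced.new()))  # -> 1/3  (two independent players)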
von Neumann extractor
Imagine you have a physical device able to produce a "true" random stream of bits. Great! Unfortunately, this device has a bias: the probability of having 1 or 0, although constant in time, is not 1/2; furthermore, this probability is unknown.
Question: Can you find a mechanism to remove this bias and produce a stream of true random bits, with balanced probabilities (1/2)?
John von Neumann invented a simple yet awesome trick to do the job: read the bits two at a time, discard the pairs 00 and 11, and for the pairs 01 and 10 output the first bit. Since P(01) = P(10) = p(1-p) whatever the bias p, the emitted bits are unbiased. See von Neumann extractor.
Let us demonstrate the algorithm of von Neumann. Since we don't have the physical random device at our disposal, we shall first simulate the biased stream of bits. We use Lea to build an infinite random bits iterator, assuming a probability of 80% to have a 0:
biased_random_iter = lea.pmf({0: 0.80, 1: 0.20}).random_iter()
Equivalently, the same infinite iterator could be built with a plain generator function:
def gen_biased_random():
    source = lea.pmf({0: 0.80, 1: 0.20})
    while True:
        yield source.random()

biased_random_iter = gen_biased_random()
from itertools import islice biased_sample = tuple(islice(biased_random_iter,500)) print (biased_sample) # -> (0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0) print (lea.vals(*biased_sample).as_pct()) # -> 0 : 79.40 % # 1 : 20.60 %
Now, we build the von Neumann extractor, as an infinite random bits iterator that takes the biased source as input:
from itertools import islice

def gen_von_neumann_extractor(random_iter):
    while True:
        (b0,b1) = islice(random_iter,2)
        if b0 != b1:
            yield b0

von_neumann_extractor_iter = gen_von_neumann_extractor(biased_random_iter)
unbiased_sample = tuple(islice(von_neumann_extractor_iter,500)) print (unbiased_sample) # -> (0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1) print (lea.vals(*unbiased_sample).as_pct()) # -> 0 : 47.2 % # 1 : 52.8 %
Thank you, Doctor Miraculis !
Central limit law
a contribution of Nicky van Foreest
Requires: SciPy, numpy and matplotlib (see SciPy Stack)
From theory we know that the distribution of the sum of independent, identically distributed random variables converges to the normal distribution. We can test this with Lea.
First we need some basic inputs.
import numpy as np
from scipy.stats import norm
die = lea.vals(0, 1, 2, 3)
dieN = die.times(5)
dieN.plot()
We need the support and the cumulative distribution of dieN.
T = np.array(tuple(zip(*dieN.cdf_tuple)))
# the support is stored in T[0,:]
supp = T[0,:]
# and the distribution in T[1,:]
F = T[1,:]
Now we compare F to the cumulative distribution of the normal distribution. We need a continuity correction, for otherwise the normal distribution seems to be shifted to the right.
mu = dieN.mean
std = dieN.stdev
FN = norm(loc=mu, scale=std).cdf(supp + 0.5)
The maximum absolute difference between the two distributions is indeed small:
print (np.max(np.abs(F-FN)))
# -> 0.0049414764166033631
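Optionally (a small sketch, assuming matplotlib is available as stated in the requirements), both cumulative distributions can be plotted together to visualize the agreement:
import matplotlib.pyplot as plt
# exact CDF of the sum of dice vs. the fitted normal CDF
plt.step(supp, F, where='post', label='dieN (exact CDF)')
plt.plot(supp, FN, label='normal approximation')
plt.legend()
plt.show()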
Bullshit generator
Some years ago, out of desperation, I wrote a bullshit generator in Python.
Here is a typical sample of what it produces:
As a matter of fact, the engine is provided by the interface where the actor on top of a peer-to-peer interoperability under the specification derivation rule is performed by the available SSO. The timing thread is generated by a full XML when the Web 2.0 that executes the issuer is generated by the artifact. The world-leading Web browser orchestrates a SOAP message. An aware issuer is envisioned in the generic design operator, which competes with the factory, in the aggregator that is unaffected. The authorized entity provides an interface to the fat client. The engine updates the integration. The algorithm of the model that is efficient sent to a technology thin client, which triggers a cluster, is provided by the version. The approach is repeatedly required by the action timing security under a space when the ancillary owner generates the XML-based UML model.
The program works with a small dictionary of words and a simplified English grammar (very close to the one wired in my brain, since I am not a native English speaker!). The words are classified in categories (nouns, verbs, etc.); to simplify, these words are chosen with equal probability, which was easily done with Python's built-in random.choice (see the TerminalNode class). For the grammar rules however, in order to avoid overly long sentences, I put a probability weight on each derivation rule, so as to make the simplest constructions the likeliest. Part of the complexity of the program was then in the algorithm making the random choices according to the probability weights (see the Node class).
In what follows, I show how to use Lea to greatly simplify the Node and TerminalNode classes. The statements defining the English words and the grammar are unchanged from the original version, except for the swapping of probability weights and grammar rules needed to accommodate the lea.pmf method. Note that Python's random module is no longer imported (although used by Lea internally).
'''python ====================================================================== Bullshit Generator by Pierre Denis, 2009, 2014 ====================================================================== ''' # -------------------------------------------------- # grammar engine # -------------------------------------------------- import lea class Node(object): def setTermsChoices(self,*termsChoices): self.termsChoices = lea.pmf(termsChoices) def getWords(self): terms = self.termsChoices.random() for term in terms: if isinstance(term,str): yield term else: for word in term.getWords(): yield word def getString(self): res = " ".join(self.getWords()) res = ", ".join(w.strip() for w in res.split(",") if w.strip()) if res.endswith(", "): res = res[:-2] return res[0].upper() + res[1:] + "." class TerminalNode(object): def __init__(self,*words): self.words = lea.vals(*words) def getWords(self): yield self.words.random() # -------------------------------------------------- # grammar # -------------------------------------------------- verb = TerminalNode( "accesses", "activates", "administrates", "aggregates", "builds", "calculates", "checks", "competes with", "completes", "complies with", "controls", "covers", "delivers", "dispatches", "eases", "encapsulates", "encompasses", "executes", "extracts", "features", "generates", "gets", "governs", "guides", "has", "increases", "inherits from", "is", "keeps track of", "leverages", "makes", "manages", "manages", "maximizes", "mitigates", "monitors", "must have", "needs", "offers", "opens", "operates on", "optimizes", "orchestrates", "overwrites", "performs", "populates", "precludes", "provides", "provides", "provides an interface to", "reads", "receives", "reduces", "reduces the need of", "registers", "regulates", "relies on", "requires", "resides on", "resides within", "retrieves", "retrieves the data in", "runs on", "schedules", "integrates with", "sends", "shall be", "shall have", "should be", "should have", "starts", "stores", "streamlines", "subscribes to", "subscribes to", "supersedes", "takes", "targets", "triggers", "updates", "validates", "writes") passiveVerb = TerminalNode( "accessed by", "achieved by", "aggregated by", "applicable for", "asserted by", "authorized by", "based upon", "built from", "built upon", "collected by", "controlled by", "dedicated to", "deployed on", "derived from", "dispatched by", "driven by", "eased by", "enabled by", "envisioned in", "extracted from", "generated by", "in the scope of", "installed on", "integrated in", "located in", "managed by", "maximized by", "monitored by", "opened by", "optimized by", "orchestrated by", "packaged in", "performed by", "populated by", "processed by", "provided by", "provided by", "received by", "refreshed by", "registered in", "related to", "required by", "responsible for", "scheduled by", "sent to", "serialized by", "serialized in", "started in", "stored by", "stored in", "stored on", "the interface of", "updated by", "validated by") aSimpleName = TerminalNode( "COTS", "GRID processing", "Java program", "LDAP registry", "Portal", "RSS feed", "SAML token", "SOAP message", "SSO", "TCP/IP", "UML model", "URL", "W3C", "Web", "Web 2.0", "Web browser", "Web page", "Web service", "back-end", "backbone", "bandwidth", "bean", "bridge", "browser", "bus", "business", "business model", "call", "catalogue", "class", "client", "cluster", "collection", "communication", "component", "compression", "concept", "conceptualization", "connection", "console", "content", "context", "cookie", "customization", 
"data", "database", "datastore", "deployment", "derivation rule", "design", "development", "device", "directory", "discovery", "dispatcher", "document", "domain", "factory", "fat client", "feature", "file", "form", "frame", "framework", "function", "gateway", "genericity", "geomanagement", "goal", "governance", "granularity", "guideline", "header", "key", "layer", "leader", "library", "link", "list", "log file", "logic", "look-and-feel", "manager", "market", "mechanism", "message", "meta-model", "metadata", "model", "modeling", "module", "network", "performance", "persistence", "personalization", "plug-in", "policy", "port", "portal", "practice", "presentation layer", "privacy", "private key", "procedure", "process", "processor", "processing", "product", "protocol", "recommendation", "registration", "registry", "relationship", "resource", "responsibility", "role", "rule", "scenario", "scenario", "scheduler", "schema", "security", "server", "service", "service provider", "servlet", "session", "skeleton", "software", "solution", "source", "space", "specification", "standard", "state", "statement", "streaming", "style sheet", "subscriber", "subsystem", "system", "system", "table", "target", "task", "taxonomy", "technique", "technology", "template", "thin client", "thread", "throughput", "timing", "tool", "toolkit", "topic", "unit", "usage", "use case", "user", "user experience", "validation", "value", "version", "vision", "work", "workflow") anSimpleName = TerminalNode( "API", "IP address", "Internet", "UDDI", "XML", "XML file", "abstraction", "access", "acknowledgment", "action", "actor", "administrator", "aggregator", "algorithm", "application", "approach", "architecture", "artifact", "aspect", "authentication", "availability", "encapsulation", "end-point", "engine", "engine", "entity", "entity", "environment", "event", "identifier", "information", "integration", "interface", "interoperability", "issuer", "object", "ontology", "operation", "operator", "operator", "opportunity", "orchestration", "owner") aAdjective = TerminalNode( "BPEL", "DOM", "DTD", "GRID", "HTML", "J2EE", "Java", "Java-based", "Java-based", "UML", "SAX", "WFS", "WSDL", "basic", "broad", "bug-free", "business-driven", "client-side", "coarse", "coherent", "compatible", "complete", "compliant", "comprehensive", "conceptual", "consistent", "control", "controller", "cost-effective", "custom", "data-driven", "dedicated", "distributed", "dynamic", "encrypted", "event-driven", "fine-grained", "first-class", "free", "full", "generic", "geo-referenced", "global", "global", "graphical", "high-resolution", "high-level", "individual", "invulnerable", "just-in-time", "key", "layered", "leading", "lightweight", "logical", "main", "major", "message-based", "most important", "multi-tiers", "narrow", "native", "next", "next-generation", "normal", "password-protected", "operational", "peer-to-peer", "performant", "physical", "point-to-point", "polymorphic", "portable", "primary", "prime", "private", "proven", "public", "raw", "real-time", "registered", "reliable", "remote", "respective", "right", "robust", "rule-based", "scalable", "seamless", "secondary", "semantic", "server-side", "service-based", "service-oriented", "simple", "sole", "specific", "state-of-the-art", "stateless", "storage", "sufficient", "technical", "thread-safe", "uniform", "unique", "used", "useful", "user-friendly", "virtual", "visual", "web-based", "web-centric", "well-documented", "wireless", "world-leading", "zero-default") anAdjective = TerminalNode( "AJAX", "OO", 
"XML-based", "abstract", "ancillary", "asynchronous", "authenticated", "authorized", "auto-regulated", "available", "aware", "efficient", "international", "interoperable", "off-line", "official", "online", "open", "operational", "other", "own", "unaffected", "up-to-date") adverb = TerminalNode( "basically", "comprehensively", "conceptually", "consistently", "definitely", "dramatically", "dynamically", "expectedly", "fully", "generally", "generically", "globally", "greatly", "individually", "locally", "logically", "mainly", "mostly", "natively", "officially", "physically", "practically", "primarily", "repeatedly", "roughly", "sequentially", "simply", "specifically", "surely", "technically", "undoubtly", "usefully", "virtually") sentenceHead = TerminalNode( "actually", "as a matter of fact", "as said before", "as stated before", "basically", "before all", "besides this", "beyond that point", "clearly", "conversely", "despite these facts", "for this reason", "generally speaking", "if needed", "in essence", "in other words", "in our opinion", "in the long term", "in the short term", "in this case", "incidentally", "moreover", "nevertheless", "now", "otherwise", "periodically", "roughly speaking", "that being said", "then", "therefore", "to summarize", "up to here", "up to now", "when this happens") (name, aName, anName, nameTail, adjective, nameGroup, simpleNameGroup, verbalGroup, simpleVerbalGroup, sentence, sentenceTail) = [Node() for i in range(11)] aName.setTermsChoices( (( aSimpleName, ), 50 ), (( aSimpleName, name ), 5 ), (( aSimpleName, name ), 8 ), (( aName, nameTail ), 5 )) anName.setTermsChoices( (( anSimpleName, ), 50 ), (( anSimpleName, name ), 8 ), (( anName, nameTail ), 5 )) nameTail.setTermsChoices( (( "of", nameGroup ), 2 ), (( "from", nameGroup ), 2 ), (( "under", nameGroup ), 1 ), (( "on top of", nameGroup ), 1 )) name.setTermsChoices( (( aName, ), 1 ), (( anName, ), 1 )) adjective.setTermsChoices( (( aAdjective, ), 1 ), (( anAdjective, ), 1 )) nameGroup.setTermsChoices( (( simpleNameGroup, ), 10 ), (( simpleNameGroup, passiveVerb, nameGroup ), 1 ), (( simpleNameGroup, "that", simpleVerbalGroup ), 1 ), (( simpleNameGroup, ", which", simpleVerbalGroup, "," ), 1 )) simpleNameGroup.setTermsChoices( (( "the", name ), 40 ), (( "the", adjective, name ), 20 ), (( "a", aName ), 10 ), (( "an", anName ), 10 ), (( "a", aAdjective, name ), 5 ), (( "an", anAdjective, name ), 5 )) verbalGroup.setTermsChoices( (( verb, nameGroup ), 10 ), (( adverb, verb, nameGroup ), 1 ), (( "is", passiveVerb, nameGroup ), 10 ), (( "is", adverb, passiveVerb, nameGroup ), 1 ), (( "is", adjective ), 1 ), (( "is", adverb, adjective ), 1 )) simpleVerbalGroup.setTermsChoices( (( verb, simpleNameGroup ), 2 ), (( "is", adjective ), 1 )) sentence.setTermsChoices( (( nameGroup, verbalGroup ), 20 ), (( sentenceHead, "," , nameGroup, verbalGroup ), 4 ), (( sentence, sentenceTail ), 4 )) sentenceTail.setTermsChoices( (( "in", nameGroup ), 12 ), (( "within", nameGroup ), 5 ), (( "where", nameGroup, verbalGroup ), 5 ), (( "when", nameGroup, verbalGroup ), 5 ), (( "because it", verbalGroup ), 2 ), (( "; that's why it", verbalGroup ), 1 )) # -------------------------------------------------- # main program # -------------------------------------------------- try: import win32com.client voice = win32com.client.Dispatch("sapi.SPVoice") except: voice = None print ("press <enter> to resume, 'q'+<enter> to quit\n") while True: for i in range(8): generatedSentence = sentence.getString() print (generatedSentence) if voice: 
voice.speak(generatedSentence) # PY2: cmd = raw_input() cmd = input() if cmd.strip().lower() == "q": break
Note: for Python 2 users, just replace the statement cmd = input() with the commented one marked with # PY2:.
You can run this program and verify that it produces the same kind of gobbledygook as the original version. Let it have the last word:
The official operator is performed by the most important actor. The interoperability is off-line. The environment is authorized by the rule-based identifier that needs the availability thread. The W3C is located in a high-resolution style sheet because it is surely point-to-point. A session sends the use case owner that guides the asynchronous content. The approach aggregator, which shall have the algorithm, shall be an identifier. The OO link administrates a controller information that executes an identifier when the thread-safe opportunity engine user experience that populates a bridge roughly extracts the most important API. The environment is derived from the dispatcher from a unit acknowledgment of an interface.