markov-mania is a project to sample various programming languages by implementing a naive Markov-chain text generator in each of them.
The idea is to take the "hello world" demo a step further. For a given programming language/environment, Hello World covers:
- the tool chain
- string literals
On top of this, Markov adds:
- control structures (loops are pretty necessary)
- filesystem access
- some non-trivial data structure (usually maps and lists)
- pseudo-random numbers
So gives a better feel for actually *doing* something. If you don't find it useful, no worries. It is mostly for my own edification while playing with new languages.
A Markov chain is a sequence in which the probabilities of possible future states are determined by the present state. For a Markov text generator, we treat the most recent N words W(0)..W(N-1) as the present state, and the most recent words after adding one more as the future state; that is, W(1)..W(N). The probability function for W(N) to be word X is the number of times that X followed W(0)..W(N-1) in some input corpus, divided by the number of times the sequence W(0)..W(N-1) appeared in the corpus at all.
In order to make short sentences more likely, we also assume that the corpus is divided into lines, each of which is a sentence. For the purposes of frequency calculation, no words follow the end of a sentence.
If that didn't make any sense, the Wikipedia article on Markov chains may help: http://en.wikipedia.org/wiki/Markov_chain
For example, let's assume this is the input corpus:
A cat flew. A cat chewed.
With a chain length of 2, the probability distribution looks like this:
|_, _||A||2/2 = 1|
|_, A||cat||2/2 = 1|
|A, cat||flew||1/2 = 0.5|
|A, cat||chewed||1/2 = 0.5|
|cat, flew||<end>||1/1 = 1|
|cat, chewed||<end>||1/1 = 1|
For larger input corpuses, the text can be more interesting. Here is a sample from a text generator run with the text of Alice in Wonderland as the input corpus:
won't she be savage if I've kept her waiting!' Alice felt so desperate that she was quite pleased to find that her neck would bend about easily in any direction like a serpent.