Take a look at how the Python implementation leaks random memory fragments into a dbhash (Berkeley DB) database. The C version does not appear to (at least, it has not been caught doing so).

Run make to download the word list, compile the C implementation, and run the tests. To compare the results, use:

  • vim dictionary.diff
  • vbindiff dictionary-py.db dictionary-c.db

Both databases should contain only nouns as keys and articles as values, but the database created by the Python script usually also contains random fragments of the source file (dictionary.txt).
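One rough way to spot such fragments without a hex viewer is to pull the printable runs out of the raw database file and list those that are not a known noun or article. The sketch below is only an illustration: the function names and the min_len threshold are made up here, and records stored next to each other may fuse into a single run, so every hit is a candidate to inspect by hand rather than proof of a leak.

```python
import re


def printable_runs(data, min_len=4):
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def unexpected_runs(data, expected, min_len=4):
    """Return printable runs that are not in the set of expected strings.

    `expected` is assumed to hold every legitimate key and value
    (nouns and articles). Anything else is only a candidate leak.
    """
    return [run for run in printable_runs(data, min_len) if run not in expected]
```

To try it on a real file, read the database in binary mode (for example open("dictionary-py.db", "rb").read()) and pass the nouns and articles as a set.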

Another script, pushtodbm3.py, is a minimal example, but it does not reproduce the leak for everyone (i.e. secrets.db does not always contain the secret phrase in addition to the keys and values).
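To check whether the leak reproduced on a given machine, the raw bytes of secrets.db can be searched for the phrase. This is a minimal sketch and not part of the repository: file_contains is a hypothetical helper, and the phrase passed to it must be whatever string pushtodbm3.py actually uses.

```python
def file_contains(path, phrase, chunk_size=65536):
    """Return True if the byte string `phrase` occurs anywhere in the file."""
    if not phrase:
        raise ValueError("phrase must not be empty")
    overlap = len(phrase) - 1  # bytes carried over so boundary matches are found
    tail = b""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return False
            window = tail + chunk
            if phrase in window:
                return True
            # Keep the end of the window in case the phrase spans two chunks.
            tail = window[-overlap:] if overlap > 0 else b""
```

For example, file_contains("secrets.db", b"...") with the real phrase in place of the placeholder returns True only when the phrase ended up in the file.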