Database back-end

Issue #80 new
Samir Menon repo owner created an issue

The database is very useful, but it is presently prone to race conditions. We currently avoid them by running threads at different rates, which minimizes collisions, and by tolerating errors in the slower (reader) threads but not in the faster (writer) thread.

  • While this approach works, it is not scalable to a distributed platform.
  • Moreover, overcoming them will require tedious locking mechanisms, which are not ideal for first-time users who aren't expert coders.
  • Finally, any locking mechanism will slow the whole system down.
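To make the locking concern concrete, here is a minimal sketch (hypothetical names, not code from this repository) of the kind of mechanism the bullets above describe: every reader and writer must take a shared lock, which serializes access and slows the hot path.

```python
import threading

# Shared state guarded by a lock: the "tedious" approach the issue
# wants to avoid. Every reader and writer must acquire the lock,
# which serializes all access and adds overhead on the hot path.
store = {}
store_lock = threading.Lock()

def write(key, value):
    with store_lock:          # the writer blocks all readers
        store[key] = value

def read(key, default=None):
    with store_lock:          # readers also pay the locking cost
        return store.get(key, default)

# Concurrent writers are now safe, but only because they queue up
# behind one another on store_lock.
threads = [threading.Thread(target=write, args=("k%d" % i, i)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(read("k3"))  # → 3
```

This is correct but neither scalable to a distributed platform nor friendly to non-expert users, which is the motivation for moving to a real database back-end.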

One potential solution is to replace the raw memory access with a properly integrated database. This would allow scaling across multiple threads without issue, and would also simplify logging (dumping the database store to disk) and reloading state.

Potential candidates for the proposed back-end:

  1. LevelDB: Google's lightweight single-process NoSQL database. [git clone https://code.google.com/p/leveldb/]
  2. Redis: an in-memory NoSQL database that persists to disk. [git clone https://github.com/antirez/redis.git]

Comments (5)

  1. Samir Menon reporter

    Update: it seems like a good idea to serialize the data using either JSON or Protocol Buffers before storing it in the back-end.

    LevelDB's disadvantage: it is single-process (multi-threaded), so the data can't be published to another language. Redis's disadvantage: it is potentially slower than LevelDB.

    There are numerous back end storage options: http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
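The JSON route in the comment above could look like the following sketch. The names (`put`, `fetch`) are hypothetical, and a plain dict stands in for the actual back-end; a real client such as redis-py would slot in at the same two points with its own set/get calls.

```python
import json

# Stand-in for the key-value back-end (a real Redis or LevelDB client
# would be substituted here; this dict is just for illustration).
backend = {}

def put(key, record):
    # Serialize the record to a JSON string before handing it to the store.
    backend[key] = json.dumps(record)

def fetch(key):
    # Deserialize on the way out; None if the key is absent.
    raw = backend.get(key)
    return json.loads(raw) if raw is not None else None

put("sensor:42", {"pos": [0.1, 0.2, 0.3], "t": 1700000000})
print(fetch("sensor:42")["pos"])  # → [0.1, 0.2, 0.3]
```

Protocol Buffers would replace `json.dumps`/`json.loads` with the generated message classes' `SerializeToString()`/`ParseFromString()`, at the cost of maintaining `.proto` schemas.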

  2. Samir Menon reporter

    It also seems like a good idea to use a string-compression library (like Google's Snappy) on the serialized data to speed up storage and access.

    The ideal setup seems to be:

    Data class -> Protocol Buffers -> <serialized string> -> Snappy -> <compressed string> -> Redis

    Using Redis would also allow Python front ends to interact with the system and/or script things.
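The proposed pipeline can be sketched end to end with stdlib stand-ins so it runs anywhere: `json` in place of Protocol Buffers, `zlib` in place of Snappy, and a dict in place of Redis. The real libraries would slot in at the same three points; all names here are hypothetical.

```python
import json
import zlib

store = {}  # stand-in for Redis

def save(key, data):
    serialized = json.dumps(data).encode("utf-8")  # Data class -> <serialized string>
    compressed = zlib.compress(serialized)         # <serialized string> -> <compressed string>
    store[key] = compressed                        # <compressed string> -> back-end

def load(key):
    compressed = store[key]
    serialized = zlib.decompress(compressed)
    return json.loads(serialized.decode("utf-8"))

save("state:0", {"joints": [0.0, 1.57, -0.78], "gripper": "open"})
print(load("state:0")["gripper"])  # → open
```

With the real stack, `json` becomes a generated protobuf message, `zlib.compress`/`zlib.decompress` become Snappy's compress/uncompress, and `store[key] = ...` becomes a Redis SET; the structure of `save`/`load` stays the same.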
