Clone wiki

KVSlib / Home


Welcome to the KVS library wiki.


KVSlib was developed as a hash table component for the Objective Modula-2 compiler project. However, it was designed as a fully encapsulated standalone library to make it reusable in other projects and for other purposes.

As the Objective Modula-2 project has started looking for a revision control system, the KVS library was spun off to be used as a data set for testing revision control systems and services, hence the hosting of the library on this site.

The library had just received a feature addition for which new test cases need to be written and this presented a good opportunity to evaluate revision control systems and services.

Features and Implementation

The library provides a fully encapsulated API to create, use and destroy dynamically allocated associative arrays for storing arbitrary data objects. The arrays are realised as hash tables with 32bit keys and external chaining. The initial size of a new table is user controlled. Storage and retrieval functions take keys as an argument which makes the implementation independent of the hash algorithm used to generate keys. A default algorithm is supplied but may be replaced. Calls to memory allocation and release functions are realised as macros in order to allow drop in replacement of the underlying memory management system. For instance, the Boehm garbage collector could be used by redefining these macros.

Data objects may be stored by copy or by reference. Any object can also be retrieved by copy or by reference. A reference count is maintained for each object stored in the table, regardless of whether it was stored by copy or by reference. Retrieval by reference increments the reference count and references must be released to decrement their reference count. When removal of an object is requested, it will not be removed until all retrievals by reference have been released. Until then, the object is marked for removal and can no longer be retrieved. The object is finally removed when all outstanding retrievals by reference have been released.

The hash table implementation has been optimised for fast lookups. The last lookup is cached internally to make subsequent lookups of the same object faster. This allows better clarity of code by first testing if an object exists and then retrieve its value in a subsequent API call with very little overhead even when the load factor of the hash table is very high. However, hash table resizing is not implemented. It could however be added in the future.

The library is most suitable in situations where an application needs fast constant-time access to large sets of data such as configuration data, symbol tables in compilers, word tables in spell checkers or dictionaries etc. It may not be suitable in situations where table resizing is desirable.