HTTPD can use large amounts of memory for its hash table of pages

Issue #935 closed
Roland Haas created an issue

I just found that for a (stripped-down!) parameter file of my production simulations, HTTPD creates a hash table via the functions in CACTUS_HOME/src/util/Hash.c that consumes 1GB of memory (for the array of pointers that is the top-level hash table structure) to hold about 10000 entries. I am not sure if this is due to a poor choice of hashing function (util_HashHash) or to the fact that it doubles the size of the table until the number of entries is smaller than the number of hash slots (in Util_HashRehash and Util_HashAdd). It was somewhat unexpected that a non-science thorn would use that much memory.

Alternatives that use less memory might be to increase the filling factor, i.e. only rehash if hash->keys > 10*hash->fill (maybe starting from some limit of keys), or to use something like the binary tree implementation in BinaryTree.c (but not that one, since it is broken in at least two places).
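
To make the filling-factor idea concrete, here is a minimal C++ sketch of such a growth policy. It is not taken from Hash.c: the names `should_rehash` and `slots_needed` and the parameters `slots`, `max_load` and `initial` are hypothetical, and how `slots` relates to the `fill` field mentioned above has not been checked against the Cactus source.

```cpp
#include <cstddef>

// Hypothetical helper, not part of Cactus: decide whether a chained hash
// table should grow. `keys` is the number of stored entries, `slots` the
// number of top-level buckets, `max_load` the tolerated entries per bucket.
bool should_rehash(std::size_t keys, std::size_t slots, std::size_t max_load = 10)
{
    return keys > max_load * slots;
}

// Double the bucket count, starting from `initial`, until the table holds
// at most `max_load` entries per bucket on average.
std::size_t slots_needed(std::size_t keys, std::size_t max_load = 10,
                         std::size_t initial = 16)
{
    std::size_t slots = initial;
    while (should_rehash(keys, slots, max_load))
        slots *= 2;
    return slots;
}
```

Under these assumptions, `slots_needed(10000)` stops at 1024 buckets, which is 8 KiB of top-level pointers if pointers are 8 bytes wide. With `max_load = 1` the same loop stops at 16384 buckets.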

A simple linear list might also be sufficient, since I expect HTTPD does not have to be lightning fast and serve hundreds of requests per second.

Keyword: HTTPD

Comments (8)

  1. Roland Haas reporter

    There's also the Cactus util_Table interface, which however uses linear lists and disallows "/" characters in table entry keys (this is only apparent from the source code comments, not from the docs). So it does not sound like a good choice for HTTPD. Otherwise, I like std::map<,> (HTTPD is C code, though, so it needs wrappers). We already require it for Carpet anyway, so it does not add extra dependencies.

  2. Erik Schnetter

    Instead of wrappers I want to suggest renaming the .c files to .cc.

    We agreed some time ago that reasonable use of C++ is fine. In this case, it would significantly improve and at the same time simplify the code.

  3. Roland Haas reporter

    Attached please find two patches that make HTTPD use the STL map container. The first one (use_cxx_map) turns the affected files into C++ source code and uses the map template class directly. The changes to the source files are ugly, since while HTTPD is decent C code, it is rather ugly C++ code. The second one leaves the files as C files but introduces a small set of wrapper functions that map std::map's interface onto the C functions that HTTPD used. The changes to the source are essentially a search and replace of util_Hash by httpd_Map (plus removal of superfluous arguments).
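
As an illustration of the wrapper approach, the following is a minimal sketch of C-callable functions around std::map. Only the `httpd_Map` prefix comes from the description above. The function names (`httpd_MapCreate`, `httpd_MapStore`, `httpd_MapData`, `httpd_MapDestroy`), their signatures and their return conventions are assumptions made for this example and are not taken from the attached patch.

```cpp
#include <map>
#include <string>

// Opaque handle for C callers; a C header would only declare
// `typedef struct httpd_Map httpd_Map;` and never see the definition.
struct httpd_Map {
    std::map<std::string, void *> entries;
};

extern "C" {

// Hypothetical signatures, for illustration only.
// Note: `new` and std::string can throw; a real wrapper would have to
// catch exceptions before returning to C code.
httpd_Map *httpd_MapCreate(void) { return new httpd_Map; }

void httpd_MapDestroy(httpd_Map *map) { delete map; }

// Store `data` under `key`; returns 0 on success, 1 if the key exists.
int httpd_MapStore(httpd_Map *map, const char *key, void *data)
{
    return map->entries.emplace(key, data).second ? 0 : 1;
}

// Look up `key`; returns the stored pointer, or a null pointer if absent.
void *httpd_MapData(const httpd_Map *map, const char *key)
{
    auto it = map->entries.find(key);
    return it == map->entries.end() ? nullptr : it->second;
}

} // extern "C"
```

With wrappers of this shape the calling C files only handle an opaque pointer, and memory use grows with the number of stored entries instead of with a pre-sized slot array.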

    Either patch is OK with me, though I wrote the second (wrapping) one after finishing the first and realizing how awkward it was, so I have a preference for the second one.

    There's no test for HTTPD, but in a quick test run on my workstation neither patch seems to break the code.

    OK to apply (and which patch)?
